date:20191122

[Bug 1746943] Re: qemu-aarch64-static: qemu: uncaught target signal 11 for ps/top cmd

2019-11-22 Thread Launchpad Bug Tracker

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1746943

Title:
  qemu-aarch64-static: qemu: uncaught target signal 11 for ps/top cmd

Status in QEMU:
  Expired

Bug description:
  In a docker container created from an aarch64 image, with
  qemu-aarch64-static injected (in /usr/bin), running the ps or top
  command inside the container reports "qemu: uncaught target signal 11
  (Segmentation fault)".

  Tried qemu-aarch64-static from fedora 27 / ubuntu artful / debian
  unstable (i.e. qemu version 2.10 - 2.11)

  The aarch64 docker image is Fedora 27 (with the qemu-aarch64-static
  from Fedora 27), hence I opened a related bug here
  https://bugzilla.redhat.com/show_bug.cgi?id=1541252

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1746943/+subscriptions

[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2019-11-22 Thread yuchenlin

I have sent a patch about this problem.

Please see
https://lists.nongnu.org/archive/html/qemu-devel/2019-11/msg03924.html

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  New

Bug description:
  When running a machine using "-vnc" and the "websocket" option QEMU
  seems to require the subprotocol called 'binary'. This subprotocol
  does not exist in the WebSocket specification. In fact it has never
  existed in the spec, in one of the very early drafts of WebSockets it
  was briefly mentioned but it never made it to a final version.

  When the WebSocket server requires a non-standard subprotocol any
  WebSocket client that works correctly won't be able to connect.

  One example of such a client is noVNC, it tells the server that it
  doesn't want to use any subprotocol. QEMU's WebSocket proxy doesn't
  let noVNC connect. If noVNC is modified to ask for 'binary' it will
  work, this is, however, incorrect behavior.

  Looking at the code in "io/channel-websock.c" it seems it's quite
  hard-coded to binary:

  Look at line 58 and 433 here:
  https://git.qemu.org/?p=qemu.git;a=blob;f=io/channel-websock.c

  This code has to be made more dynamic, and shouldn't require binary.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions

[PATCH] io/channel-websock: treat 'binary' and no sub-protocol as the same

2019-11-22 Thread Yu-Chen Lin

noVNC doesn't use the 'binary' protocol by default after
commit c912230309806aacbae4295faf7ad6406da97617.

This causes QEMU to return 400 during the handshake.

To overcome this problem while remaining compatible with
older noVNC clients, treat 'binary' and no sub-protocol as
the same, so that different versions of the noVNC client
are supported.

Tested on noVNC before c912230 and after c912230.

Buglink: https://bugs.launchpad.net/qemu/+bug/1849644

Signed-off-by: Yu-Chen Lin 
---
 io/channel-websock.c | 35 +++
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/io/channel-websock.c b/io/channel-websock.c
index fc36d44eba..918e09ea3f 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -49,13 +49,20 @@
 "Server: QEMU VNC\r\n"   \
 "Date: %s\r\n"
 
+#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_WITH_PROTO_RES_OK \
+"HTTP/1.1 101 Switching Protocols\r\n"  \
+QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON\
+"Upgrade: websocket\r\n"\
+"Connection: Upgrade\r\n"   \
+"Sec-WebSocket-Accept: %s\r\n"  \
+"Sec-WebSocket-Protocol: binary\r\n"\
+"\r\n"
 #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_OK\
 "HTTP/1.1 101 Switching Protocols\r\n"  \
 QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON\
 "Upgrade: websocket\r\n"\
 "Connection: Upgrade\r\n"   \
 "Sec-WebSocket-Accept: %s\r\n"  \
-"Sec-WebSocket-Protocol: binary\r\n"\
 "\r\n"
 #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_NOT_FOUND \
 "HTTP/1.1 404 Not Found\r\n"\
@@ -336,6 +343,7 @@ qio_channel_websock_find_header(QIOChannelWebsockHTTPHeader *hdrs,
 
 static void qio_channel_websock_handshake_send_res_ok(QIOChannelWebsock *ioc,
   const char *key,
+  const bool use_protocols,
   Error **errp)
 {
 char combined_key[QIO_CHANNEL_WEBSOCK_CLIENT_KEY_LEN +
@@ -361,8 +369,13 @@ static void qio_channel_websock_handshake_send_res_ok(QIOChannelWebsock *ioc,
 }
 
 date = qio_channel_websock_date_str();
-qio_channel_websock_handshake_send_res(
-ioc, QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_OK, date, accept);
+if (use_protocols) {
+qio_channel_websock_handshake_send_res(
+ioc, QIO_CHANNEL_WEBSOCK_HANDSHAKE_WITH_PROTO_RES_OK, date, accept);
+} else {
+qio_channel_websock_handshake_send_res(
+ioc, QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_OK, date, accept);
+}
 
 g_free(date);
 g_free(accept);
@@ -387,10 +400,6 @@ static void qio_channel_websock_handshake_process(QIOChannelWebsock *ioc,
 
 protocols = qio_channel_websock_find_header(
 hdrs, nhdrs, QIO_CHANNEL_WEBSOCK_HEADER_PROTOCOL);
-if (!protocols) {
-error_setg(errp, "Missing websocket protocol header data");
-goto bad_request;
-}
 
 version = qio_channel_websock_find_header(
 hdrs, nhdrs, QIO_CHANNEL_WEBSOCK_HEADER_VERSION);
@@ -430,10 +439,12 @@ static void qio_channel_websock_handshake_process(QIOChannelWebsock *ioc,
 trace_qio_channel_websock_http_request(ioc, protocols, version,
host, connection, upgrade, key);
 
-if (!g_strrstr(protocols, QIO_CHANNEL_WEBSOCK_PROTOCOL_BINARY)) {
-error_setg(errp, "No '%s' protocol is supported by client '%s'",
-   QIO_CHANNEL_WEBSOCK_PROTOCOL_BINARY, protocols);
-goto bad_request;
+if (protocols) {
+if (!g_strrstr(protocols, QIO_CHANNEL_WEBSOCK_PROTOCOL_BINARY)) {
+error_setg(errp, "No '%s' protocol is supported by client '%s'",
+   QIO_CHANNEL_WEBSOCK_PROTOCOL_BINARY, protocols);
+goto bad_request;
+}
 }
 
 if (!g_str_equal(version, QIO_CHANNEL_WEBSOCK_SUPPORTED_VERSION)) {
@@ -467,7 +478,7 @@ static void qio_channel_websock_handshake_process(QIOChannelWebsock *ioc,
 goto bad_request;
 }
 
-qio_channel_websock_handshake_send_res_ok(ioc, key, errp);
+qio_channel_websock_handshake_send_res_ok(ioc, key, !!protocols, errp);
 return;
 
  bad_request:
-- 
2.17.1
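
The decision this patch adds to the handshake can be sketched outside
QEMU as below. This is only an illustration under stated assumptions:
the names WsDecision, ws_negotiate and ws_reply_has_protocol are
hypothetical and do not exist in io/channel-websock.c, and plain
strstr() stands in for the g_strrstr() substring check used by the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of QEMU. */
typedef enum {
    WS_REJECT,        /* sub-protocols were offered, but not "binary" */
    WS_ACCEPT_PLAIN,  /* no Sec-WebSocket-Protocol header was sent */
    WS_ACCEPT_BINARY  /* "binary" was offered and will be echoed back */
} WsDecision;

/*
 * 'protocols' is the value of the client's Sec-WebSocket-Protocol
 * header, or NULL when the header is absent.
 */
static WsDecision ws_negotiate(const char *protocols)
{
    if (protocols == NULL) {
        return WS_ACCEPT_PLAIN;
    }
    /* Substring search, mirroring the check kept by the patch. */
    if (strstr(protocols, "binary") == NULL) {
        return WS_REJECT;
    }
    return WS_ACCEPT_BINARY;
}

/* Should the 101 reply carry "Sec-WebSocket-Protocol: binary"? */
static bool ws_reply_has_protocol(WsDecision d)
{
    assert(d != WS_REJECT);  /* rejected handshakes get a 400 instead */
    return d == WS_ACCEPT_BINARY;
}
```

A client that sends no header is accepted and gets a reply without the
protocol line; a client that offers "binary" gets it echoed back; a
client that offers only other sub-protocols is still rejected.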

Re: [PATCH 0/6] Enable Travis builds on arm64, ppc64le and s390x

2019-11-22 Thread Alex Bennée



Thomas Huth  writes:

> Travis recently added build hosts for arm64, ppc64le and s390x, so
> this is a welcome addition to our Travis testing matrix.
>
> Unfortunately, the builds are running in quite restricted LXD containers
> there, for example it is not possible to create huge files there (even
> if they are just sparse), and certain system calls are blocked. So we
> have to change some tests first to stop them failing in such environments.

>   iotests: Skip test 060 if it is not possible to create large files
>   iotests: Skip test 079 if it is not possible to create large files

It seems like 161 is also failing:

  https://travis-ci.org/stsquad/qemu/jobs/615672478


>   tests/hd-geo-test: Skip test when images can not be created
>   tests/test-util-filemonitor: Skip test on non-x86 Travis containers
>   travis.yml: Enable builds on arm64, ppc64le and s390x
>
>  .travis.yml   | 85 ++-
>  tests/hd-geo-test.c   | 12 -
>  tests/qemu-iotests/060|  6 +++
>  tests/qemu-iotests/079|  6 +++
>  tests/test-util-filemonitor.c | 11 +
>  5 files changed, 118 insertions(+), 2 deletions(-)


-- 
Alex Bennée

Re: [PATCH v2 4/5] MAINTAINERS: Adjust maintainership for R4000 systems

2019-11-22 Thread Hervé Poussineau


Le 22/11/2019 à 16:29, Philippe Mathieu-Daudé a écrit :

On 11/22/19 3:14 PM, Aleksandar Markovic wrote:

On Fri, Nov 22, 2019 at 2:58 PM Philippe Mathieu-Daudé
 wrote:


Hi Aleksandar,

On 11/13/19 2:47 PM, Aleksandar Markovic wrote:

From: Aleksandar Markovic 

Change the maintainership for R4000 systems to improve its quality.

Acked-by: Aurelien Jarno 
Signed-off-by: Aleksandar Markovic 
---
   MAINTAINERS | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6afec32..ba9ca98 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -971,8 +971,9 @@ F: hw/mips/mips_mipssim.c
   F: hw/net/mipsnet.c

   R4000
-M: Aurelien Jarno 
-R: Aleksandar Rikalo 
+M: Hervé Poussineau 


Commit 0c10962a033 from Hervé was part of a bigger refactor series, so I
don't think he is interested.


+R: Aurelien Jarno 
+R: Philippe Mathieu-Daudé 
   S: Maintained
   F: hw/mips/mips_r4k.c


Now back to this board, I am having a hard time understanding what it
models. IIUC it predates the Malta board, and was trying to model a
board able to run the first MIPS CPU when the port was added in 2005
(see commit 6af0bf9c7c3a).
The Malta board was added one year later (commit 5856de800df) and models
real hardware.

As Aurelien has acked stepping down from maintaining it, this seems the
perfect time to start its deprecation process. I'll prepare a patch for 5.0
(unless someone is really using it and willing to maintain it).



Philippe, hi.

Herve told me a while ago that he does care about R4000 being
supported, as it is closely related to Jazz machines, so please
don't start any deprecation process.


I think what Hervé meant to say is that he cares about the R4000 CPU
(implementing the MIPS III architecture). The Magnum and Pica boards
indeed use an R4000 CPU. I also personally care about this CPU, and
don't want it to disappear.


Here we are talking about some Frankenstein board. QEMU aims to model
real hardware, with the exception of the 'Virt' boards, which have
specifications. Here I can't find any. I am not against Hervé
maintaining this file if he has some interest in it, but I think there
is some confusion and we are talking about two different topics.


Philippe is right.
I care about Magnum/PICA boards (which have a R4000 cpu).
I don't care about the mips_r4k.c machine, and I think that deprecating
it is the right thing to do.




Herve is the most familiar of all of us with R4000, and, for that
reason, my suggestion is to keep the patch as it is. Let me know
if you have any objections.

One alternative approach would be to merge "R4000" and
"Jazz" sections. But, let's leave it for future as an option,
if nobody objects.


Jazz and mips_r4k machines have mostly nothing in common, except using a R4000 
CPU and an ISA bus.

Regards,

Hervé

Re: [PATCH] riscv: virt: Allow PCI address 0

2019-11-22 Thread Palmer Dabbelt


On Fri, 22 Nov 2019 07:27:52 PST (-0800), bmeng...@gmail.com wrote:

When testing e1000 with the virt machine, e1000's I/O space cannot
be accessed. Debugging shows that the I/O BAR (BAR1) is correctly
written with address 0 plus I/O enable bit, but QEMU's "info pci"
shows that:

  Bus  0, device   1, function 0:
Ethernet controller: PCI device 8086:100e
  ...
  BAR1: I/O at 0x [0x003e].
  ...

It turns out we should set pci_allow_0_address to true to allow 0
PCI address, otherwise pci_bar_address() treats such address as
PCI_BAR_UNMAPPED.

Signed-off-by: Bin Meng 
---

 hw/riscv/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 23f340d..411bef5 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -603,6 +603,7 @@ static void riscv_virt_machine_class_init(ObjectClass *oc, 
void *data)
 mc->init = riscv_virt_board_init;
 mc->max_cpus = 8;
 mc->default_cpu_type = VIRT_CPU;
+mc->pci_allow_0_address = true;
 }

 static const TypeInfo riscv_virt_machine_typeinfo = {


Reviewed-by: Palmer Dabbelt 

I've put this on for-next, as I don't think this is 4.2 material.

Thanks!
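
For readers unfamiliar with the flag, the effect described in the commit
message can be modelled as below. This is a simplified sketch, not
QEMU's pci_bar_address(): the function model_io_bar_address and the
BAR_* constants are hypothetical stand-ins, and only I/O BARs are
covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BAR_UNMAPPED  UINT64_MAX  /* stand-in for PCI_BAR_UNMAPPED */
#define BAR_SPACE_IO  0x1u        /* bit 0 set marks an I/O space BAR */
#define BAR_IO_MASK   (~0x3u)     /* low two bits of an I/O BAR are flags */

/*
 * Decode the base address held in an I/O BAR. When allow_0_address is
 * false, a base of 0 is reported as unmapped, which is the behaviour
 * the commit message describes for machines that do not set
 * pci_allow_0_address.
 */
static uint64_t model_io_bar_address(uint32_t bar, bool allow_0_address)
{
    assert(bar & BAR_SPACE_IO);  /* this sketch handles I/O BARs only */

    uint64_t addr = bar & BAR_IO_MASK;
    if (addr == 0 && !allow_0_address) {
        return BAR_UNMAPPED;
    }
    return addr;
}
```

With the flag off, a guest that programs the BAR with address 0 plus the
I/O bit (value 0x1) ends up with an unmapped region; with the flag on,
the same value maps at address 0.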

Re: [PATCH for-5.0 v11 00/20] VIRTIO-IOMMU device

2019-11-22 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20191122182943.4656-1-eric.au...@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  x86_64-softmmu/hw/virtio/virtio-scsi-pci.o
  CC  x86_64-softmmu/hw/virtio/virtio-blk-pci.o
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c: In function 'int_cmp':
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:697:5: error: unknown type name 
'uint'; did you mean 'guint'?
 uint ua = GPOINTER_TO_UINT(a);
 ^~~~
 guint
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:698:5: error: unknown type name 
'uint'; did you mean 'guint'?
 uint ub = GPOINTER_TO_UINT(b);
 ^~~~
 guint
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/virtio/virtio-iommu.o] Error 1
make[1]: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/accel/tcg/tcg-runtime.o
  CC  aarch64-softmmu/accel/tcg/tcg-runtime-gvec.o
---
  CC  aarch64-softmmu/hw/block/dataplane/virtio-blk.o
  CC  aarch64-softmmu/hw/char/exynos4210_uart.o
  CC  aarch64-softmmu/hw/char/omap_uart.o
make: *** [Makefile:491: x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/hw/char/digic-uart.o
  CC  aarch64-softmmu/hw/char/stm32f2xx_usart.o
---
  CC  aarch64-softmmu/hw/arm/boot.o
  CC  aarch64-softmmu/hw/arm/sysbus-fdt.o
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c: In function 'int_cmp':
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:697:5: error: unknown type name 
'uint'; did you mean 'guint'?
 uint ua = GPOINTER_TO_UINT(a);
 ^~~~
 guint
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:698:5: error: unknown type name 
'uint'; did you mean 'guint'?
 uint ub = GPOINTER_TO_UINT(b);
 ^~~~
 guint
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/virtio/virtio-iommu.o] Error 1
make[1]: *** Waiting for unfinished jobs
make: *** [Makefile:491: aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in 
sys.exit(main())
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=028df3325d7c4927a1c334040016728f', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-cuthbude/src/docker-src.2019-11-22-16.49.52.5210:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=028df3325d7c4927a1c334040016728f
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-cuthbude/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    6m21.012s
user    0m8.319s


The full log is available at
http://patchew.org/logs/20191122182943.4656-1-eric.au...@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com
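
The failure above comes from 'uint', which is not a standard C type and
is not available in the mingw environment; the compiler suggests GLib's
guint. A comparator of the same shape can be sketched with standard
types only. This is an assumption-laden illustration: ptr_to_uint is a
hypothetical stand-in for GLib's GPOINTER_TO_UINT(), and int_cmp_sketch
is not the code from the series.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for GPOINTER_TO_UINT(); not the GLib macro. */
static unsigned int ptr_to_uint(const void *p)
{
    /* Keys are small integers stored in pointers, so they must fit. */
    assert((uintptr_t)p <= UINT_MAX);
    return (unsigned int)(uintptr_t)p;
}

/* Three-way compare of two integer keys that are stored as pointers. */
static int int_cmp_sketch(const void *a, const void *b, void *user_data)
{
    (void)user_data;  /* unused, kept to match a tree-compare signature */

    unsigned int ua = ptr_to_uint(a);
    unsigned int ub = ptr_to_uint(b);

    /* Avoids the overflow that a plain 'ua - ub' could produce. */
    return (ua > ub) - (ua < ub);
}
```

In the real file the likely fix is simply to declare the two locals as
guint, as the compiler hint suggests.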

Re: [PATCH] 9pfs: Fix divide by zero bug

2019-11-22 Thread Christian Schoenebeck

On Freitag, 22. November 2019 21:00:34 CET Dan Schatzberg wrote:
> Some filesystems may return 0s in statfs (trivially, a FUSE filesystem
> can do so). QEMU should handle this gracefully and just behave the
> same as if statfs failed.

Is that actually legal in non-error cases? Shouldn't a driver without a block 
size concept return 512 according to POSIX?

> Signed-off-by: Dan Schatzberg 
> ---
>  hw/9pfs/9p.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 37abcdb71e..520177f40c 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1834,8 +1834,10 @@ static int32_t coroutine_fn get_iounit(V9fsPDU *pdu, V9fsPath *path)
>   * and as well as less than (client msize - P9_IOHDRSZ))
>   */
>  if (!v9fs_co_statfs(pdu, path, &stbuf)) {
> -iounit = stbuf.f_bsize;
> -iounit *= (s->msize - P9_IOHDRSZ)/stbuf.f_bsize;
> +if (stbuf.f_bsize) {
> +iounit = stbuf.f_bsize;
> +iounit *= (s->msize - P9_IOHDRSZ) / stbuf.f_bsize;
> +}
>  }
>  if (!iounit) {
>  iounit = s->msize - P9_IOHDRSZ;

Nevertheless, since that will leave iounit initialized with zero and since 
there is already an !iounit case handling there ...

Acked-by: Christian Schoenebeck 

Best regards,
Christian Schoenebeck

Re: [PATCH] linux-user: fix translation of statx structures

2019-11-22 Thread Ariadne Conill

Hello,

On Fri, Nov 22, 2019 at 12:27 PM Aleksandar Markovic
 wrote:
>
> On Fri, Nov 22, 2019 at 7:22 PM Ariadne Conill  
> wrote:
> >
> > All timestamps were copied to atime instead of to their respective
> > fields.
> >
> > Signed-off-by: Ariadne Conill 
> > ---
>
> What a bug.

Yes, in an Alpine qemu+binfmt_misc+chroot environment, this bug caused
all files to have an observed mtime set to the UNIX epoch, which
caused problems with building Autoconf-based applications.  This
really irked me so I took the time to dig into it.

>
> Laurent, perhaps a good candidate for 4.2?
>
> Thanks for submitting this, Ariadne Conill!

Not a problem.

Ariadne

[PATCH] 9pfs: Fix divide by zero bug

2019-11-22 Thread Dan Schatzberg

Some filesystems may return 0s in statfs (trivially, a FUSE filesystem
can do so). QEMU should handle this gracefully and just behave the
same as if statfs failed.

Signed-off-by: Dan Schatzberg 
---
 hw/9pfs/9p.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 37abcdb71e..520177f40c 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1834,8 +1834,10 @@ static int32_t coroutine_fn get_iounit(V9fsPDU *pdu, V9fsPath *path)
  * and as well as less than (client msize - P9_IOHDRSZ))
  */
 if (!v9fs_co_statfs(pdu, path, &stbuf)) {
-iounit = stbuf.f_bsize;
-iounit *= (s->msize - P9_IOHDRSZ)/stbuf.f_bsize;
+if (stbuf.f_bsize) {
+iounit = stbuf.f_bsize;
+iounit *= (s->msize - P9_IOHDRSZ) / stbuf.f_bsize;
+}
 }
 if (!iounit) {
 iounit = s->msize - P9_IOHDRSZ;
-- 
2.17.1
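
The arithmetic guarded by this patch can be shown in isolation. The
sketch below is not the QEMU function: sketch_iounit is a hypothetical
name, and msize, iohdrsz and f_bsize are passed in as plain parameters
instead of being read from the 9p state and a statfs call.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick an I/O unit: the largest multiple of the filesystem block size
 * that fits in the message payload, or the whole payload when the
 * block size is unusable.
 */
static int32_t sketch_iounit(int32_t msize, int32_t iohdrsz, uint64_t f_bsize)
{
    assert(msize > iohdrsz);  /* the payload must be non-empty */

    int32_t payload = msize - iohdrsz;
    int32_t iounit = 0;

    /* A block size of 0 would otherwise divide by zero here. */
    if (f_bsize != 0) {
        iounit = (int32_t)(f_bsize * ((uint64_t)payload / f_bsize));
    }
    /* Same fallback as a failed statfs: use the full payload. */
    if (iounit == 0) {
        iounit = payload;
    }
    return iounit;
}
```

For example, with a payload of 8168 bytes (msize 8192 and a 24 byte
header, both chosen only for illustration), a 4096 byte block size gives
4096, while a block size of 0 falls back to 8168.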

Re: [PATCH v2 5/5] MAINTAINERS: Add two files to Malta section

2019-11-22 Thread Aleksandar Markovic

On Fri, Nov 22, 2019 at 9:28 PM Philippe Mathieu-Daudé
 wrote:
>
> On 11/13/19 2:47 PM, Aleksandar Markovic wrote:
> > From: Aleksandar Markovic 
> >
> > Add two files that were recently introduced in a refactoring,
> > that Malta emulation relies on. They are added by this patch
> > to Malta section, but they are not added to the general MIPS
> > section, since they are really not MIPS-specific, and there
> > may be some non-MIPS hardware using them in future.
> >
> > Signed-off-by: Aleksandar Markovic 
> > ---
> >   MAINTAINERS | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index ba9ca98..f8a1646 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -959,8 +959,10 @@ M: Philippe Mathieu-Daudé 
> >   R: Hervé Poussineau 
> >   R: Aurelien Jarno 
> >   S: Maintained
> > +F: hw/isa/piix4.c
>
> Maybe:
>
> F: hw/*/piix4.c
>
> Or also add:
>
> F: hw/acpi/piix4.c
>

Philippe, are you sure? hw/acpi/piix4.c is used in PC, not in Malta, no?

> >   F: hw/mips/mips_malta.c
> >   F: hw/mips/gt64xxx_pci.c
> > +F: include/hw/southbridge/piix.h
> >   F: tests/acceptance/linux_ssh_mips_malta.py
> >
> >   Mipssim
> >
>
> Reviewed-by: Philippe Mathieu-Daudé 
>
>

Re: [PATCH] linux-user: fix translation of statx structures

2019-11-22 Thread Laurent Vivier

Le 22/11/2019 à 18:40, Ariadne Conill a écrit :
> All timestamps were copied to atime instead of to their respective
> fields.
> 
> Signed-off-by: Ariadne Conill 
> ---
>  linux-user/syscall.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index ce399a55f0..171c0caef3 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -6743,12 +6743,12 @@ static inline abi_long host_to_target_statx(struct target_statx *host_stx,
>  __put_user(host_stx->stx_attributes_mask, &target_stx->stx_attributes_mask);
>  __put_user(host_stx->stx_atime.tv_sec, &target_stx->stx_atime.tv_sec);
>  __put_user(host_stx->stx_atime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> +__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_btime.tv_sec);
> +__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_btime.tv_nsec);
> +__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_ctime.tv_sec);
> +__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_ctime.tv_nsec);
> +__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_mtime.tv_sec);
> +__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_mtime.tv_nsec);
>  __put_user(host_stx->stx_rdev_major, &target_stx->stx_rdev_major);
>  __put_user(host_stx->stx_rdev_minor, &target_stx->stx_rdev_minor);
>  __put_user(host_stx->stx_dev_major, &target_stx->stx_dev_major);
> 

Reviewed-by: Laurent Vivier
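
The effect of the bug can be reproduced with a reduced model. The
structs and function names below are hypothetical and much smaller than
the real statx layout; the sketch assumes the target buffer starts out
zeroed, which is consistent with the epoch mtime reported in this
thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduced stand-ins for the statx timestamp fields; sketch only. */
struct sketch_timestamp { int64_t tv_sec; uint32_t tv_nsec; };
struct sketch_statx { struct sketch_timestamp atime, btime, ctime, mtime; };

/* Buggy variant: every timestamp lands in atime, as before the fix. */
static void copy_times_buggy(const struct sketch_statx *host,
                             struct sketch_statx *target)
{
    assert(host != NULL && target != NULL);
    memset(target, 0, sizeof(*target));
    target->atime = host->atime;
    target->atime = host->btime;
    target->atime = host->ctime;
    target->atime = host->mtime;  /* last write wins; mtime stays 0 */
}

/* Fixed variant: each timestamp goes to its own field. */
static void copy_times_fixed(const struct sketch_statx *host,
                             struct sketch_statx *target)
{
    assert(host != NULL && target != NULL);
    memset(target, 0, sizeof(*target));
    target->atime = host->atime;
    target->btime = host->btime;
    target->ctime = host->ctime;
    target->mtime = host->mtime;
}
```

With the buggy variant the guest sees mtime, ctime and btime as zero
(the UNIX epoch) and atime holding the host's mtime.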

Re: [PATCH v2 5/5] MAINTAINERS: Add two files to Malta section

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/13/19 2:47 PM, Aleksandar Markovic wrote:

From: Aleksandar Markovic 

Add two files that were recently introduced in a refactoring,
that Malta emulation relies on. They are added by this patch
to Malta section, but they are not added to the general MIPS
section, since they are really not MIPS-specific, and there
may be some non-MIPS hardware using them in future.

Signed-off-by: Aleksandar Markovic 
---
  MAINTAINERS | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ba9ca98..f8a1646 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -959,8 +959,10 @@ M: Philippe Mathieu-Daudé 
  R: Hervé Poussineau 
  R: Aurelien Jarno 
  S: Maintained
+F: hw/isa/piix4.c


Maybe:

   F: hw/*/piix4.c

Or also add:

   F: hw/acpi/piix4.c


  F: hw/mips/mips_malta.c
  F: hw/mips/gt64xxx_pci.c
+F: include/hw/southbridge/piix.h
  F: tests/acceptance/linux_ssh_mips_malta.py
  
  Mipssim




Reviewed-by: Philippe Mathieu-Daudé

Re: [RFC PATCH-for-5.0] hw/pci-host: Add Kconfig selector for IGD PCIe pass-through

2019-11-22 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20191122172201.11456-1-phi...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  LINK    aarch64-softmmu/qemu-system-aarch64
hw/vfio/pci.o: In function `vfio_realize':
/tmp/qemu-test/src/hw/vfio/pci.c:2949: undefined reference to 
`vfio_pci_igd_opregion_init'
collect2: error: ld returned 1 exit status
make[1]: *** [qemu-system-x86_64] Error 1
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs
hw/vfio/pci.o: In function `vfio_realize':
/tmp/qemu-test/src/hw/vfio/pci.c:2949: undefined reference to 
`vfio_pci_igd_opregion_init'
collect2: error: ld returned 1 exit status
make[1]: *** [qemu-system-aarch64] Error 1
make: *** [aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in 
sys.exit(main())
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=57b8f6e005964f0cb505a02139414e88', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-o78ntsrk/src/docker-src.2019-11-22-15.23.40.28372:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=57b8f6e005964f0cb505a02139414e88
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-o78ntsrk/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m36.282s
user    0m7.884s


The full log is available at
http://patchew.org/logs/20191122172201.11456-1-phi...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH] linux-user: fix translation of statx structures

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 7:27 PM, Aleksandar Markovic wrote:

On Fri, Nov 22, 2019 at 7:22 PM Ariadne Conill  wrote:


All timestamps were copied to atime instead of to their respective
fields.



Fixes: efa921845c0


Signed-off-by: Ariadne Conill 
---


What a bug.

Laurent, perhaps a good candidate for 4.2?


Agreed.



Thanks for submitting this, Ariadne Conill!


And welcome to QEMU :)


Reviewed-by: Aleksandar Markovic 


Reviewed-by: Philippe Mathieu-Daudé 


  linux-user/syscall.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ce399a55f0..171c0caef3 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6743,12 +6743,12 @@ static inline abi_long host_to_target_statx(struct target_statx *host_stx,
  __put_user(host_stx->stx_attributes_mask, &target_stx->stx_attributes_mask);
  __put_user(host_stx->stx_atime.tv_sec, &target_stx->stx_atime.tv_sec);
  __put_user(host_stx->stx_atime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_atime.tv_nsec);
+__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_btime.tv_sec);
+__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_btime.tv_nsec);
+__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_ctime.tv_sec);
+__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_ctime.tv_nsec);
+__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_mtime.tv_sec);
+__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_mtime.tv_nsec);
  __put_user(host_stx->stx_rdev_major, &target_stx->stx_rdev_major);
  __put_user(host_stx->stx_rdev_minor, &target_stx->stx_rdev_minor);
  __put_user(host_stx->stx_dev_major, &target_stx->stx_dev_major);
--
2.24.0

Re: [PATCH for-5.0 v11 12/20] qapi: Introduce DEFINE_PROP_INTERVAL

2019-11-22 Thread Dr. David Alan Gilbert

* Eric Auger (eric.au...@redhat.com) wrote:
> Introduce a new property defining a labelled interval:
> ,,label.
> 
> This will be used to encode reserved IOVA regions. The label
> is left undefined to ease reuse across use cases.
> 
> For instance, in virtio-iommu use case, reserved IOVA regions
> will be passed by the machine code to the virtio-iommu-pci
> device (an array of those). The label will match the
> virtio_iommu_probe_resv_mem subtype value:
> - VIRTIO_IOMMU_RESV_MEM_T_RESERVED (0)
> - VIRTIO_IOMMU_RESV_MEM_T_MSI (1)
> 
> This is used to inform the virtio-iommu-pci device it should
> bypass the MSI region: 0xfee0, 0xfeef, 1.
> 
> Signed-off-by: Eric Auger 
> ---
>  hw/core/qdev-properties.c| 90 
>  include/exec/memory.h|  6 +++
>  include/hw/qdev-properties.h |  3 ++
>  include/qemu/typedefs.h  |  1 +
>  4 files changed, 100 insertions(+)
> 
> diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
> index ac28890e5a..8d70f34e37 100644
> --- a/hw/core/qdev-properties.c
> +++ b/hw/core/qdev-properties.c
> @@ -13,6 +13,7 @@
>  #include "qapi/visitor.h"
>  #include "chardev/char.h"
>  #include "qemu/uuid.h"
> +#include "qemu/cutils.h"
>  
>  void qdev_prop_set_after_realize(DeviceState *dev, const char *name,
>Error **errp)
> @@ -585,6 +586,95 @@ const PropertyInfo qdev_prop_macaddr = {
>  .set   = set_mac,
>  };
>  
> +/* --- Labelled Interval --- */
> +
> +/*
> + * accepted syntax versions:
> + *   ,,
> + *   where low/high addresses are uint64_t in hexa (feat. 0x prefix)
> + *   and type is an unsigned integer
> + */
> +static void get_interval(Object *obj, Visitor *v, const char *name,
> + void *opaque, Error **errp)
> +{
> +DeviceState *dev = DEVICE(obj);
> +Property *prop = opaque;
> +Interval *interval = qdev_get_prop_ptr(dev, prop);
> +char buffer[64];
> +char *p = buffer;
> +
> +snprintf(buffer, sizeof(buffer), "0x%"PRIx64",0x%"PRIx64",%d",
> + interval->low, interval->high, interval->type);
> +
> +visit_type_str(v, name, &p, errp);
> +}
> +
> +static void set_interval(Object *obj, Visitor *v, const char *name,
> + void *opaque, Error **errp)
> +{
> +DeviceState *dev = DEVICE(obj);
> +Property *prop = opaque;
> +Interval *interval = qdev_get_prop_ptr(dev, prop);
> +Error *local_err = NULL;
> +unsigned int type;
> +gchar **fields;
> +uint64_t addr;
> +char *str;
> +int ret;
> +
> +if (dev->realized) {
> +qdev_prop_set_after_realize(dev, name, errp);
> +return;
> +}
> +
> +visit_type_str(v, name, &str, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +
> +fields = g_strsplit(str, ",", 3);
> +
> +ret = qemu_strtou64(fields[0], NULL, 16, &addr);
> +if (!ret) {
> +interval->low = addr;
> +} else {
> +error_setg(errp, "Failed to decode interval low addr");
> +error_append_hint(errp,
> +  "should be an address in hexa with 0x prefix\n");
> +goto out;
> +}
> +
> +ret = qemu_strtou64(fields[1], NULL, 16, &addr);
> +if (!ret) {
> +interval->high = addr;
> +} else {
> +error_setg(errp, "Failed to decode interval high addr");
> +error_append_hint(errp,
> +  "should be an address in hexa with 0x prefix\n");
> +goto out;
> +}
> +
> +ret = qemu_strtoui(fields[2], NULL, 10, &type);
> +if (!ret) {
> +interval->type = type;
> +} else {
> +error_setg(errp, "Failed to decode interval type");
> +error_append_hint(errp, "should be an unsigned int in decimal\n");
> +}
> +out:
> +g_free(str);
> +g_strfreev(fields);
> +return;
> +}
> +
> +const PropertyInfo qdev_prop_interval = {
> +.name  = "labelled_interval",
> +.description = "Labelled interval, example: 0xFEE0,0xFEEF,0",
> +.get   = get_interval,
> +.set   = set_interval,
> +};
> +
>  /* --- on/off/auto --- */
>  
>  const PropertyInfo qdev_prop_on_off_auto = {
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index e499dc215b..e238d1c352 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -57,6 +57,12 @@ struct MemoryRegionMmio {
>  CPUWriteMemoryFunc *write[3];
>  };
>  
> +struct Interval {
> +hwaddr low;
> +hwaddr high;
> +unsigned int type;
> +};
> +

Just an observation that 'Interval' is a very generic name.
We've got 'AddrRange' but that's Int128.

Dave

>  typedef struct IOMMUTLBEntry IOMMUTLBEntry;
>  
>  /* See address_space_translate: bit 0 is read, bit 1 is write.  */
> diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
> index c6a8cb5516..2ba7c8711b 100644
> --- a/include/hw/qdev-properties.h
> +++ b/include/hw/qdev-properties.h
> @@ -20,6 +20,7 @@ extern
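
The three-field syntax handled by set_interval() above (two hexadecimal
addresses and a decimal type, separated by commas) can be parsed with
the C standard library alone. The sketch below is an independent
illustration, not the code from the patch: sketch_interval, parse_field
and parse_interval are hypothetical names, and it is stricter than the
quoted code in that it rejects input with fewer than three fields.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_interval {
    uint64_t low;
    uint64_t high;
    unsigned int type;
};

/* Parse one complete field as an unsigned number in the given base. */
static bool parse_field(const char *s, int base, uint64_t *out)
{
    char *end = NULL;
    unsigned long long v;

    /* Reject empty fields, signs and leading whitespace up front. */
    if (!isxdigit((unsigned char)s[0])) {
        return false;
    }
    errno = 0;
    v = strtoull(s, &end, base);  /* base 16 accepts an optional 0x */
    if (errno != 0 || end == s || *end != '\0') {
        return false;
    }
    *out = (uint64_t)v;
    return true;
}

/* Parse "low,high,type"; *iv is written only on success. */
static bool parse_interval(const char *str, struct sketch_interval *iv)
{
    char buf[64];
    char *high, *type;
    uint64_t lo, hi, ty;
    size_t len;

    assert(str != NULL && iv != NULL);

    len = strlen(str);
    if (len >= sizeof(buf)) {
        return false;
    }
    memcpy(buf, str, len + 1);

    /* Split in place at the first two commas. */
    high = strchr(buf, ',');
    if (high == NULL) {
        return false;
    }
    *high++ = '\0';
    type = strchr(high, ',');
    if (type == NULL) {
        return false;
    }
    *type++ = '\0';

    if (!parse_field(buf, 16, &lo) || !parse_field(high, 16, &hi) ||
        !parse_field(type, 10, &ty) || ty > UINT_MAX) {
        return false;
    }
    iv->low = lo;
    iv->high = hi;
    iv->type = (unsigned int)ty;
    return true;
}
```

A string such as "0x1000,0x1fff,1" (values chosen arbitrarily) yields
low 0x1000, high 0x1fff and type 1; a string with only two fields is
rejected.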

Re: [RFC PATCH-for-5.0] hw/pci-host: Add Kconfig selector for IGD PCIe pass-through

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 7:03 PM, Thomas Huth wrote:

On 22/11/2019 18.22, Philippe Mathieu-Daudé wrote:

Introduce a kconfig selector to allow builds without Intel
Integrated Graphics Device GPU PCIe passthrough.
We keep the default as enabled.

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC because to be able to use the Kconfig-generated
"config-devices.h" header we have to move this device
out of $common-obj and build i440fx.o on a per-target
basis, which is not optimal...


IMHO you should move the code out of i440fx.o and into a separate file
if possible. That's hopefully cleaner than #ifdeffing here, and you
hopefully only need to move the new code into "obj-" and can keep
i440fx.o in common-obj.


Correct. I wanted to try a surgical patch first ;)

Re: [PATCH v2 0/9] RFC: [for 5.0]: HMP monitor handlers cleanups

2019-11-22 Thread Dr. David Alan Gilbert

* Maxim Levitsky (mlevi...@redhat.com) wrote:
> This patch series is a bunch of cleanups
> to the hmp monitor code.
> 
> This series only touched blockdev related hmp handlers.
> 
> No functional changes expected other than
> light error message changes by the last patch.
> 
> This was inspired by this bugzilla:
> https://bugzilla.redhat.com/show_bug.cgi?id=1719169
> 
> Basically some users still parse hmp error messages,
> and they would like to have them prefixed with 'Error:'
> 
> In commit 66363e9a43f649360a3f74d2805c9f864da027eb we added
> the hmp_handle_error which does exactly that but some hmp handlers
> don't use it.
> 
> In this patch series, I moved all the block related hmp handlers
> into blockdev-hmp-cmds.c, and then made them use this function
> to report the errors.
> 
> I hope I didn't change too much code, I just felt that if
> I touch this code, I can also make it easier to find these
> handlers, that were scattered over 3 different files.
> 
> Changes from V1:
>* move the handlers to block/monitor/block-hmp-cmds.c
>* tiny cleanup for the commit messages

OK, so again, from the HMP side:

Reviewed-by: Dr. David Alan Gilbert 

> Best regards,
>   Maxim Levitsky
> 
> Maxim Levitsky (9):
>   monitor/hmp: uninline add_init_drive
>   monitor/hmp: rename device-hotplug.c to block/monitor/block-hmp-cmds.c
>   monitor/hmp: move hmp_drive_del and hmp_commit to block-hmp-cmds.c
>   monitor/hmp: move hmp_drive_mirror and hmp_drive_backup to
> block-hmp-cmds.c
>   monitor/hmp: move hmp_block_job* to block-hmp-cmds.c
>   monitor/hmp: move hmp_snapshot_* to block-hmp-cmds.c
>   monitor/hmp: move remaining hmp_block* functions to block-hmp-cmds.c
>   monitor/hmp: move hmp_info_block* to block-hmp-cmds.c
>   monitor/hmp: Prefer to use hmp_handle_error for error reporting in
> block hmp commands
> 
>  MAINTAINERS|   1 +
>  Makefile.objs  |   2 +-
>  block/Makefile.objs|   1 +
>  block/monitor/Makefile.objs|   1 +
>  block/monitor/block-hmp-cmds.c | 656 +
>  blockdev.c |  95 -
>  device-hotplug.c   |  91 -
>  monitor/hmp-cmds.c | 465 ---
>  8 files changed, 660 insertions(+), 652 deletions(-)
>  create mode 100644 block/monitor/Makefile.objs
>  create mode 100644 block/monitor/block-hmp-cmds.c
>  delete mode 100644 device-hotplug.c
> 
> -- 
> 2.17.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[PATCH for-5.0 v11 19/20] pc: Add support for virtio-iommu-pci

2019-11-22 Thread Eric Auger

The virtio-iommu-pci is instantiated through the -device QEMU
option. However if instantiated it also requires an IORT ACPI table
to describe the ID mappings between the root complex and the iommu.

This patch adds the generation of the IORT table if the
virtio-iommu-pci device is instantiated.

We also declare the [0xfee0 - 0xfeef] MSI reserved region
so that it gets bypassed by the IOMMU.

Signed-off-by: Eric Auger 
---
 hw/i386/acpi-build.c | 72 
 hw/i386/pc.c | 15 -
 include/hw/i386/pc.h |  2 ++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 12ff55fcfb..f09cabdcae 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2744,6 +2744,72 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 return true;
 }
 
+static inline void
+fill_iort_idmap(AcpiIortIdMapping *idmap, int i,
+uint32_t input_base, uint32_t id_count,
+uint32_t output_base, uint32_t output_reference)
+{
+idmap[i].input_base = cpu_to_le32(input_base);
+idmap[i].id_count = cpu_to_le32(id_count);
+idmap[i].output_base = cpu_to_le32(output_base);
+idmap[i].output_reference = cpu_to_le32(output_reference);
+}
+
+static void
+build_iort(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
+{
+size_t iommu_node_size, rc_node_size, iommu_node_offset;
+int iort_start = table_data->len;
+AcpiIortPVIommuPCI *iommu;
+AcpiIortIdMapping *idmap;
+AcpiIortTable *iort;
+size_t iort_length;
+AcpiIortRC *rc;
+
+iort = acpi_data_push(table_data, sizeof(*iort));
+iort_length = sizeof(*iort);
+iort->node_count = cpu_to_le32(2);
+
+/* virtio-iommu node */
+
+iommu_node_offset = sizeof(*iort);
+iort->node_offset = cpu_to_le32(iommu_node_offset);
+iommu_node_size = sizeof(*iommu);
+iort_length += iommu_node_size;
+iommu = acpi_data_push(table_data, iommu_node_size);
+iommu->type = ACPI_IORT_NODE_PARAVIRT;
+iommu->length = cpu_to_le16(iommu_node_size);
+iommu->mapping_count = 0;
+iommu->devid = cpu_to_le32(pcms->virtio_iommu_bdf);
+iommu->model = cpu_to_le32(ACPI_IORT_NODE_PV_VIRTIO_IOMMU_PCI);
+
+/* Root Complex Node */
+rc_node_size = sizeof(*rc) + 2 * sizeof(*idmap);
+iort_length += rc_node_size;
+rc = acpi_data_push(table_data, rc_node_size);
+
+rc->type = ACPI_IORT_NODE_PCI_ROOT_COMPLEX;
+rc->length = cpu_to_le16(rc_node_size);
+rc->mapping_count = cpu_to_le32(2);
+rc->mapping_offset = cpu_to_le32(sizeof(*rc));
+
+/* fully coherent device */
+rc->memory_properties.cache_coherency = cpu_to_le32(1);
+rc->memory_properties.memory_flags = 0x3; /* CCA = CPM = DCAS = 1 */
+rc->pci_segment_number = 0; /* MCFG pci_segment */
+fill_iort_idmap(rc->id_mapping_array, 0, 0, pcms->virtio_iommu_bdf, 0,
+iommu_node_offset);
+fill_iort_idmap(rc->id_mapping_array, 1, pcms->virtio_iommu_bdf + 1,
+0x - pcms->virtio_iommu_bdf,
+pcms->virtio_iommu_bdf + 1, iommu_node_offset);
+
+iort = (AcpiIortTable *)(table_data->data + iort_start);
+iort->length = cpu_to_le32(iort_length);
+
+build_header(linker, table_data, (void *)(table_data->data + iort_start),
+ "IORT", table_data->len - iort_start, 0, NULL, NULL);
+}
+
 static
 void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
@@ -2835,6 +2901,12 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 build_slit(tables_blob, tables->linker, machine);
 }
 }
+
+if (pcms->virtio_iommu) {
+acpi_add_table(table_offsets, tables_blob);
+build_iort(tables_blob, tables->linker, pcms);
+}
+
 if (acpi_get_mcfg(&mcfg)) {
 acpi_add_table(table_offsets, tables_blob);
 build_mcfg(tables_blob, tables->linker, &mcfg);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ac08e63604..af984ee041 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -84,6 +84,7 @@
 #include "hw/net/ne2000-isa.h"
 #include "standard-headers/asm-x86/bootparam.h"
 #include "hw/virtio/virtio-pmem-pci.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "hw/mem/memory-device.h"
 #include "sysemu/replay.h"
 #include "qapi/qmp/qerror.h"
@@ -1940,6 +1941,11 @@ static void pc_machine_device_pre_plug_cb(HotplugHandler 
*hotplug_dev,
 pc_cpu_pre_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) {
 pc_virtio_pmem_pci_pre_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+/* we declare a VIRTIO_IOMMU_RESV_MEM_T_MSI region */
+qdev_prop_set_uint32(dev, "len-reserved-regions", 1);
+qdev_prop_set_string(dev, "reserved-regions[0]",
+ "0xfee0, 0xfeef, 1");
 }
 }
 
@@ -1952,6 +1958,12 @@ static void
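
The root complex node in the patch above carries two ID mappings so that the virtio-iommu's own requester ID is left out of the translated range. As a rough illustration of that split only, the sketch below divides the 16-bit PCI requester ID space into the inclusive ranges that remain once one ID is removed. IdRange and split_rid_space are hypothetical names, and the sketch deliberately does not model how the IORT specification encodes an ID count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RID_LAST 0xffffu   /* PCI requester IDs are 16 bits wide */

/* Hypothetical inclusive ID range; not the ACPI IORT structure layout. */
typedef struct {
    uint32_t first;
    uint32_t last;
} IdRange;

/*
 * Split [0, RID_LAST] into the ranges that remain once 'excluded' is
 * removed. Writes up to two ranges to 'out' and returns how many.
 */
static size_t split_rid_space(uint32_t excluded, IdRange out[2])
{
    size_t n = 0;

    if (excluded > RID_LAST) {
        /* Nothing to exclude: the whole space stays in one range. */
        out[n].first = 0;
        out[n].last = RID_LAST;
        return 1;
    }
    if (excluded > 0) {
        out[n].first = 0;
        out[n].last = excluded - 1;
        n++;
    }
    if (excluded < RID_LAST) {
        out[n].first = excluded + 1;
        out[n].last = RID_LAST;
        n++;
    }
    return n;
}
```

When the excluded ID sits at either end of the space only one range is produced, a corner case that a fixed mapping count of two would not cover.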

[PATCH for-5.0 v11 17/20] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table

2019-11-22 Thread Eric Auger

This patch builds the virtio-iommu node in the ACPI IORT table.

The RID space of the root complex, which spans 0x0-0x1
maps to streamid space 0x0-0x1 in the virtio-iommu which in
turn maps to deviceid space 0x0-0x1 in the ITS group.

The iommu RID is excluded as described in virtio-iommu
specification.

Signed-off-by: Eric Auger 

---
v8 -> v9:
- iommu RID is not fixed anymore

v7 -> v8:
- exclude the iommu RID (0x8) in the root complex ID mapping
---
 hw/arm/virt-acpi-build.c| 50 ++---
 include/hw/acpi/acpi-defs.h | 21 +++-
 2 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 825f3a79c0..1e22cbbbfd 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -386,14 +386,14 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 AcpiIortIdMapping *idmap;
 AcpiIortItsGroup *its;
 AcpiIortTable *iort;
-AcpiIortSmmu3 *smmu;
-size_t node_size, iort_node_offset, iort_length, smmu_offset = 0;
+size_t node_size, iort_node_offset, iort_length, iommu_offset = 0;
 AcpiIortRC *rc;
+int nb_rc_idmappings = 1;
 
 iort = acpi_data_push(table_data, sizeof(*iort));
 
-if (vms->iommu == VIRT_IOMMU_SMMUV3) {
-nb_nodes = 3; /* RC, ITS, SMMUv3 */
+if (vms->iommu) {
+nb_nodes = 3; /* RC, ITS, IOMMU */
 } else {
 nb_nodes = 2; /* RC, ITS */
 }
@@ -419,9 +419,9 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 
 if (vms->iommu == VIRT_IOMMU_SMMUV3) {
 int irq =  vms->irqmap[VIRT_SMMU] + ARM_SPI_BASE;
+AcpiIortSmmu3 *smmu;
 
-/* SMMUv3 node */
-smmu_offset = iort_node_offset + node_size;
+iommu_offset = iort_node_offset + node_size;
 node_size = sizeof(*smmu) + sizeof(*idmap);
 iort_length += node_size;
 smmu = acpi_data_push(table_data, node_size);
@@ -443,16 +443,38 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  */
 fill_iort_idmap(smmu->id_mapping_array, 0, 0, 0x, 0,
 iort_node_offset);
+} else if (vms->iommu == VIRT_IOMMU_VIRTIO) {
+AcpiIortPVIommuPCI *iommu;
+
+nb_rc_idmappings = 2;
+iommu_offset = iort_node_offset + node_size;
+node_size = sizeof(*iommu) + 2 * sizeof(*idmap);
+iort_length += node_size;
+iommu = acpi_data_push(table_data, node_size);
+
+iommu->type = ACPI_IORT_NODE_PARAVIRT;
+iommu->length = cpu_to_le16(node_size);
+iommu->mapping_count = cpu_to_le32(2);
+iommu->mapping_offset = cpu_to_le32(sizeof(*iommu));
+iommu->devid = cpu_to_le32(vms->virtio_iommu_bdf);
+iommu->model = cpu_to_le32(ACPI_IORT_NODE_PV_VIRTIO_IOMMU_PCI);
+
+/*
+ * Identity RID mapping covering the whole input RID range
+ * output IORT node is the ITS group node (the first node)
+ */
+fill_iort_idmap(iommu->id_mapping_array, 0, 0, 0x, 0,
+iort_node_offset);
 }
 
 /* Root Complex Node */
-node_size = sizeof(*rc) + sizeof(*idmap);
+node_size = sizeof(*rc) + nb_rc_idmappings * sizeof(*idmap);
 iort_length += node_size;
 rc = acpi_data_push(table_data, node_size);
 
 rc->type = ACPI_IORT_NODE_PCI_ROOT_COMPLEX;
 rc->length = cpu_to_le16(node_size);
-rc->mapping_count = cpu_to_le32(1);
+rc->mapping_count = cpu_to_le32(nb_rc_idmappings);
 rc->mapping_offset = cpu_to_le32(sizeof(*rc));
 
 /* fully coherent device */
@@ -463,7 +485,17 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 if (vms->iommu == VIRT_IOMMU_SMMUV3) {
 /* Identity RID mapping and output IORT node is the iommu node */
 fill_iort_idmap(rc->id_mapping_array, 0, 0, 0x, 0,
-smmu_offset);
+iommu_offset);
+} else if (vms->iommu == VIRT_IOMMU_VIRTIO) {
+/*
+ * Identity mapping with the IOMMU RID (0x8) excluded. The output
+ * IORT node is the iommu node.
+ */
+fill_iort_idmap(rc->id_mapping_array, 0, 0, vms->virtio_iommu_bdf, 0,
+iommu_offset);
+fill_iort_idmap(rc->id_mapping_array, 1, vms->virtio_iommu_bdf + 1,
+0x - vms->virtio_iommu_bdf,
+vms->virtio_iommu_bdf + 1, iommu_offset);
 } else {
 /*
  * Identity RID mapping and the output IORT node is the ITS group
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 57a3f58b0c..ba06f41fc0 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -581,7 +581,8 @@ enum {
 ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
 ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
 ACPI_IORT_NODE_SMMU = 0x03,
-ACPI_IORT_NODE_SMMU_V3

[PATCH for-5.0 v11 15/20] virtio-iommu-pci: Add array of Interval properties

2019-11-22 Thread Eric Auger

The machine may need to pass reserved regions to the
virtio-iommu-pci device (such as the MSI window on x86).
So let's add an array of Interval properties.

Signed-off-by: Eric Auger 
---
 hw/virtio/virtio-iommu-pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/virtio/virtio-iommu-pci.c b/hw/virtio/virtio-iommu-pci.c
index 4cfae1f9df..280230b31e 100644
--- a/hw/virtio/virtio-iommu-pci.c
+++ b/hw/virtio/virtio-iommu-pci.c
@@ -31,6 +31,9 @@ struct VirtIOIOMMUPCI {
 
 static Property virtio_iommu_pci_properties[] = {
 DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
+DEFINE_PROP_ARRAY("reserved-regions", VirtIOIOMMUPCI,
+  vdev.nb_reserved_regions, vdev.reserved_regions,
+  qdev_prop_interval, Interval),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.20.1

[PATCH for-5.0 v11 18/20] virtio-iommu: Support migration

2019-11-22 Thread Eric Auger

Add migration support. We rely on the recently added gtree and qlist
migration. In addition, we have to fix up the endpoint <-> domain link.

Indeed each domain has a list of endpoints attached to it. And each
endpoint has a pointer to its domain.

Raw gtree and qlist migration cannot handle this as it re-allocates
all the nodes while reconstructing the trees/lists.

So in post_load we re-construct the relationship between endpoints
and domains.

Signed-off-by: Eric Auger 
---
 hw/virtio/virtio-iommu.c | 127 ---
 1 file changed, 117 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index c5b202fab7..4e92fc0c95 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -692,16 +692,6 @@ static void virtio_iommu_set_features(VirtIODevice *vdev, 
uint64_t val)
 trace_virtio_iommu_set_features(dev->acked_features);
 }
 
-/*
- * Migration is not yet supported: most of the state consists
- * of balanced binary trees which are not yet ready for getting
- * migrated
- */
-static const VMStateDescription vmstate_virtio_iommu_device = {
-.name = "virtio-iommu-device",
-.unmigratable = 1,
-};
-
 static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
 {
 uint ua = GPOINTER_TO_UINT(a);
@@ -778,6 +768,123 @@ static void virtio_iommu_instance_init(Object *obj)
 {
 }
 
+#define VMSTATE_INTERVAL   \
+{  \
+.name = "interval",\
+.version_id = 1,   \
+.minimum_version_id = 1,   \
+.fields = (VMStateField[]) {   \
+VMSTATE_UINT64(low, viommu_interval),  \
+VMSTATE_UINT64(high, viommu_interval), \
+VMSTATE_END_OF_LIST()  \
+}  \
+}
+
+#define VMSTATE_MAPPING   \
+{ \
+.name = "mapping",\
+.version_id = 1,  \
+.minimum_version_id = 1,  \
+.fields = (VMStateField[]) {  \
+VMSTATE_UINT64(phys_addr, viommu_mapping),\
+VMSTATE_UINT32(flags, viommu_mapping),\
+VMSTATE_END_OF_LIST() \
+},\
+}
+
+static const VMStateDescription vmstate_interval_mapping[2] = {
+VMSTATE_MAPPING,   /* value */
+VMSTATE_INTERVAL   /* key   */
+};
+
+static int domain_preload(void *opaque)
+{
+viommu_domain *domain = opaque;
+
+domain->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
+   NULL, g_free, g_free);
+return 0;
+}
+
+static const VMStateDescription vmstate_endpoint = {
+.name = "endpoint",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(id, viommu_endpoint),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_domain = {
+.name = "domain",
+.version_id = 1,
+.minimum_version_id = 1,
+.pre_load = domain_preload,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(id, viommu_domain),
+VMSTATE_GTREE_V(mappings, viommu_domain, 1,
+vmstate_interval_mapping,
+viommu_interval, viommu_mapping),
+VMSTATE_QLIST_V(endpoint_list, viommu_domain, 1,
+vmstate_endpoint, viommu_endpoint, next),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static gboolean reconstruct_ep_domain_link(gpointer key, gpointer value,
+   gpointer data)
+{
+viommu_domain *d = (viommu_domain *)value;
+viommu_endpoint *iter, *tmp;
+viommu_endpoint *ep = (viommu_endpoint *)data;
+
+QLIST_FOREACH_SAFE(iter, &d->endpoint_list, next, tmp) {
+if (iter->id == ep->id) {
+/* remove the ep */
+QLIST_REMOVE(iter, next);
+g_free(iter);
+/* replace it with the good one */
+QLIST_INSERT_HEAD(&d->endpoint_list, ep, next);
+/* update the domain */
+ep->domain = d;
+return true; /* stop the search */
+}
+}
+return false; /* continue the traversal */
+}
+
+static gboolean fix_endpoint(gpointer key, gpointer value, gpointer data)
+{
+VirtIOIOMMU *s = (VirtIOIOMMU *)data;
+
+g_tree_foreach(s->domains, reconstruct_ep_domain_link, value);
+return false;
+}
+
+static int iommu_post_load(void *opaque, int version_id)
+{
+VirtIOIOMMU *s = opaque;
+
+g_tree_foreach(s->endpoints, fix_endpoint, s);
+return 0;
+}
+
+static const VMStateDescription vmstate_virtio_iommu_device = {
+.name = "virtio-iommu-device",
+
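
The post_load step described in the commit message exists because the loaded domain lists hold freshly allocated copies of the endpoints, while the endpoint tree holds the canonical objects whose back pointers were not migrated. A minimal sketch of that re-linking with a plain singly linked list, instead of QEMU's QLIST and GTree, could look as follows. Domain, Endpoint and relink_endpoint are hypothetical names, and unlike the patch this version swaps the entry in place rather than removing it and inserting at the head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Domain Domain;

typedef struct Endpoint {
    uint32_t id;
    Domain *domain;          /* back pointer, not carried by migration */
    struct Endpoint *next;   /* link in the owning domain's list */
} Endpoint;

struct Domain {
    uint32_t id;
    Endpoint *endpoints;     /* list head */
};

/*
 * Replace the placeholder entry carrying ep->id in d's list with the
 * canonical 'ep', free the placeholder and restore the back pointer.
 * Returns 1 if a placeholder was found, 0 otherwise.
 */
static int relink_endpoint(Domain *d, Endpoint *ep)
{
    Endpoint **link;

    for (link = &d->endpoints; *link; link = &(*link)->next) {
        Endpoint *dup = *link;

        if (dup->id == ep->id && dup != ep) {
            ep->next = dup->next;   /* take over the list position */
            *link = ep;
            free(dup);
            ep->domain = d;
            return 1;
        }
    }
    return 0;
}
```

A caller would run this for every canonical endpoint against every domain and stop at the first match, which mirrors the nested g_tree_foreach() traversal in the patch.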

[PATCH for-5.0 v11 10/20] virtio-iommu-pci: Add virtio iommu pci support

2019-11-22 Thread Eric Auger

This patch adds virtio-iommu-pci, which is the pci proxy for
the virtio-iommu device.

Signed-off-by: Eric Auger 

---

v10 -> v11:
- add the reserved_regions array property

v9 -> v10:
- include "hw/qdev-properties.h" header

v8 -> v9:
- add the msi-bypass property
- create virtio-iommu-pci.c
---
 hw/virtio/Makefile.objs  |  1 +
 hw/virtio/virtio-iommu-pci.c | 91 
 include/hw/pci/pci.h |  1 +
 include/hw/virtio/virtio-iommu.h |  1 +
 qdev-monitor.c   |  1 +
 5 files changed, 95 insertions(+)
 create mode 100644 hw/virtio/virtio-iommu-pci.c

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index f68ac14a90..33e6bc591a 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -29,6 +29,7 @@ obj-$(CONFIG_VIRTIO_INPUT_HOST) += virtio-input-host-pci.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio-input-pci.o
 obj-$(CONFIG_VIRTIO_RNG) += virtio-rng-pci.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio-balloon-pci.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu-pci.o
 obj-$(CONFIG_VIRTIO_9P) += virtio-9p-pci.o
 obj-$(CONFIG_VIRTIO_SCSI) += virtio-scsi-pci.o
 obj-$(CONFIG_VIRTIO_BLK) += virtio-blk-pci.o
diff --git a/hw/virtio/virtio-iommu-pci.c b/hw/virtio/virtio-iommu-pci.c
new file mode 100644
index 00..280230b31e
--- /dev/null
+++ b/hw/virtio/virtio-iommu-pci.c
@@ -0,0 +1,91 @@
+/*
+ * Virtio IOMMU PCI Bindings
+ *
+ * Copyright (c) 2019 Red Hat, Inc.
+ * Written by Eric Auger
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 or
+ *  (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+
+#include "virtio-pci.h"
+#include "hw/virtio/virtio-iommu.h"
+#include "hw/qdev-properties.h"
+
+typedef struct VirtIOIOMMUPCI VirtIOIOMMUPCI;
+
+/*
+ * virtio-iommu-pci: This extends VirtioPCIProxy.
+ *
+ */
+#define VIRTIO_IOMMU_PCI(obj) \
+OBJECT_CHECK(VirtIOIOMMUPCI, (obj), TYPE_VIRTIO_IOMMU_PCI)
+
+struct VirtIOIOMMUPCI {
+VirtIOPCIProxy parent_obj;
+VirtIOIOMMU vdev;
+};
+
+static Property virtio_iommu_pci_properties[] = {
+DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
+DEFINE_PROP_ARRAY("reserved-regions", VirtIOIOMMUPCI,
+  vdev.nb_reserved_regions, vdev.reserved_regions,
+  qdev_prop_interval, Interval),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_iommu_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+VirtIOIOMMUPCI *dev = VIRTIO_IOMMU_PCI(vpci_dev);
+DeviceState *vdev = DEVICE(&dev->vdev);
+
+qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+object_property_set_link(OBJECT(dev),
+ OBJECT(pci_get_bus(&vpci_dev->pci_dev)),
+ "primary-bus", errp);
+object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void virtio_iommu_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+k->realize = virtio_iommu_pci_realize;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+dc->props = virtio_iommu_pci_properties;
+pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_IOMMU;
+pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_iommu_pci_instance_init(Object *obj)
+{
+VirtIOIOMMUPCI *dev = VIRTIO_IOMMU_PCI(obj);
+
+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+TYPE_VIRTIO_IOMMU);
+}
+
+static const VirtioPCIDeviceTypeInfo virtio_iommu_pci_info = {
+.base_name = TYPE_VIRTIO_IOMMU_PCI,
+.generic_name  = "virtio-iommu-pci",
+.transitional_name = "virtio-iommu-pci-transitional",
+.non_transitional_name = "virtio-iommu-pci-non-transitional",
+.instance_size = sizeof(VirtIOIOMMUPCI),
+.instance_init = virtio_iommu_pci_instance_init,
+.class_init= virtio_iommu_pci_class_init,
+};
+
+static void virtio_iommu_pci_register(void)
+{
+virtio_pci_types_register(&virtio_iommu_pci_info);
+}
+
+type_init(virtio_iommu_pci_register)
+
+
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index db75c6dfd0..d7715c826a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -86,6 +86,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_9P  0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK   0x1012
 #define PCI_DEVICE_ID_VIRTIO_PMEM0x1013
+#define PCI_DEVICE_ID_VIRTIO_IOMMU   0x1014
 
 #define PCI_VENDOR_ID_REDHAT 0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index f55f48d304..1ab6993d29 100644
--- a/include/hw/virtio/virtio-iommu.h
+++

[PATCH for-5.0 v11 14/20] virtio-iommu: Handle reserved regions in the translation process

2019-11-22 Thread Eric Auger

When translating an address we need to check if it belongs to
a reserved virtual address range. If it does, there are 2 cases:

- It belongs to a RESERVED region: the guest should neither use
  this address in a MAP nor instruct the endpoint to DMA to
  it. We report an error.

- It belongs to an MSI region: we bypass the translation.

Signed-off-by: Eric Auger 

---

v10 -> v11:
- directly use the reserved_regions properties array

v9 -> v10:
- in case of MSI region, we immediately return
---
 hw/virtio/virtio-iommu.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1ce2218935..c5b202fab7 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -548,6 +548,7 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 uint32_t sid, flags;
 bool bypass_allowed;
 bool found;
+int i;
 
 interval.low = addr;
 interval.high = addr + 1;
@@ -580,6 +581,22 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 goto unlock;
 }
 
+for (i = 0; i < s->nb_reserved_regions; i++) {
+if (interval.low >= s->reserved_regions[i].low &&
+interval.low <= s->reserved_regions[i].high) {
+switch (s->reserved_regions[i].type) {
+case VIRTIO_IOMMU_RESV_MEM_T_MSI:
+entry.perm = flag;
+goto unlock;
+case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
+default:
+virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
+  0, sid, addr);
+goto unlock;
+   }
+}
+}
+
 if (!ep->domain) {
 if (!bypass_allowed) {
 qemu_log_mask(LOG_GUEST_ERROR,
-- 
2.20.1
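
The two cases listed in the commit message above amount to a small decision made before the normal mapping lookup. The following sketch shows that decision in isolation, under the assumption that regions are inclusive [low, high] ranges as in the patch's comparison. ResvRegion, XlateAction, classify_addr and the RESV_* constants are hypothetical local names standing in for the patch's Interval and VIRTIO_IOMMU_RESV_MEM_T_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local names; the patch uses VIRTIO_IOMMU_RESV_MEM_T_*. */
enum { RESV_RESERVED = 0, RESV_MSI = 1 };

typedef enum {
    XLATE_NORMAL,   /* not reserved: continue with the mapping lookup */
    XLATE_BYPASS,   /* MSI region: skip translation */
    XLATE_FAULT     /* reserved region: report a fault */
} XlateAction;

typedef struct {
    uint64_t low;
    uint64_t high;      /* inclusive upper bound */
    unsigned int type;
} ResvRegion;

/* Decide how an input address is handled given the reserved regions. */
static XlateAction classify_addr(const ResvRegion *regions, size_t n,
                                 uint64_t addr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (addr >= regions[i].low && addr <= regions[i].high) {
            /* Any type other than MSI is treated as reserved. */
            return regions[i].type == RESV_MSI ? XLATE_BYPASS
                                               : XLATE_FAULT;
        }
    }
    return XLATE_NORMAL;
}
```

Treating unknown types as reserved matches the patch's default: label and fails closed if a new region type is added later.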

[PATCH for-5.0 v11 09/20] virtio-iommu: Implement fault reporting

2019-11-22 Thread Eric Auger

The event queue allows asynchronous errors to be reported.
The translate function now injects faults when relevant.

Signed-off-by: Eric Auger 

---

v10 -> v11:
- change a virtio_error into an error_report_once
  (no buffer available for output faults)
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 69 +---
 2 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index de7cbb3c8f..a572eb71aa 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -73,3 +73,4 @@ virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
 virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t 
sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
+virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, 
uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index a83666557b..723616a5db 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -407,6 +407,51 @@ out:
 }
 }
 
+static void virtio_iommu_report_fault(VirtIOIOMMU *viommu, uint8_t reason,
+  uint32_t flags, uint32_t endpoint,
+  uint64_t address)
+{
+VirtIODevice *vdev = &viommu->parent_obj;
+VirtQueue *vq = viommu->event_vq;
+struct virtio_iommu_fault fault;
+VirtQueueElement *elem;
+size_t sz;
+
+memset(&fault, 0, sizeof(fault));
+fault.reason = reason;
+fault.flags = flags;
+fault.endpoint = endpoint;
+fault.address = address;
+
+for (;;) {
+elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+
+if (!elem) {
+error_report_once(
+"no buffer available in event queue to report event");
+return;
+}
+
+if (iov_size(elem->in_sg, elem->in_num) < sizeof(fault)) {
+virtio_error(vdev, "error buffer of wrong size");
+virtqueue_detach_element(vq, elem, 0);
+g_free(elem);
+continue;
+}
+break;
+}
+/* we have a buffer to fill in */
+sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+  &fault, sizeof(fault));
+assert(sz == sizeof(fault));
+
+trace_virtio_iommu_report_fault(reason, flags, endpoint, address);
+virtqueue_push(vq, elem, sz);
+virtio_notify(vdev, vq);
+g_free(elem);
+
+}
+
 static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 IOMMUAccessFlags flag,
 int iommu_idx)
@@ -415,9 +460,10 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 viommu_interval interval, *mapping_key;
 viommu_mapping *mapping_value;
 VirtIOIOMMU *s = sdev->viommu;
+bool read_fault, write_fault;
 viommu_endpoint *ep;
+uint32_t sid, flags;
 bool bypass_allowed;
-uint32_t sid;
 bool found;
 
 interval.low = addr;
@@ -443,6 +489,8 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 if (!ep) {
 if (!bypass_allowed) {
 error_report_once("%s sid=%d is not known!!", __func__, sid);
+virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_UNKNOWN,
+  0, sid, 0);
 } else {
 entry.perm = flag;
 }
@@ -455,6 +503,8 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
   "%s %02x:%02x.%01x not attached to any domain\n",
   __func__, PCI_BUS_NUM(sid),
   PCI_SLOT(sid), PCI_FUNC(sid));
+virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_DOMAIN,
+  0, sid, 0);
 } else {
 entry.perm = flag;
 }
@@ -468,16 +518,25 @@ static IOMMUTLBEntry 
virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 qemu_log_mask(LOG_GUEST_ERROR,
   "%s no mapping for 0x%"PRIx64" for sid=%d\n",
   __func__, addr, sid);
+virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
+  0, sid, addr);
 goto unlock;
 }
 
-if (((flag & IOMMU_RO) &&
-!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
-((flag & IOMMU_WO) &&
-!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
+read_fault = (flag & IOMMU_RO) &&
+!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_READ);
+write_fault = (flag & IOMMU_WO) &&
+!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_WRITE);
+
+flags = read_fault ? VIRTIO_IOMMU_FAULT_F_READ : 0;
+flags |=
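
The hunk above is cut off while the fault flags are being assembled from the read and write checks. A minimal sketch of that computation follows, assuming the write branch mirrors the read branch. The ACCESS_*, MAP_* and FAULT_* constants are hypothetical stand-ins for IOMMU_RO/IOMMU_WO, VIRTIO_IOMMU_MAP_F_* and VIRTIO_IOMMU_FAULT_F_*, and their bit values are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; bit values are illustrative only. */
enum { ACCESS_READ = 1 << 0, ACCESS_WRITE = 1 << 1 };
enum { MAP_READ = 1 << 0, MAP_WRITE = 1 << 1 };
enum { FAULT_READ = 1 << 0, FAULT_WRITE = 1 << 1 };

/*
 * Return the fault flags for an access that hit a mapping, or 0 when
 * the mapping grants everything the access asked for.
 */
static uint32_t permission_fault_flags(uint32_t access, uint32_t map_flags)
{
    bool read_fault = (access & ACCESS_READ) && !(map_flags & MAP_READ);
    bool write_fault = (access & ACCESS_WRITE) && !(map_flags & MAP_WRITE);
    uint32_t flags = 0;

    if (read_fault) {
        flags |= FAULT_READ;
    }
    if (write_fault) {
        flags |= FAULT_WRITE;   /* assumed symmetric to the read case */
    }
    return flags;
}
```

A non-zero result is the condition under which the translate function would report a fault instead of returning the mapping.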

[PATCH for-5.0 v11 13/20] virtio-iommu: Implement probe request

2019-11-22 Thread Eric Auger

This patch implements the PROBE request. At the moment,
no reserved regions are returned as none are registered
per device. Only a NONE property is returned.

Signed-off-by: Eric Auger 

---
v9 -> v10
- fully rewrite the code in preparation of
  reserved_regions array property introduction

v8 -> v9:
- fix filling of properties (changes induced by v0.7 -> v0.8 spec
  evolution)
- return VIRTIO_IOMMU_S_INVAL in case of error

v7 -> v8:
- adapt to removal of value field in virtio_iommu_probe_property

v6 -> v7:
- adapt to the change in virtio_iommu_probe_resv_mem fields
- use get_endpoint() instead of directly checking the EP
  was registered.

v4 -> v5:
- initialize bufstate.error to false
- add cpu_to_le64(size)
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 89 +++-
 include/hw/virtio/virtio-iommu.h |  2 +
 3 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a572eb71aa..b7bc8ac6d1 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -74,3 +74,4 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t 
sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, 
uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t 
start, uint64_t end) "dev= %d, type=%d start=0x%"PRIx64" end=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 723616a5db..1ce2218935 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -38,6 +38,7 @@
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
+#define VIOMMU_PROBE_SIZE 512
 
 typedef struct viommu_domain {
 uint32_t id;
@@ -317,6 +318,61 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
 return ret;
 }
 
+static ssize_t virtio_iommu_fill_resv_mem_prop(VirtIOIOMMU *s, uint32_t ep,
+   uint8_t *buf, size_t free)
+{
+struct virtio_iommu_probe_resv_mem prop = {};
+size_t size = sizeof(prop), length = size - sizeof(prop.head), total;
+int i;
+
+total = size * s->nb_reserved_regions;
+
+if (total > free) {
+return -ENOSPC;
+}
+
+for (i = 0; i < s->nb_reserved_regions; i++) {
+prop.head.type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_RESV_MEM);
+prop.head.length = cpu_to_le16(length);
+prop.subtype = s->reserved_regions[i].type;
+prop.start = cpu_to_le64(s->reserved_regions[i].low);
+prop.end = cpu_to_le64(s->reserved_regions[i].high);
+
+memcpy(buf, &prop, size);
+
+trace_virtio_iommu_fill_resv_property(ep, prop.subtype,
+  prop.start, prop.end);
+buf += size;
+}
+return total;
+}
+
+/**
+ * virtio_iommu_probe - Fill the probe request buffer with
+ * the properties the device is able to return and add a NONE
+ * property at the end.
+ */
+static int virtio_iommu_probe(VirtIOIOMMU *s,
+  struct virtio_iommu_req_probe *req,
+  uint8_t *buf)
+{
+uint32_t ep_id = le32_to_cpu(req->endpoint);
+struct virtio_iommu_probe_property last = {};
+size_t free = VIOMMU_PROBE_SIZE - sizeof(last);
+ssize_t count;
+
+count = virtio_iommu_fill_resv_mem_prop(s, ep_id, buf, free);
+if (count < 0) {
+return VIRTIO_IOMMU_S_INVAL;
+}
+buf += count;
+free -= count;
+
+memcpy(buf, &last, sizeof(last));
+
+return VIRTIO_IOMMU_S_OK;
+}
+
 static int virtio_iommu_iov_to_req(struct iovec *iov,
unsigned int iov_cnt,
void *req, size_t req_sz)
@@ -346,6 +402,17 @@ virtio_iommu_handle_req(detach)
 virtio_iommu_handle_req(map)
 virtio_iommu_handle_req(unmap)
 
+static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
+ struct iovec *iov,
+ unsigned int iov_cnt,
+ uint8_t *buf)
+{
+struct virtio_iommu_req_probe req;
+int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
+
+return ret ? ret : virtio_iommu_probe(s, &req, buf);
+}
+
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
@@ -391,17 +458,33 @@ static void virtio_iommu_handle_command(VirtIODevice 
*vdev, VirtQueue *vq)
 case VIRTIO_IOMMU_T_UNMAP:
 tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
 break;
+case VIRTIO_IOMMU_T_PROBE:
+{
+struct virtio_iommu_req_tail *ptail;
+uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
+
+
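
The probe handler above packs one property per reserved region into a fixed-size buffer, keeps room for a final NONE property and gives up if the regions do not fit. The sketch below shows the same space accounting with a simplified record. Record and fill_records are hypothetical, the record is not the virtio-iommu wire layout, and type 0 is only assumed here to play the role of the terminating NONE property.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record; not the virtio-iommu wire layout. */
typedef struct {
    uint16_t type;     /* 0 is used here as the terminating "none" type */
    uint16_t length;
    uint64_t start;
    uint64_t end;
} Record;

/*
 * Copy 'n' records into 'buf' (capacity 'cap' bytes) and append a
 * zeroed terminator record. Returns the number of bytes written, or
 * -ENOSPC if records plus terminator do not fit.
 */
static long fill_records(uint8_t *buf, size_t cap,
                         const Record *recs, size_t n)
{
    Record last;
    size_t total;

    /* Reserve one slot for the terminator before accepting records. */
    if (cap < sizeof(Record) || n > cap / sizeof(Record) - 1) {
        return -ENOSPC;
    }
    total = n * sizeof(Record);
    if (n) {
        memcpy(buf, recs, total);
    }
    memset(&last, 0, sizeof(last));
    memcpy(buf + total, &last, sizeof(last));
    return (long)(total + sizeof(last));
}
```

Checking the record count against the capacity before multiplying avoids an overflow in the size computation for very large counts.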

[PATCH for-5.0 v11 20/20] tests: Add virtio-iommu test

2019-11-22 Thread Eric Auger

This adds the framework to test the virtio-iommu-pci device
and tests exercising the attach/detach, map/unmap API.

To run the tests:
make tests/qos-test
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/qos-test V=1

Signed-off-by: Eric Auger 
---
 tests/Makefile.include  |   2 +
 tests/libqos/virtio-iommu.c | 177 
 tests/libqos/virtio-iommu.h |  45 +++
 tests/virtio-iommu-test.c   | 261 
 4 files changed, 485 insertions(+)
 create mode 100644 tests/libqos/virtio-iommu.c
 create mode 100644 tests/libqos/virtio-iommu.h
 create mode 100644 tests/virtio-iommu-test.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 8566f5f119..76a303c4fb 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -734,6 +734,7 @@ qos-test-obj-y += tests/libqos/virtio-net.o
 qos-test-obj-y += tests/libqos/virtio-pci.o
 qos-test-obj-y += tests/libqos/virtio-pci-modern.o
 qos-test-obj-y += tests/libqos/virtio-rng.o
+qos-test-obj-y += tests/libqos/virtio-iommu.o
 qos-test-obj-y += tests/libqos/virtio-scsi.o
 qos-test-obj-y += tests/libqos/virtio-serial.o
 
@@ -773,6 +774,7 @@ qos-test-obj-$(CONFIG_VIRTFS) += tests/virtio-9p-test.o
 qos-test-obj-y += tests/virtio-blk-test.o
 qos-test-obj-y += tests/virtio-net-test.o
 qos-test-obj-y += tests/virtio-rng-test.o
+qos-test-obj-y += tests/virtio-iommu-test.o
 qos-test-obj-y += tests/virtio-scsi-test.o
 qos-test-obj-y += tests/virtio-serial-test.o
 qos-test-obj-y += tests/vmxnet3-test.o
diff --git a/tests/libqos/virtio-iommu.c b/tests/libqos/virtio-iommu.c
new file mode 100644
index 0000000000..b4e9ea44fb
--- /dev/null
+++ b/tests/libqos/virtio-iommu.c
@@ -0,0 +1,177 @@
+/*
+ * libqos driver framework
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/module.h"
+#include "libqos/qgraph.h"
+#include "libqos/virtio-iommu.h"
+#include "hw/virtio/virtio-iommu.h"
+
+static QGuestAllocator *alloc;
+
+/* virtio-iommu-device */
+static void *qvirtio_iommu_get_driver(QVirtioIOMMU *v_iommu,
+  const char *interface)
+{
+if (!g_strcmp0(interface, "virtio-iommu")) {
+return v_iommu;
+}
+if (!g_strcmp0(interface, "virtio")) {
+return v_iommu->vdev;
+}
+
+fprintf(stderr, "%s not present in virtio-iommu-device\n", interface);
+g_assert_not_reached();
+}
+
+static void *qvirtio_iommu_device_get_driver(void *object,
+ const char *interface)
+{
+QVirtioIOMMUDevice *v_iommu = object;
+return qvirtio_iommu_get_driver(&v_iommu->iommu, interface);
+}
+
+static void virtio_iommu_cleanup(QVirtioIOMMU *interface)
+{
+qvirtqueue_cleanup(interface->vdev->bus, interface->vq, alloc);
+}
+
+static void virtio_iommu_setup(QVirtioIOMMU *interface)
+{
+QVirtioDevice *vdev = interface->vdev;
+uint64_t features;
+
+features = qvirtio_get_features(vdev);
+features &= ~(QVIRTIO_F_BAD_FEATURE |
+  (1ull << VIRTIO_RING_F_INDIRECT_DESC) |
+  (1ull << VIRTIO_RING_F_EVENT_IDX) |
+  (1ull << VIRTIO_IOMMU_F_BYPASS));
+qvirtio_set_features(vdev, features);
+interface->vq = qvirtqueue_setup(interface->vdev, alloc, 0);
+qvirtio_set_driver_ok(interface->vdev);
+}
+
+static void qvirtio_iommu_device_destructor(QOSGraphObject *obj)
+{
+QVirtioIOMMUDevice *v_iommu = (QVirtioIOMMUDevice *) obj;
+QVirtioIOMMU *iommu = &v_iommu->iommu;
+
+virtio_iommu_cleanup(iommu);
+}
+
+static void qvirtio_iommu_device_start_hw(QOSGraphObject *obj)
+{
+QVirtioIOMMUDevice *v_iommu = (QVirtioIOMMUDevice *) obj;
+QVirtioIOMMU *iommu = &v_iommu->iommu;
+
+virtio_iommu_setup(iommu);
+}
+
+static void *virtio_iommu_device_create(void *virtio_dev,
+QGuestAllocator *t_alloc,
+void *addr)
+{
+QVirtioIOMMUDevice *virtio_rdevice = g_new0(QVirtioIOMMUDevice, 1);
+QVirtioIOMMU *interface = &virtio_rdevice->iommu;
+
+interface->vdev = virtio_dev;
+alloc = t_alloc;
+
+virtio_rdevice->obj.get_driver = qvirtio_iommu_device_get_driver;
+virtio_rdevice->obj.destructor = qvirtio_iommu_device_destructor;
+virtio_rdevice->obj.start_hw = qvirtio_iommu_device_start_hw;
+
+

[PATCH for-5.0 v11 02/20] virtio-iommu: Add skeleton

2019-11-22 Thread Eric Auger

This patch adds the skeleton for the virtio-iommu device.

Signed-off-by: Eric Auger 

---

v9 -> v10:
- mutex initialized here
- initialize tail
- included hw/qdev-properties.h
- removed g_memdup
- removed s->config.domain_range.start = 0;

v9 -> v10:
- expose VIRTIO_IOMMU_F_MMIO feature
- s/domain_bits/domain_range struct
- change error codes
- enforce unmigratable
- Kconfig

v7 -> v8:
- expose VIRTIO_IOMMU_F_BYPASS and VIRTIO_F_VERSION_1
  features
- set_config dummy implementation + tracing
- add trace in get_features
- set the features on realize() and store the acked ones
- remove inclusion of linux/virtio_iommu.h

v6 -> v7:
- removed qapi-event.h include
- add primary_bus and associated property

v4 -> v5:
- use the new v0.5 terminology (domain, endpoint)
- add the event virtqueue

v3 -> v4:
- use page_size_mask instead of page_sizes
- added set_features()
- added some traces (reset, set_status, set_features)
- empty virtio_iommu_set_config() as the driver MUST NOT
  write to device configuration fields
- add get_config trace

v2 -> v3:
- rebase on 2.10-rc0, ie. use IOMMUMemoryRegion and remove
  iommu_ops.
- advertise VIRTIO_IOMMU_F_MAP_UNMAP feature
- page_sizes set to TARGET_PAGE_SIZE

Conflicts:
hw/virtio/trace-events
---
 hw/virtio/Kconfig|   5 +
 hw/virtio/Makefile.objs  |   1 +
 hw/virtio/trace-events   |   8 +
 hw/virtio/virtio-iommu.c | 274 +++
 include/hw/virtio/virtio-iommu.h |  62 +++
 5 files changed, 350 insertions(+)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index 3724ff8bac..a30107b439 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -6,6 +6,11 @@ config VIRTIO_RNG
 default y
 depends on VIRTIO
 
+config VIRTIO_IOMMU
+bool
+default y
+depends on VIRTIO
+
 config VIRTIO_PCI
 bool
 default y if PCI_DEVICES
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index e2f70fbb89..f68ac14a90 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -16,6 +16,7 @@ obj-$(call land,$(CONFIG_VIRTIO_CRYPTO),$(CONFIG_VIRTIO_PCI)) += virtio-crypto-p
 obj-$(CONFIG_VIRTIO_PMEM) += virtio-pmem.o
 common-obj-$(call land,$(CONFIG_VIRTIO_PMEM),$(CONFIG_VIRTIO_PCI)) += virtio-pmem-pci.o
 obj-$(call land,$(CONFIG_VHOST_USER_FS),$(CONFIG_VIRTIO_PCI)) += vhost-user-fs-pci.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 
 ifeq ($(CONFIG_VIRTIO_PCI),y)
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e28ba48da6..f7dac39213 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -53,3 +53,11 @@ virtio_mmio_write_offset(uint64_t offset, uint64_t value) "virtio_mmio_write off
 virtio_mmio_guest_page(uint64_t size, int shift) "guest page size 0x%" PRIx64 " shift %d"
 virtio_mmio_queue_write(uint64_t value, int max_size) "mmio_queue write 0x%" PRIx64 " max %d"
 virtio_mmio_setting_irq(int level) "virtio_mmio setting IRQ %d"
+
+# hw/virtio/virtio-iommu.c
+virtio_iommu_device_reset(void) "reset!"
+virtio_iommu_get_features(uint64_t features) "device supports features=0x%"PRIx64
+virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0x%"PRIx64
+virtio_iommu_device_status(uint8_t status) "driver status = %d"
+virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_range=%d probe_size=0x%x"
+virtio_iommu_set_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_bits=%d probe_size=0x%x"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
new file mode 100644
index 0000000000..7b25db3713
--- /dev/null
+++ b/hw/virtio/virtio-iommu.c
@@ -0,0 +1,274 @@
+/*
+ * virtio-iommu device
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/iov.h"
+#include "qemu-common.h"
+#include "hw/qdev-properties.h"
+#include "hw/virtio/virtio.h"
+#include "sysemu/kvm.h"
+#include "trace.h"
+
+#include "standard-headers/linux/virtio_ids.h"
+
+#include "hw/virtio/virtio-bus.h"

[PATCH for-5.0 v11 12/20] qapi: Introduce DEFINE_PROP_INTERVAL

2019-11-22 Thread Eric Auger

Introduce a new property defining a labelled interval:
<low address>,<high address>,label.

This will be used to encode reserved IOVA regions. The label
is left undefined to ease reuse across use cases.

For instance, in virtio-iommu use case, reserved IOVA regions
will be passed by the machine code to the virtio-iommu-pci
device (an array of those). The label will match the
virtio_iommu_probe_resv_mem subtype value:
- VIRTIO_IOMMU_RESV_MEM_T_RESERVED (0)
- VIRTIO_IOMMU_RESV_MEM_T_MSI (1)

This is used to inform the virtio-iommu-pci device it should
bypass the MSI region: 0xfee00000, 0xfeefffff, 1.
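
For readers who want to see the string format in isolation, here is a minimal standalone sketch of parsing "low,high,type". It does not use the QEMU helpers from the patch (qemu_strtou64, g_strsplit); LabelledInterval, parse_field and parse_labelled_interval are hypothetical names, and the low <= high check is an assumption of this sketch, not something the patch enforces.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Interval struct this patch adds. */
typedef struct LabelledInterval {
    uint64_t low;
    uint64_t high;
    unsigned int type;
} LabelledInterval;

/*
 * Parse one unsigned field in the given base. The field must start with a
 * digit and be followed by 'terminator'. Returns 0 on success, -1 on error.
 */
static int parse_field(const char **cursor, int base, char terminator,
                       uint64_t *out)
{
    char *end;
    unsigned long long value;

    if (!isxdigit((unsigned char)**cursor)) {
        return -1;              /* rejects empty fields, signs, whitespace */
    }
    errno = 0;
    value = strtoull(*cursor, &end, base);  /* base 16 accepts a 0x prefix */
    if (errno || end == *cursor || *end != terminator) {
        return -1;
    }
    *out = value;
    *cursor = (terminator == '\0') ? end : end + 1;
    return 0;
}

/* Parse "low,high,type": low and high in hex, type in decimal. */
static int parse_labelled_interval(const char *str, LabelledInterval *iv)
{
    uint64_t low, high, type;

    if (parse_field(&str, 16, ',', &low) ||
        parse_field(&str, 16, ',', &high) ||
        parse_field(&str, 10, '\0', &type)) {
        return -1;
    }
    if (low > high || type > UINT_MAX) {
        return -1;              /* sanity checks added by this sketch only */
    }
    iv->low = low;
    iv->high = high;
    iv->type = (unsigned int)type;
    return 0;
}
```

With this sketch, a string such as "0xfee00000,0xfeefffff,1" (an illustrative value) would decode to low, high and a type of 1, while a string with only two fields is rejected.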

Signed-off-by: Eric Auger 
---
 hw/core/qdev-properties.c| 90 
 include/exec/memory.h|  6 +++
 include/hw/qdev-properties.h |  3 ++
 include/qemu/typedefs.h  |  1 +
 4 files changed, 100 insertions(+)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index ac28890e5a..8d70f34e37 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -13,6 +13,7 @@
 #include "qapi/visitor.h"
 #include "chardev/char.h"
 #include "qemu/uuid.h"
+#include "qemu/cutils.h"
 
 void qdev_prop_set_after_realize(DeviceState *dev, const char *name,
   Error **errp)
@@ -585,6 +586,95 @@ const PropertyInfo qdev_prop_macaddr = {
 .set   = set_mac,
 };
 
+/* --- Labelled Interval --- */
+
+/*
+ * accepted syntax versions:
+ *   <low address>,<high address>,<type>
+ *   where low/high addresses are uint64_t in hexa (feat. 0x prefix)
+ *   and type is an unsigned integer
+ */
+static void get_interval(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+Interval *interval = qdev_get_prop_ptr(dev, prop);
+char buffer[64];
+char *p = buffer;
+
+snprintf(buffer, sizeof(buffer), "0x%"PRIx64",0x%"PRIx64",%d",
+ interval->low, interval->high, interval->type);
+
+visit_type_str(v, name, &p, errp);
+}
+
+static void set_interval(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+Interval *interval = qdev_get_prop_ptr(dev, prop);
+Error *local_err = NULL;
+unsigned int type;
+gchar **fields;
+uint64_t addr;
+char *str;
+int ret;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_str(v, name, &str, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+fields = g_strsplit(str, ",", 3);
+
+ret = qemu_strtou64(fields[0], NULL, 16, &addr);
+if (!ret) {
+interval->low = addr;
+} else {
+error_setg(errp, "Failed to decode interval low addr");
+error_append_hint(errp,
+  "should be an address in hexa with 0x prefix\n");
+goto out;
+}
+
+ret = qemu_strtou64(fields[1], NULL, 16, &addr);
+if (!ret) {
+interval->high = addr;
+} else {
+error_setg(errp, "Failed to decode interval high addr");
+error_append_hint(errp,
+  "should be an address in hexa with 0x prefix\n");
+goto out;
+}
+
+ret = qemu_strtoui(fields[2], NULL, 10, &type);
+if (!ret) {
+interval->type = type;
+} else {
+error_setg(errp, "Failed to decode interval type");
+error_append_hint(errp, "should be an unsigned int in decimal\n");
+}
+out:
+g_free(str);
+g_strfreev(fields);
+return;
+}
+
+const PropertyInfo qdev_prop_interval = {
+.name  = "labelled_interval",
+.description = "Labelled interval, example: 0xFEE00000,0xFEEFFFFF,0",
+.get   = get_interval,
+.set   = set_interval,
+};
+
 /* --- on/off/auto --- */
 
 const PropertyInfo qdev_prop_on_off_auto = {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e499dc215b..e238d1c352 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -57,6 +57,12 @@ struct MemoryRegionMmio {
 CPUWriteMemoryFunc *write[3];
 };
 
+struct Interval {
+hwaddr low;
+hwaddr high;
+unsigned int type;
+};
+
 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
 
 /* See address_space_translate: bit 0 is read, bit 1 is write.  */
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index c6a8cb5516..2ba7c8711b 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -20,6 +20,7 @@ extern const PropertyInfo qdev_prop_chr;
 extern const PropertyInfo qdev_prop_tpm;
 extern const PropertyInfo qdev_prop_ptr;
 extern const PropertyInfo qdev_prop_macaddr;
+extern const PropertyInfo qdev_prop_interval;
 extern const PropertyInfo qdev_prop_on_off_auto;
 extern const PropertyInfo qdev_prop_losttickpolicy;
 extern const PropertyInfo qdev_prop_blockdev_on_error;
@@ -202,6 +203,8 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
 DEFINE_PROP(_n,

[PATCH for-5.0 v11 11/20] hw/arm/virt: Add the virtio-iommu device tree mappings

2019-11-22 Thread Eric Auger

Adds the "virtio,pci-iommu" node in the host bridge node and
the RID mapping, excluding the IOMMU RID.
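
As a rough illustration of the RID mapping described above, the sketch below builds the two "iommu-map" ranges that leave out the IOMMU's own RID and checks which RIDs end up translated. IommuMapEntry, build_iommu_map and rid_is_translated are hypothetical names, and unlike the patch, this sketch skips a range when it would be empty.

```c
#include <stddef.h>
#include <stdint.h>

/* One iommu-map entry: (rid-base, iommu phandle, iommu-base, length). */
typedef struct IommuMapEntry {
    uint32_t rid_base;
    uint32_t iommu_phandle;
    uint32_t iommu_base;
    uint32_t length;
} IommuMapEntry;

#define RID_MAX 0xffffu         /* highest 16-bit PCI requester ID */

/*
 * Identity-map every RID except the IOMMU's own one.
 * Returns the number of entries written to out[] (at most 2).
 */
static size_t build_iommu_map(uint16_t iommu_bdf, uint32_t phandle,
                              IommuMapEntry out[2])
{
    size_t n = 0;

    if (iommu_bdf > 0) {
        /* RIDs [0, iommu_bdf) */
        out[n++] = (IommuMapEntry){ 0, phandle, 0, iommu_bdf };
    }
    if (iommu_bdf < RID_MAX) {
        /* RIDs [iommu_bdf + 1, RID_MAX] */
        out[n++] = (IommuMapEntry){ iommu_bdf + 1u, phandle,
                                    iommu_bdf + 1u, RID_MAX - iommu_bdf };
    }
    return n;
}

/* Return 1 if 'rid' falls inside one of the map entries, 0 otherwise. */
static int rid_is_translated(const IommuMapEntry *map, size_t n, uint32_t rid)
{
    for (size_t i = 0; i < n; i++) {
        if (rid >= map[i].rid_base &&
            rid - map[i].rid_base < map[i].length) {
            return 1;
        }
    }
    return 0;
}
```

For an IOMMU at BDF 0x0008, this gives one range of 8 RIDs starting at 0 and a second range starting at 9, so RID 8 itself is the only one not routed through the IOMMU.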

Signed-off-by: Eric Auger 

---

v10 -> v11:
- remove msi_bypass

v8 -> v9:
- disable msi-bypass property
- addition of the subnode is handled in the hotplug handler
  and the IOMMU RID is not imposed anymore

v6 -> v7:
- align to the smmu instantiation code

v4 -> v5:
- VirtMachineClass no_iommu added in this patch
- Use object_resolve_path_type
---
 hw/arm/virt.c| 53 +++-
 hw/virtio/virtio-iommu-pci.c |  3 --
 include/hw/arm/virt.h|  2 ++
 3 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d4bedc2607..cb6a95e7c8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -32,6 +32,7 @@
 #include "qemu-common.h"
 #include "qemu/units.h"
 #include "qemu/option.h"
+#include "monitor/qdev.h"
 #include "qapi/error.h"
 #include "hw/sysbus.h"
 #include "hw/boards.h"
@@ -54,6 +55,7 @@
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "hw/pci-host/gpex.h"
+#include "hw/virtio/virtio-pci.h"
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/platform-bus.h"
 #include "hw/qdev-properties.h"
@@ -71,6 +73,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/virtio/virtio-iommu.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
 static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1181,6 +1184,30 @@ static void create_smmu(const VirtMachineState *vms, qemu_irq *pic,
 g_free(node);
 }
 
+static void create_virtio_iommu(VirtMachineState *vms, Error **errp)
+{
+const char compat[] = "virtio,pci-iommu";
+uint16_t bdf = vms->virtio_iommu_bdf;
+char *node;
+
+vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
+node = g_strdup_printf("%s/virtio_iommu@%d", vms->pciehb_nodename, bdf);
+qemu_fdt_add_subnode(vms->fdt, node);
+qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat));
+qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg",
+ 1, bdf << 8, 1, 0, 1, 0,
+ 1, 0, 1, 0);
+
+qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1);
+qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle);
+g_free(node);
+
+qemu_fdt_setprop_cells(vms->fdt, vms->pciehb_nodename, "iommu-map",
+   0x0, vms->iommu_phandle, 0x0, bdf,
+   bdf + 1, vms->iommu_phandle, bdf + 1, 0xffff - bdf);
+}
+
 static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
 hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
@@ -1258,7 +1285,7 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 }
 }
 
-nodename = g_strdup_printf("/pcie@%" PRIx64, base);
+nodename = vms->pciehb_nodename = g_strdup_printf("/pcie@%" PRIx64, base);
 qemu_fdt_add_subnode(vms->fdt, nodename);
 qemu_fdt_setprop_string(vms->fdt, nodename,
 "compatible", "pci-host-ecam-generic");
@@ -1301,13 +1328,17 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 if (vms->iommu) {
 vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
 
-create_smmu(vms, pic, pci->bus);
+switch (vms->iommu) {
+case VIRT_IOMMU_SMMUV3:
+create_smmu(vms, pic, pci->bus);
+qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map",
+   0x0, vms->iommu_phandle, 0x0, 0x10000);
+break;
+default:
+g_assert_not_reached();
+}
 
-qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map",
-   0x0, vms->iommu_phandle, 0x0, 0x10000);
 }
-
-g_free(nodename);
 }
 
 static void create_platform_bus(VirtMachineState *vms, qemu_irq *pic)
@@ -1972,6 +2003,13 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
 virt_memory_plug(hotplug_dev, dev, errp);
 }
+if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+PCIDevice *pdev = PCI_DEVICE(dev);
+
+vms->iommu = VIRT_IOMMU_VIRTIO;
+vms->virtio_iommu_bdf = pci_get_bdf(pdev);
+create_virtio_iommu(vms, errp);
+}
 }
 
 static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
@@ -1985,7 +2023,8 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
 DeviceState *dev)
 {
 if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
-   (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
+   (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) ||
+   (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI))) {
 return HOTPLUG_HANDLER(machine);
 }
 
diff --git

Re: [PATCH] linux-user: fix translation of statx structures

2019-11-22 Thread Aleksandar Markovic

On Fri, Nov 22, 2019 at 7:22 PM Ariadne Conill  wrote:
>
> All timestamps were copied to atime instead of to their respective
> fields.
>
> Signed-off-by: Ariadne Conill 
> ---

What a bug.

Laurent, perhaps a good candidate for 4.2?

Thanks for submitting this, Ariadne Conill!

Reviewed-by: Aleksandar Markovic 

>  linux-user/syscall.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index ce399a55f0..171c0caef3 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -6743,12 +6743,12 @@ static inline abi_long host_to_target_statx(struct target_statx *host_stx,
>  __put_user(host_stx->stx_attributes_mask, &target_stx->stx_attributes_mask);
>  __put_user(host_stx->stx_atime.tv_sec, &target_stx->stx_atime.tv_sec);
>  __put_user(host_stx->stx_atime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> -__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_atime.tv_sec);
> -__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_atime.tv_nsec);
> +__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_btime.tv_sec);
> +__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_btime.tv_nsec);
> +__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_ctime.tv_sec);
> +__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_ctime.tv_nsec);
> +__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_mtime.tv_sec);
> +__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_mtime.tv_nsec);
>  __put_user(host_stx->stx_rdev_major, &target_stx->stx_rdev_major);
>  __put_user(host_stx->stx_rdev_minor, &target_stx->stx_rdev_minor);
>  __put_user(host_stx->stx_dev_major, &target_stx->stx_dev_major);
> --
> 2.24.0
>
>

[PATCH for-5.0 v11 08/20] virtio-iommu: Implement translate

2019-11-22 Thread Eric Auger

This patch implements the translate callback
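
The core of the callback is a permission check followed by an offset computation inside the matching mapping. A minimal standalone sketch of that step, assuming inclusive interval bounds, is shown here. Range, Mapping, translate and the ACCESS_*/MAP_F_* constants are hypothetical stand-ins, not the QEMU or virtio definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access and mapping permission bits for this sketch. */
enum { ACCESS_READ = 1, ACCESS_WRITE = 2 };
enum { MAP_F_READ = 1, MAP_F_WRITE = 2 };

typedef struct Range {
    uint64_t low;               /* first IOVA covered by the mapping */
    uint64_t high;              /* last IOVA covered (inclusive, assumed) */
} Range;

typedef struct Mapping {
    uint64_t phys_addr;         /* physical address backing Range.low */
    uint32_t flags;             /* MAP_F_* permissions */
} Mapping;

/*
 * Translate 'iova' through one mapping. Returns true and stores the
 * physical address in *out when the access is allowed, false on a fault.
 */
static bool translate(const Range *key, const Mapping *val,
                      uint64_t iova, int access, uint64_t *out)
{
    if (iova < key->low || iova > key->high) {
        return false;           /* no mapping covers this address */
    }
    if ((access & ACCESS_READ) && !(val->flags & MAP_F_READ)) {
        return false;           /* read not permitted */
    }
    if ((access & ACCESS_WRITE) && !(val->flags & MAP_F_WRITE)) {
        return false;           /* write not permitted */
    }
    /* Keep the offset within the mapping, rebase onto the physical side. */
    *out = iova - key->low + val->phys_addr;
    return true;
}
```

The bypass handling for unknown endpoints and unattached domains is left out on purpose; it depends on the negotiated features and is easier to read in the diff itself.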

Signed-off-by: Eric Auger 

---

v10 -> v11:
- take into account the new value struct and use
  g_tree_lookup_extended
- switched to error_report_once

v6 -> v7:
- implemented bypass-mode

v5 -> v6:
- replace error_report by qemu_log_mask

v4 -> v5:
- check the device domain is not NULL
- s/printf/error_report
- set flags to IOMMU_NONE in case of all translation faults
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 63 +++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index f25359cee2..de7cbb3c8f 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -72,3 +72,4 @@ virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d"
 virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
 virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
+virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index f0a56833a2..a83666557b 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -412,19 +412,80 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
 int iommu_idx)
 {
 IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+viommu_interval interval, *mapping_key;
+viommu_mapping *mapping_value;
+VirtIOIOMMU *s = sdev->viommu;
+viommu_endpoint *ep;
+bool bypass_allowed;
 uint32_t sid;
+bool found;
+
+interval.low = addr;
+interval.high = addr + 1;
 
 IOMMUTLBEntry entry = {
 .target_as = _space_memory,
 .iova = addr,
 .translated_addr = addr,
-.addr_mask = ~(hwaddr)0,
+.addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
 .perm = IOMMU_NONE,
 };
 
+bypass_allowed = virtio_has_feature(s->acked_features,
+VIRTIO_IOMMU_F_BYPASS);
+
 sid = virtio_iommu_get_sid(sdev);
 
 trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+qemu_mutex_lock(&s->mutex);
+
+ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+if (!ep) {
+if (!bypass_allowed) {
+error_report_once("%s sid=%d is not known!!", __func__, sid);
+} else {
+entry.perm = flag;
+}
+goto unlock;
+}
+
+if (!ep->domain) {
+if (!bypass_allowed) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s %02x:%02x.%01x not attached to any domain\n",
+  __func__, PCI_BUS_NUM(sid),
+  PCI_SLOT(sid), PCI_FUNC(sid));
+} else {
+entry.perm = flag;
+}
+goto unlock;
+}
+
+found = g_tree_lookup_extended(ep->domain->mappings, (gpointer)(&interval),
+   (void **)&mapping_key,
+   (void **)&mapping_value);
+if (!found) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s no mapping for 0x%"PRIx64" for sid=%d\n",
+  __func__, addr, sid);
+goto unlock;
+}
+
+if (((flag & IOMMU_RO) &&
+!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
+((flag & IOMMU_WO) &&
+!(mapping_value->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "Permission error on 0x%"PRIx64"(%d): allowed=%d\n",
+  addr, flag, mapping_value->flags);
+goto unlock;
+}
+entry.translated_addr = addr - mapping_key->low + mapping_value->phys_addr;
+entry.perm = flag;
+trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
+
+unlock:
+qemu_mutex_unlock(&s->mutex);
 return entry;
 }
 
-- 
2.20.1

[PATCH for-5.0 v11 16/20] hw/arm/virt-acpi-build: Introduce fill_iort_idmap helper

2019-11-22 Thread Eric Auger

To avoid code duplication, let's introduce a helper that
fills one entry of the IORT ID mapping array.
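
The idea of the helper can be tried outside QEMU with a small sketch like the one below. IdMapping, to_le32 and fill_idmap are hypothetical names; to_le32 stands in for QEMU's cpu_to_le32 and is written with byte stores so that it behaves the same on little-endian and big-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical ID mapping entry holding only the fields the helper sets. */
typedef struct IdMapping {
    uint32_t input_base;
    uint32_t id_count;
    uint32_t output_base;
    uint32_t output_reference;
} IdMapping;

/* Return 'v' laid out in memory as little-endian, whatever the host is. */
static uint32_t to_le32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v & 0xff),
        (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff),
        (uint8_t)((v >> 24) & 0xff),
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}

/* Fill entry 'i' of the array, converting every field to little-endian. */
static void fill_idmap(IdMapping *idmap, int i,
                       uint32_t input_base, uint32_t id_count,
                       uint32_t output_base, uint32_t output_reference)
{
    idmap[i].input_base = to_le32(input_base);
    idmap[i].id_count = to_le32(id_count);
    idmap[i].output_base = to_le32(output_base);
    idmap[i].output_reference = to_le32(output_reference);
}
```

Passing the array plus an index, rather than a pointer to one entry, keeps call sites short when a later patch needs to fill more than one entry of the same node.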

Signed-off-by: Eric Auger 

---

v8: new
---
 hw/arm/virt-acpi-build.c | 43 
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 4cd50175e0..825f3a79c0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -368,6 +368,17 @@ static void acpi_dsdt_add_power_button(Aml *scope)
 aml_append(scope, dev);
 }
 
+static inline void
+fill_iort_idmap(AcpiIortIdMapping *idmap, int i,
+uint32_t input_base, uint32_t id_count,
+uint32_t output_base, uint32_t output_reference)
+{
+idmap[i].input_base = cpu_to_le32(input_base);
+idmap[i].id_count = cpu_to_le32(id_count);
+idmap[i].output_base = cpu_to_le32(output_base);
+idmap[i].output_reference = cpu_to_le32(output_reference);
+}
+
 static void
 build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
@@ -426,13 +437,12 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 smmu->gerr_gsiv = cpu_to_le32(irq + 2);
 smmu->sync_gsiv = cpu_to_le32(irq + 3);
 
-/* Identity RID mapping covering the whole input RID range */
-idmap = &smmu->id_mapping_array[0];
-idmap->input_base = 0;
-idmap->id_count = cpu_to_le32(0xFFFF);
-idmap->output_base = 0;
-/* output IORT node is the ITS group node (the first node) */
-idmap->output_reference = cpu_to_le32(iort_node_offset);
+/*
+ * Identity RID mapping covering the whole input RID range.
+ * The output IORT node is the ITS group node (the first node).
+ */
+fill_iort_idmap(smmu->id_mapping_array, 0, 0, 0xFFFF, 0,
+iort_node_offset);
 }
 
 /* Root Complex Node */
@@ -450,18 +460,17 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 rc->memory_properties.memory_flags = 0x3; /* CCA = CPM = DCAS = 1 */
 rc->pci_segment_number = 0; /* MCFG pci_segment */
 
-/* Identity RID mapping covering the whole input RID range */
-idmap = &rc->id_mapping_array[0];
-idmap->input_base = 0;
-idmap->id_count = cpu_to_le32(0xFFFF);
-idmap->output_base = 0;
-
 if (vms->iommu == VIRT_IOMMU_SMMUV3) {
-/* output IORT node is the smmuv3 node */
-idmap->output_reference = cpu_to_le32(smmu_offset);
+/* Identity RID mapping and output IORT node is the iommu node */
+fill_iort_idmap(rc->id_mapping_array, 0, 0, 0xFFFF, 0,
+smmu_offset);
 } else {
-/* output IORT node is the ITS group node (the first node) */
-idmap->output_reference = cpu_to_le32(iort_node_offset);
+/*
+ * Identity RID mapping and the output IORT node is the ITS group
+ * node (the first node).
+ */
+fill_iort_idmap(rc->id_mapping_array, 0, 0, 0xFFFF, 0,
+iort_node_offset);
 }
 
 /*
-- 
2.20.1

[PATCH for-5.0 v11 07/20] virtio-iommu: Implement map/unmap

2019-11-22 Thread Eric Auger

This patch implements virtio_iommu_map/unmap.
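
The unmap rule (remove mappings fully covered by the request, refuse a request that would split a mapping) can be sketched on a plain array instead of a GTree. Map, unmap_range and the S_* status codes are hypothetical names for this illustration, and bounds are assumed to be inclusive.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
enum { S_OK = 0, S_RANGE = 1 };

typedef struct Map {
    uint64_t low;               /* first IOVA of the mapping */
    uint64_t high;              /* last IOVA of the mapping (inclusive) */
    bool live;                  /* false once the mapping is removed */
} Map;

/*
 * Unmap [low, high]. Mappings entirely inside the range are removed.
 * A mapping that only partially overlaps the range would have to be
 * split, so the request stops there and S_RANGE is returned.
 */
static int unmap_range(Map *maps, size_t n, uint64_t low, uint64_t high)
{
    for (size_t i = 0; i < n; i++) {
        Map *m = &maps[i];

        if (!m->live || m->high < low || high < m->low) {
            continue;           /* removed already, or no overlap */
        }
        if (low <= m->low && high >= m->high) {
            m->live = false;    /* fully covered: remove it */
        } else {
            return S_RANGE;     /* partial overlap: would split */
        }
    }
    return S_OK;
}
```

Note that in this sketch mappings removed before a partial overlap is found stay removed; whether a failed request should be undone entirely is a separate design question.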

Signed-off-by: Eric Auger 

---

v10 -> v11:
- revisit the implementation of unmap according to Peter's suggestion
- removed virt_addr and size from viommu_mapping struct
- use g_tree_lookup_extended()
- return VIRTIO_IOMMU_S_RANGE in case a mapping were
  to be split on unmap (instead of INVAL)

v5 -> v6:
- use new v0.6 fields
- replace error_report by qemu_log_mask

v3 -> v4:
- implement unmap semantics as specified in v0.4
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 65 ++--
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a373bdebb3..f25359cee2 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -65,6 +65,7 @@ virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_detach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
 virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
+virtio_iommu_unmap_done(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
 virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
 virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
 virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 138d5b2a9c..f0a56833a2 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "qemu/iov.h"
 #include "qemu-common.h"
 #include "hw/qdev-properties.h"
@@ -55,6 +56,11 @@ typedef struct viommu_interval {
 uint64_t high;
 } viommu_interval;
 
+typedef struct viommu_mapping {
+uint64_t phys_addr;
+uint32_t flags;
+} viommu_mapping;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
 return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -238,10 +244,35 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 uint64_t virt_start = le64_to_cpu(req->virt_start);
 uint64_t virt_end = le64_to_cpu(req->virt_end);
 uint32_t flags = le32_to_cpu(req->flags);
+viommu_domain *domain;
+viommu_interval *interval;
+viommu_mapping *mapping;
+
+interval = g_malloc0(sizeof(*interval));
+
+interval->low = virt_start;
+interval->high = virt_end;
+
+domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+if (!domain) {
+return VIRTIO_IOMMU_S_NOENT;
+}
+
+mapping = g_tree_lookup(domain->mappings, (gpointer)interval);
+if (mapping) {
+g_free(interval);
+return VIRTIO_IOMMU_S_INVAL;
+}
 
 trace_virtio_iommu_map(domain_id, virt_start, virt_end, phys_start, flags);
 
-return VIRTIO_IOMMU_S_UNSUPP;
+mapping = g_malloc0(sizeof(*mapping));
+mapping->phys_addr = phys_start;
+mapping->flags = flags;
+
+g_tree_insert(domain->mappings, interval, mapping);
+
+return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
@@ -250,10 +281,40 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
 uint32_t domain_id = le32_to_cpu(req->domain);
 uint64_t virt_start = le64_to_cpu(req->virt_start);
 uint64_t virt_end = le64_to_cpu(req->virt_end);
+viommu_mapping *iter_val;
+viommu_interval interval, *iter_key;
+viommu_domain *domain;
+int ret = VIRTIO_IOMMU_S_OK;
 
 trace_virtio_iommu_unmap(domain_id, virt_start, virt_end);
 
-return VIRTIO_IOMMU_S_UNSUPP;
+domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+if (!domain) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: no domain\n", __func__);
+return VIRTIO_IOMMU_S_NOENT;
+}
+interval.low = virt_start;
+interval.high = virt_end;
+
+while (g_tree_lookup_extended(domain->mappings, &interval,
+  (void **)&iter_key, (void**)&iter_val)) {
+uint64_t current_low = iter_key->low;
+uint64_t current_high = iter_key->high;
+
+if (interval.low <= current_low && interval.high >= current_high) {
+g_tree_remove(domain->mappings, iter_key);
+trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
+} else {
+qemu_log_mask(LOG_GUEST_ERROR,
+"%s: domain= %d Unmap [0x%"PRIx64",0x%"PRIx64"] forbidden as "
+"it would split existing mapping [0x%"PRIx64", 0x%"PRIx64"]\n",
+__func__, domain_id, interval.low, interval.high,
+current_low, current_high);
+ret = VIRTIO_IOMMU_S_RANGE;
+

Re: [PATCH v3 8/8] iotests: Test committing to short backing file

2019-11-22 Thread Eric Blake


On 11/22/19 10:05 AM, Kevin Wolf wrote:

Signed-off-by: Kevin Wolf 
---
  tests/qemu-iotests/274| 152 +
  tests/qemu-iotests/274.out| 203 ++
  tests/qemu-iotests/group  |   1 +
  tests/qemu-iotests/iotests.py |   2 +-
  4 files changed, 357 insertions(+), 1 deletion(-)
  create mode 100755 tests/qemu-iotests/274
  create mode 100644 tests/qemu-iotests/274.out




+iotests.log('== Resize tests ==')
+
+# Use different sizes for different allocation modes:
+#
+# We want to have at least one test where 32 bit truncation in the size of
+# the overlapping area becomes visible. This is covered by the
+# prealloc='off' case (1G to 6G is an overlap of 5G).
+#
+# However, we can only do this for modes that don't preallocate data
+# because otherwise we might run out of space on the test host.
+for (prealloc, base_size, top_size_old, top_size_new, off)  in [
+('off',   '6G',  '1G',  '8G',  '5G'),
+('metadata', '32G', '30G', '33G', '31G'),
+('falloc',   '10M',  '5M', '15M',  '9M'),
+('full', '16M',  '8M', '12M', '11M')]:
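
The comment in the quoted test explains why the prealloc='off' case uses a 5G overlap: that value does not fit in 32 bits. As a small worked illustration of the arithmetic only (the helper names `overlap64` and `overlap32` are hypothetical and do not come from the block layer), the sketch below shows what a 32-bit truncation would do to that size:

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Size of the area between the old top size and the base size. */
static uint64_t overlap64(uint64_t top_old, uint64_t base)
{
    return base - top_old;
}

/* Same computation, deliberately squeezed through a 32-bit value
 * to mimic the kind of bug the test wants to make visible. */
static uint32_t overlap32(uint64_t top_old, uint64_t base)
{
    return (uint32_t)(base - top_old);
}
```

With top_size_old = 1G and base_size = 6G, the 64-bit result is 5G, while the truncated one wraps to 1G (5G modulo 4G). The smaller falloc and full cases stay far below 4G, so they could not expose such a truncation.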


The changes since v3 make sense.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[PATCH for-5.0 v11 05/20] virtio-iommu: Endpoint and domains structs and helpers

2019-11-22 Thread Eric Auger

This patch introduces the domain and endpoint internal
datatypes. Both are stored in RB trees. The domain
owns a list of endpoints attached to it.

Helpers to get/put endpoints and domains are introduced.
get() helpers will become static in subsequent patches.
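
As a rough illustration of how a tree of mappings can be keyed by address ranges, the sketch below reproduces the interval comparison in plain C, without GLib. Two closed intervals compare equal as soon as they overlap, so a lookup with any range finds an existing mapping that intersects it. The names `interval` and `interval_overlap_cmp` are illustrative stand-ins, not the identifiers used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Closed interval [low, high]; hypothetical stand-in for viommu_interval. */
typedef struct interval {
    uint64_t low;
    uint64_t high;
} interval;

/*
 * Returns -1 if a lies entirely below b, 1 if it lies entirely above,
 * and 0 as soon as the two closed intervals share at least one address.
 */
static int interval_overlap_cmp(const interval *a, const interval *b)
{
    if (a->high < b->low) {
        return -1;
    } else if (b->high < a->low) {
        return 1;
    }
    return 0;
}
```

Note that the comparisons are strict. With `<=` instead of `<`, two closed intervals sharing a single boundary address would be reported as disjoint, which appears to be what the v10 -> v11 changelog entry below addresses.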

Signed-off-by: Eric Auger 

---

v10 -> v11:
- fixed interval_cmp (<= -> < and >= -> >)
- removed unused viommu field from endpoint
- removed Bharat's R-b

v9 -> v10:
- added Bharat's R-b

v6 -> v7:
- on virtio_iommu_find_add_as the bus number computation may
  not be finalized yet so we cannot register the EPs at that time.
  Hence, let's remove the get_endpoint and also do not use the
  bus number for building the memory region name string (only
  used for debug though).

v4 -> v5:
- initialize as->endpoint_list

v3 -> v4:
- new separate patch
---
 hw/virtio/trace-events   |   4 ++
 hw/virtio/virtio-iommu.c | 117 +++
 2 files changed, 121 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index b32169d56c..a373bdebb3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -67,3 +67,7 @@ virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uin
 virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
 virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
 virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
+virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d"
+virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
+virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
+virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2d7b1752b7..235bde2203 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -32,15 +32,116 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 #include "hw/virtio/virtio-iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci.h"
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+typedef struct viommu_domain {
+uint32_t id;
+GTree *mappings;
+QLIST_HEAD(, viommu_endpoint) endpoint_list;
+} viommu_domain;
+
+typedef struct viommu_endpoint {
+uint32_t id;
+viommu_domain *domain;
+QLIST_ENTRY(viommu_endpoint) next;
+} viommu_endpoint;
+
+typedef struct viommu_interval {
+uint64_t low;
+uint64_t high;
+} viommu_interval;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
 return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
 }
 
+static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
+{
+viommu_interval *inta = (viommu_interval *)a;
+viommu_interval *intb = (viommu_interval *)b;
+
+if (inta->high < intb->low) {
+return -1;
+} else if (intb->high < inta->low) {
+return 1;
+} else {
+return 0;
+}
+}
+
+static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
+{
+QLIST_REMOVE(ep, next);
+ep->domain = NULL;
+}
+
+viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id);
+viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id)
+{
+viommu_endpoint *ep;
+
+ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id));
+if (ep) {
+return ep;
+}
+ep = g_malloc0(sizeof(*ep));
+ep->id = ep_id;
+trace_virtio_iommu_get_endpoint(ep_id);
+g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
+return ep;
+}
+
+static void virtio_iommu_put_endpoint(gpointer data)
+{
+viommu_endpoint *ep = (viommu_endpoint *)data;
+
+if (ep->domain) {
+virtio_iommu_detach_endpoint_from_domain(ep);
+g_tree_unref(ep->domain->mappings);
+}
+
+trace_virtio_iommu_put_endpoint(ep->id);
+g_free(ep);
+}
+
+viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id);
+viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id)
+{
+viommu_domain *domain;
+
+domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+if (domain) {
+return domain;
+}
+domain = g_malloc0(sizeof(*domain));
+domain->id = domain_id;
+domain->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
+   NULL, (GDestroyNotify)g_free,
+   (GDestroyNotify)g_free);
+g_tree_insert(s->domains, GUINT_TO_POINTER(domain_id), domain);
+QLIST_INIT(&domain->endpoint_list);
+trace_virtio_iommu_get_domain(domain_id);
+return domain;
+}
+
+static void virtio_iommu_put_domain(gpointer data)
+{
+viommu_domain *domain = (viommu_domain *)data;
+viommu_endpoint *iter, *tmp;
+
+QLIST_FOREACH_SAFE(iter, &domain->endpoint_list, next, tmp) {
+virtio_iommu_detach_endpoint_from_domain(iter);
+}
+

[PATCH for-5.0 v11 06/20] virtio-iommu: Implement attach/detach command

2019-11-22 Thread Eric Auger

This patch implements the endpoint attach/detach to/from
a domain.
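
To make the attach/detach flow easier to follow, here is a minimal sketch in plain C. It assumes a simple per-domain counter in place of the GTree reference and the QLIST used by the patch, and the `S_*` status codes, type names and function names are hypothetical stand-ins for the VIRTIO_IOMMU_S_* values and the real helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, standing in for the VIRTIO_IOMMU_S_* values. */
enum { S_OK = 0, S_NOENT = 1, S_INVAL = 2 };

typedef struct domain {
    unsigned id;
    unsigned refs;      /* number of endpoints currently attached */
} domain;

typedef struct endpoint {
    unsigned id;
    domain *dom;        /* NULL while detached */
} endpoint;

/* Drop the endpoint's reference on its domain, then forget the domain. */
static void detach_from_domain(endpoint *ep)
{
    ep->dom->refs--;
    ep->dom = NULL;
}

/* Attach ep to d; an endpoint already attached elsewhere is detached first. */
static int attach(endpoint *ep, domain *d)
{
    if (ep->dom) {
        detach_from_domain(ep);
    }
    ep->dom = d;
    d->refs++;
    return S_OK;
}

/* Detach ep; an unknown endpoint or one with no domain is an error. */
static int detach(endpoint *ep)
{
    if (!ep) {
        return S_NOENT;
    }
    if (!ep->dom) {
        return S_INVAL;
    }
    detach_from_domain(ep);
    return S_OK;
}
```

The point of the sketch is the ordering: the reference is dropped while the endpoint still points at its domain, and only then is the pointer cleared.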

Signed-off-by: Eric Auger 

---
---
 hw/virtio/virtio-iommu.c | 43 
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 235bde2203..138d5b2a9c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -77,11 +77,12 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
 static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
 {
 QLIST_REMOVE(ep, next);
+g_tree_unref(ep->domain->mappings);
 ep->domain = NULL;
 }
 
-viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id);
-viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, uint32_t ep_id)
+static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
+  uint32_t ep_id)
 {
 viommu_endpoint *ep;
 
@@ -102,15 +103,14 @@ static void virtio_iommu_put_endpoint(gpointer data)
 
 if (ep->domain) {
 virtio_iommu_detach_endpoint_from_domain(ep);
-g_tree_unref(ep->domain->mappings);
 }
 
 trace_virtio_iommu_put_endpoint(ep->id);
 g_free(ep);
 }
 
-viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id);
-viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id)
+static viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s,
+  uint32_t domain_id)
 {
 viommu_domain *domain;
 
@@ -137,7 +137,6 @@ static void virtio_iommu_put_domain(gpointer data)
 QLIST_FOREACH_SAFE(iter, &domain->endpoint_list, next, tmp) {
 virtio_iommu_detach_endpoint_from_domain(iter);
 }
-g_tree_destroy(domain->mappings);
 trace_virtio_iommu_put_domain(domain->id);
 g_free(domain);
 }
@@ -186,10 +185,27 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 {
 uint32_t domain_id = le32_to_cpu(req->domain);
 uint32_t ep_id = le32_to_cpu(req->endpoint);
+viommu_domain *domain;
+viommu_endpoint *ep;
 
 trace_virtio_iommu_attach(domain_id, ep_id);
 
-return VIRTIO_IOMMU_S_UNSUPP;
+ep = virtio_iommu_get_endpoint(s, ep_id);
+if (ep->domain) {
+/*
+ * the device is already attached to a domain,
+ * detach it first
+ */
+virtio_iommu_detach_endpoint_from_domain(ep);
+}
+
+domain = virtio_iommu_get_domain(s, domain_id);
+QLIST_INSERT_HEAD(&domain->endpoint_list, ep, next);
+
+ep->domain = domain;
+g_tree_ref(domain->mappings);
+
+return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_detach(VirtIOIOMMU *s,
@@ -197,10 +213,21 @@ static int virtio_iommu_detach(VirtIOIOMMU *s,
 {
 uint32_t domain_id = le32_to_cpu(req->domain);
 uint32_t ep_id = le32_to_cpu(req->endpoint);
+viommu_endpoint *ep;
 
 trace_virtio_iommu_detach(domain_id, ep_id);
 
-return VIRTIO_IOMMU_S_UNSUPP;
+ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id));
+if (!ep) {
+return VIRTIO_IOMMU_S_NOENT;
+}
+
+if (!ep->domain) {
+return VIRTIO_IOMMU_S_INVAL;
+}
+
+virtio_iommu_detach_endpoint_from_domain(ep);
+return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_map(VirtIOIOMMU *s,
-- 
2.20.1

[PATCH for-5.0 v11 04/20] virtio-iommu: Add the iommu regions

2019-11-22 Thread Eric Auger

This patch initializes the iommu memory regions so that
PCIe end point transactions get translated. The translation
function is not yet implemented though.

Signed-off-by: Eric Auger 

---

v10 -> v11:
- use g_hash_table_new_full for allocating as_by_busptr

v9 -> v10:
- remove pc/virt machine headers
- virtio_iommu_find_add_as: mr_index introduced in that patch
  and name properly freed

v6 -> v7:
- use primary_bus
- rebase on new translate proto featuring iommu_idx

v5 -> v6:
- include qapi/error.h
- fix g_hash_table_lookup key in virtio_iommu_find_add_as

v4 -> v5:
- use PCI bus handle as a key
- use get_primary_pci_bus() callback

v3 -> v4:
- add trace_virtio_iommu_init_iommu_mr

v2 -> v3:
- use IOMMUMemoryRegion
- iommu mr name built with BDF
- rename smmu_get_sid into virtio_iommu_get_sid and use PCI_BUILD_BDF
---
 hw/virtio/trace-events   |  2 +
 hw/virtio/virtio-iommu.c | 92 
 include/hw/virtio/virtio-iommu.h |  2 +
 3 files changed, 96 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index c7276116e7..b32169d56c 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -65,3 +65,5 @@ virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_detach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
 virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
+virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
+virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index afd6397ac9..2d7b1752b7 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -23,6 +23,8 @@
 #include "hw/qdev-properties.h"
 #include "hw/virtio/virtio.h"
 #include "sysemu/kvm.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "trace.h"
 
 #include "standard-headers/linux/virtio_ids.h"
@@ -34,6 +36,50 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
+{
+return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
+}
+
+static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
+  int devfn)
+{
+VirtIOIOMMU *s = opaque;
+IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+static uint32_t mr_index;
+IOMMUDevice *sdev;
+
+if (!sbus) {
+sbus = g_malloc0(sizeof(IOMMUPciBus) +
+ sizeof(IOMMUDevice *) * IOMMU_PCI_DEVFN_MAX);
+sbus->bus = bus;
+g_hash_table_insert(s->as_by_busptr, bus, sbus);
+}
+
+sdev = sbus->pbdev[devfn];
+if (!sdev) {
+char *name = g_strdup_printf("%s-%d-%d",
+ TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+ mr_index++, devfn);
+sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(IOMMUDevice));
+
+sdev->viommu = s;
+sdev->bus = bus;
+sdev->devfn = devfn;
+
+trace_virtio_iommu_init_iommu_mr(name);
+
+memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
+ TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+ OBJECT(s), name,
+ UINT64_MAX);
+address_space_init(&sdev->as,
+   MEMORY_REGION(&sdev->iommu_mr), TYPE_VIRTIO_IOMMU);
+g_free(name);
+}
+return &sdev->as;
+}
+
 static int virtio_iommu_attach(VirtIOIOMMU *s,
struct virtio_iommu_req_attach *req)
 {
@@ -172,6 +218,27 @@ out:
 }
 }
 
+static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+IOMMUAccessFlags flag,
+int iommu_idx)
+{
+IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+uint32_t sid;
+
+IOMMUTLBEntry entry = {
+.target_as = &address_space_memory,
+.iova = addr,
+.translated_addr = addr,
+.addr_mask = ~(hwaddr)0,
+.perm = IOMMU_NONE,
+};
+
+sid = virtio_iommu_get_sid(sdev);
+
+trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+return entry;
+}
+
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
 VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
@@ -252,6 +319,15 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
 
 qemu_mutex_init(&s->mutex);
+
+memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
+

[PATCH for-5.0 v11 03/20] virtio-iommu: Decode the command payload

2019-11-22 Thread Eric Auger

This patch adds the command payload decoding and
introduces the functions that will do the actual
command handling. Those functions are not yet implemented.
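
The decoding step can be pictured as gathering the driver-written part of a request out of a scatter-gather list, leaving out the tail that the device fills in. The sketch below is a self-contained approximation in plain C: `seg`, `gather`, `segs_to_req` and the struct layouts are hypothetical, endianness conversion is left out, and the `-1` error return merely stands in for the status code used by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scatter-gather segment, similar in spirit to struct iovec. */
typedef struct seg {
    const void *base;
    size_t len;
} seg;

/* Hypothetical request layout: payload fields followed by a tail
 * that the device, not the driver, writes. */
typedef struct req_tail {
    uint8_t status;
    uint8_t reserved[3];
} req_tail;

typedef struct req_attach {
    uint32_t domain;
    uint32_t endpoint;
    req_tail tail;
} req_attach;

/* Copy up to n bytes from the segments into dst; return bytes copied. */
static size_t gather(const seg *segs, size_t cnt, void *dst, size_t n)
{
    size_t done = 0;

    for (size_t i = 0; i < cnt && done < n; i++) {
        size_t chunk = segs[i].len < n - done ? segs[i].len : n - done;

        memcpy((uint8_t *)dst + done, segs[i].base, chunk);
        done += chunk;
    }
    return done;
}

/* Decode the payload only: the tail is excluded from the expected size.
 * Returns 0 on success, -1 if the guest supplied too few bytes. */
static int segs_to_req(const seg *segs, size_t cnt, void *req, size_t req_sz)
{
    size_t payload_sz = req_sz - sizeof(req_tail);

    return gather(segs, cnt, req, payload_sz) == payload_sz ? 0 : -1;
}
```

Passing the full request size and subtracting the tail in one place keeps each per-command handler down to a single call, which is presumably what makes a macro-generated handler practical.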

Signed-off-by: Eric Auger 

---

v10 -> v11:
- use a macro for handle command functions

v9 -> v10:
- make virtio_iommu_handle_* more compact and
  remove get_payload_size

v7 -> v8:
- handle new domain parameter in detach
- remove reserved checks

v5 -> v6:
- change map/unmap semantics (remove size)

v4 -> v5:
- adopt new v0.5 terminology

v3 -> v4:
- no flags field anymore in struct virtio_iommu_req_unmap
- test reserved on attach/detach, change trace proto
- rebase on v2.10.0.
---
 hw/virtio/trace-events   |  4 +++
 hw/virtio/virtio-iommu.c | 76 +---
 2 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index f7dac39213..c7276116e7 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -61,3 +61,7 @@ virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
 virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_range=%d probe_size=0x%x"
 virtio_iommu_set_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_bits=%d probe_size=0x%x"
+virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
+virtio_iommu_detach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
+virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
+virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 7b25db3713..afd6397ac9 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -34,31 +34,83 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
-static int virtio_iommu_handle_attach(VirtIOIOMMU *s,
-  struct iovec *iov,
-  unsigned int iov_cnt)
+static int virtio_iommu_attach(VirtIOIOMMU *s,
+   struct virtio_iommu_req_attach *req)
 {
+uint32_t domain_id = le32_to_cpu(req->domain);
+uint32_t ep_id = le32_to_cpu(req->endpoint);
+
+trace_virtio_iommu_attach(domain_id, ep_id);
+
 return VIRTIO_IOMMU_S_UNSUPP;
 }
-static int virtio_iommu_handle_detach(VirtIOIOMMU *s,
-  struct iovec *iov,
-  unsigned int iov_cnt)
+
+static int virtio_iommu_detach(VirtIOIOMMU *s,
+   struct virtio_iommu_req_detach *req)
 {
+uint32_t domain_id = le32_to_cpu(req->domain);
+uint32_t ep_id = le32_to_cpu(req->endpoint);
+
+trace_virtio_iommu_detach(domain_id, ep_id);
+
 return VIRTIO_IOMMU_S_UNSUPP;
 }
-static int virtio_iommu_handle_map(VirtIOIOMMU *s,
-   struct iovec *iov,
-   unsigned int iov_cnt)
+
+static int virtio_iommu_map(VirtIOIOMMU *s,
+struct virtio_iommu_req_map *req)
 {
+uint32_t domain_id = le32_to_cpu(req->domain);
+uint64_t phys_start = le64_to_cpu(req->phys_start);
+uint64_t virt_start = le64_to_cpu(req->virt_start);
+uint64_t virt_end = le64_to_cpu(req->virt_end);
+uint32_t flags = le32_to_cpu(req->flags);
+
+trace_virtio_iommu_map(domain_id, virt_start, virt_end, phys_start, flags);
+
 return VIRTIO_IOMMU_S_UNSUPP;
 }
-static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
- struct iovec *iov,
- unsigned int iov_cnt)
+
+static int virtio_iommu_unmap(VirtIOIOMMU *s,
+  struct virtio_iommu_req_unmap *req)
 {
+uint32_t domain_id = le32_to_cpu(req->domain);
+uint64_t virt_start = le64_to_cpu(req->virt_start);
+uint64_t virt_end = le64_to_cpu(req->virt_end);
+
+trace_virtio_iommu_unmap(domain_id, virt_start, virt_end);
+
 return VIRTIO_IOMMU_S_UNSUPP;
 }
 
+static int virtio_iommu_iov_to_req(struct iovec *iov,
+   unsigned int iov_cnt,
+   void *req, size_t req_sz)
+{
+size_t sz, payload_sz = req_sz - sizeof(struct virtio_iommu_req_tail);
+
+sz = iov_to_buf(iov, iov_cnt, 0, req, payload_sz);
+if (unlikely(sz != payload_sz)) {
+return VIRTIO_IOMMU_S_INVAL;
+}
+return 0;
+}
+
+#define virtio_iommu_handle_req(__req)

Re: [RFC PATCH-for-5.0] hw/pci-host: Add Kconfig selector for IGD PCIe pass-through

2019-11-22 Thread Thomas Huth

On 22/11/2019 18.22, Philippe Mathieu-Daudé wrote:
> Introduce a kconfig selector to allow builds without Intel
> Integrated Graphics Device GPU PCIe passthrough.
> We keep the default as enabled.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because to be able to use the Kconfig-generated
> "config-devices.h" header we have to move this device
> out of $common-obj and build i440fx.o on a per-target
> basis, which is not optimal...

IMHO you should move the code out of i440fx.o and into a separate file
if possible. That's hopefully cleaner than #ifdeffing here, and you
hopefully only need to move the new code into "obj-" and can keep
i440fx.o in common-obj.

 Thomas

[PATCH for-5.0 v11 00/20] VIRTIO-IOMMU device

2019-11-22 Thread Eric Auger

This series implements the QEMU virtio-iommu device.

This matches the v0.12 spec and the corresponding virtio-iommu
driver upstreamed in 5.3.

The pci proxy for the virtio-iommu device is instantiated using
"-device virtio-iommu-pci". This series still relies on ACPI IORT/DT
integration. Note the ACPI IORT integration is not yet upstreamed
and testing needs to be based on Jean-Philippe's additional
kernel patches [1].

Work is ongoing to remove IORT adherence and allow the
bindings between the IOMMU and the root complex to be defined
and written into the PCI device configuration space. The outcome
of this work is uncertain at this stage though. See [2].

So only patches 1-11 fully rely on upstreamed kernel code. Others
should be considered as RFC.

This respin allows people to test on ARM and x86. It also
brings migration support (tested on ARM) and various cleanups.
Reserved regions are now passed through an array of properties.
A libqos test also is introduced to test the virtio-iommu API.

Note integration with vhost devices and vfio devices is not part
of this series. Please follow Bharat's respins [3].

The 1st Patch ("migration: Support QLIST migration") was sent
separately [4].

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.2-rc2-virtio-iommu-v11

[1] kernel branch to be used for guest
https://github.com/eauger/linux/tree/v5.4-rc8-virtio-iommu-iort
[2] [RFC 00/13] virtio-iommu on non-devicetree platforms
[3] VFIO/VHOST integration is not part of this series. Please follow
[PATCH RFC v5 0/5] virtio-iommu: VFIO integration respins
[4] [PATCH v6] migration: Support QLIST migration

Testing:
- tested with guest using virtio-net-pci
  (,vhost=off,iommu_platform,disable-modern=off,disable-legacy=on)
  and virtio-blk-pci
- migration on ARM
- on the x86 PC machine I get some non-translated AHCI transactions
  very early. This does not prevent the guest from booting and behaving
  properly. Warnings look like:
qemu-system-x86_64: virtio_iommu_translate sid=250 is not known!!
qemu-system-x86_64: no buffer available in event queue to report event
qemu-system-x86_64: AHCI: Failed to start FIS receive engine: bad FIS
receive buffer address

History:

v10 -> v11:
- introduce virtio_iommu_handle_req macro
- migration support
- introduce DEFINE_PROP_INTERVAL and pass reserved regions
  through an array of those
- domain gtree simplification

v9 -> v10:
- rebase on 4.1.0-rc2, compliance with 0.12 spec
- removed ACPI part
- cleanup (see individual change logs)
- moved to a PATCH series

v8 -> v9:
- virtio-iommu-pci device needs to be instantiated from the command
  line (RID is not imposed anymore).
- tail structure properly initialized

v7 -> v8:
- virtio-iommu-pci added
- virt instantiation modified
- DT and ACPI modified to exclude the iommu RID from the mapping
- VIRTIO_IOMMU_F_BYPASS, VIRTIO_F_VERSION_1 features exposed

v6 -> v7:
- rebase on qemu 3.0.0-rc3
- minor update against v0.7
- fix issue with EP not on pci.0 and ACPI probing
- change the instantiation method

v5 -> v6:
- minor update against v0.6 spec
- fix g_hash_table_lookup in virtio_iommu_find_add_as
- replace some error_reports by qemu_log_mask(LOG_GUEST_ERROR, ...)

v4 -> v5:
- event queue and fault reporting
- we now return the IOAPIC MSI region if the virtio-iommu is instantiated
  in a PC machine.
- we bypass transactions on MSI HW region and fault on reserved ones.
- We support ACPI boot with mach-virt (based on IORT proposal)
- We moved to the new driver naming conventions
- simplified mach-virt instantiation
- worked around the disappearing of pci_find_primary_bus
- in virtio_iommu_translate, check the dev->as is not NULL
- initialize as->device_list in virtio_iommu_get_as
- initialize bufstate.error to false in virtio_iommu_probe

v3 -> v4:
- probe request support although no reserved region is returned at
  the moment
- unmap semantics less strict, as specified in v0.4
- device registration, attach/detach revisited
- split into smaller patches to ease review
- propose a way to inform the IOMMU mr about the page_size_mask
  of underlying HW IOMMU, if any
- remove warning associated with the translation of the MSI doorbell

v2 -> v3:
- rebase on top of 2.10-rc0 and especially
  [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
- add mutex init
- fix as->mappings deletion using g_tree_ref/unref
- when a dev is attached whereas it is already attached to
  another address space, first detach it
- fix some error values
- page_sizes = TARGET_PAGE_MASK;
- I haven't changed the unmap() semantics yet, waiting for the
  next virtio-iommu spec revision.

v1 -> v2:
- fix redefinition of viommu_as typedef



Eric Auger (20):
  migration: Support QLIST migration
  virtio-iommu: Add skeleton
  virtio-iommu: Decode the command payload
  virtio-iommu: Add the iommu regions
  virtio-iommu: Endpoint and domains structs and helpers
  virtio-iommu: Implement attach/detach command
  virtio-iommu: Implement

[PATCH for-5.0 v11 01/20] migration: Support QLIST migration

2019-11-22 Thread Eric Auger

Support QLIST migration using the same principle as QTAILQ:
94869d5c52 ("migration: migrate QTAILQ").

The VMSTATE_QLIST_V macro has the same proto as VMSTATE_QTAILQ_V.
The change mainly resides in QLIST RAW macros: QLIST_RAW_INSERT_HEAD
and QLIST_RAW_REVERSE.
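
To show why both the next and prev fields matter when a list is rebuilt, here is a typed sketch of head insertion and in-place reversal for a BSD-style LIST layout, where each element stores the address of the pointer that points at it. The names `node`, `list_head`, `insert_head` and `reverse` are hypothetical; the patch works on raw offsets instead. One plausible use is loading elements by head insertion and then reversing to restore the original order, but that rationale is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typed list, laid out like a BSD-style LIST: each node holds
 * a next pointer and the address of the pointer that points at it. */
typedef struct node {
    int val;
    struct node *next;
    struct node **prev;
} node;

typedef struct list_head {
    node *first;
} list_head;

/* Insert n at the head, updating the old first node's back pointer. */
static void insert_head(list_head *h, node *n)
{
    n->next = h->first;
    if (h->first) {
        h->first->prev = &n->next;
    }
    h->first = n;
    n->prev = &h->first;
}

/* Reverse in place, fixing up both next and prev for every node. */
static void reverse(list_head *h)
{
    node *iter = h->first, *prev = NULL, *next;

    while (iter) {
        next = iter->next;
        /* After reversal, the old successor (or the head) points at iter. */
        iter->prev = next ? &next->next : &h->first;
        iter->next = prev;
        prev = iter;
        iter = next;
    }
    h->first = prev;
}
```

A quick sanity check for such a list is that `*p->prev == p` holds for every node `p`. A reversal that only rewires the next pointers would break this invariant, which is the class of problem the changelog below describes for the prev field.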

Tests also are provided.

Signed-off-by: Eric Auger 

---

v5 - v6:
- by doing more advanced testing with virtio-iommu migration
  I noticed this was broken. "prev" field was not set properly.
  I improved the tests to manipulate both the next and prev
  fields.
- Removed Peter and Juan's R-b
---
 include/migration/vmstate.h |  21 +
 include/qemu/queue.h|  39 +
 migration/trace-events  |   5 ++
 migration/vmstate-types.c   |  70 +++
 tests/test-vmstate.c| 170 
 5 files changed, 305 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index ac4f46a67d..08683d93c6 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -227,6 +227,7 @@ extern const VMStateInfo vmstate_info_tmp;
 extern const VMStateInfo vmstate_info_bitmap;
 extern const VMStateInfo vmstate_info_qtailq;
 extern const VMStateInfo vmstate_info_gtree;
+extern const VMStateInfo vmstate_info_qlist;
 
 #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0)
 /*
@@ -796,6 +797,26 @@ extern const VMStateInfo vmstate_info_gtree;
 .offset   = offsetof(_state, _field),  
\
 }
 
+/*
+ * For migrating a QLIST
+ * Target QLIST needs be properly initialized.
+ * _type: type of QLIST element
+ * _next: name of QLIST_ENTRY entry field in QLIST element
+ * _vmsd: VMSD for QLIST element
+ * size: size of QLIST element
+ * start: offset of QLIST_ENTRY in QTAILQ element
+ */
+#define VMSTATE_QLIST_V(_field, _state, _version, _vmsd, _type, _next)  \
+{\
+.name = (stringify(_field)), \
+.version_id   = (_version),  \
+.vmsd = &(_vmsd),\
+.size = sizeof(_type),   \
+.info = &vmstate_info_qlist, \
+.offset   = offsetof(_state, _field),\
+.start= offsetof(_type, _next),  \
+}
+
 /* _f : field name
_f_n : num of elements field_name
_n : num of elements
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 4764d93ea3..4d4554a7ce 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -501,4 +501,43 @@ union {
 \
 QTAILQ_RAW_TQH_CIRC(head)->tql_prev = QTAILQ_RAW_TQE_CIRC(elm, entry); 
 \
 } while (/*CONSTCOND*/0)
 
+#define QLIST_RAW_FIRST(head) \
+field_at_offset(head, 0, void *)
+
+#define QLIST_RAW_NEXT(elm, entry) \
+field_at_offset(elm, entry, void *)
+
+#define QLIST_RAW_PREVIOUS(elm, entry) \
+field_at_offset(elm, entry + sizeof(void *), void *)
+
+#define QLIST_RAW_FOREACH(elm, head, entry) \
+for ((elm) = *QLIST_RAW_FIRST(head); \
+ (elm); \
+ (elm) = *QLIST_RAW_NEXT(elm, entry))
+
+#define QLIST_RAW_INSERT_HEAD(head, elm, entry) do { \
+void *first = *QLIST_RAW_FIRST(head); \
+*QLIST_RAW_FIRST(head) = elm; \
+*QLIST_RAW_PREVIOUS(elm, entry) = QLIST_RAW_FIRST(head); \
+if (first) { \
+*QLIST_RAW_NEXT(elm, entry) = first; \
+*QLIST_RAW_PREVIOUS(first, entry) = QLIST_RAW_NEXT(elm, entry); \
+} else { \
+*QLIST_RAW_NEXT(elm, entry) = NULL; \
+} \
+} while (0)
+
+#define QLIST_RAW_REVERSE(head, elm, entry) do { \
+void *iter = *QLIST_RAW_FIRST(head), *prev = NULL, *next; \
+while (iter) { \
+next = *QLIST_RAW_NEXT(iter, entry); \
+*QLIST_RAW_PREVIOUS(iter, entry) = QLIST_RAW_NEXT(next, entry); \
+*QLIST_RAW_NEXT(iter, entry) = prev; \
+prev = iter;

Re: [PATCH 6/6] travis.yml: Enable builds on arm64, ppc64le and s390x

2019-11-22 Thread Alex Bennée



Thomas Huth  writes:

> Travis recently added the possibility to test on these architectures,
> too, so let's enable them in our travis.yml file to extend our test
> coverage.

This is good as far as it goes but it would be nice to exercise the
respective TCG backends. I've added two commits to:

  https://github.com/stsquad/qemu/commits/review/multiarch-testing

which allow for that. I'll know if they worked properly in an hour or two
once the testing has finished.

>
> Unfortunately, the libssh in this Ubuntu version (bionic) is in a pretty
> unusable Frankenstein state and libspice-server-dev is not available here,
> so we can not use the global list of packages to install, but have to
> provide individual package lists instead.
>
> Signed-off-by: Thomas Huth 
> ---
>  .travis.yml | 83 +
>  1 file changed, 83 insertions(+)
>
> diff --git a/.travis.yml b/.travis.yml
> index c09b6a0014..cf48ee452c 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -360,6 +360,89 @@ matrix:
>  - TEST_CMD="make -j3 check-tcg V=1"
>  - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
>  
> +- arch: arm64
> +  addons:
> +apt_packages:
> +  - libaio-dev
> +  - libattr1-dev
> +  - libbrlapi-dev
> +  - libcap-dev
> +  - libcap-ng-dev
> +  - libgcrypt20-dev
> +  - libgnutls28-dev
> +  - libgtk-3-dev
> +  - libiscsi-dev
> +  - liblttng-ust-dev
> +  - libncurses5-dev
> +  - libnfs-dev
> +  - libnss3-dev
> +  - libpixman-1-dev
> +  - libpng-dev
> +  - librados-dev
> +  - libsdl2-dev
> +  - libseccomp-dev
> +  - liburcu-dev
> +  - libusb-1.0-0-dev
> +  - libvdeplug-dev
> +  - libvte-2.91-dev
> +  env:
> +- CONFIG="--target-list=${MAIN_SOFTMMU_TARGETS},x86_64-linux-user"
> +
> +- arch: ppc64le
> +  addons:
> +apt_packages:
> +  - libaio-dev
> +  - libattr1-dev
> +  - libbrlapi-dev
> +  - libcap-dev
> +  - libcap-ng-dev
> +  - libgcrypt20-dev
> +  - libgnutls28-dev
> +  - libgtk-3-dev
> +  - libiscsi-dev
> +  - liblttng-ust-dev
> +  - libncurses5-dev
> +  - libnfs-dev
> +  - libnss3-dev
> +  - libpixman-1-dev
> +  - libpng-dev
> +  - librados-dev
> +  - libsdl2-dev
> +  - libseccomp-dev
> +  - liburcu-dev
> +  - libusb-1.0-0-dev
> +  - libvdeplug-dev
> +  - libvte-2.91-dev
> +  env:
> +- CONFIG="--target-list=${MAIN_SOFTMMU_TARGETS},x86_64-linux-user"
> +
> +- arch: s390x
> +  addons:
> +apt_packages:
> +  - libaio-dev
> +  - libattr1-dev
> +  - libbrlapi-dev
> +  - libcap-dev
> +  - libcap-ng-dev
> +  - libgcrypt20-dev
> +  - libgnutls28-dev
> +  - libgtk-3-dev
> +  - libiscsi-dev
> +  - liblttng-ust-dev
> +  - libncurses5-dev
> +  - libnfs-dev
> +  - libnss3-dev
> +  - libpixman-1-dev
> +  - libpng-dev
> +  - librados-dev
> +  - libsdl2-dev
> +  - libseccomp-dev
> +  - liburcu-dev
> +  - libusb-1.0-0-dev
> +  - libvdeplug-dev
> +  - libvte-2.91-dev
> +  env:
> +- CONFIG="--target-list=${MAIN_SOFTMMU_TARGETS},x86_64-linux-user"
>  
>  # Release builds
>  # The make-release script expect a QEMU version, so our tag must start 
> with a 'v'.


-- 
Alex Bennée

Re: [PATCH v2 4/5] s390x: Move clear reset

2019-11-22 Thread David Hildenbrand

On 22.11.19 18:15, Janosch Frank wrote:
> On 11/22/19 3:30 PM, David Hildenbrand wrote:
>> On 22.11.19 15:00, Janosch Frank wrote:
>>> Let's also move the clear reset function into the reset handler.
>>>
>>> Signed-off-by: Janosch Frank 
>>> ---
>>>  target/s390x/cpu-qom.h |  1 +
>>>  target/s390x/cpu.c | 50 --
>>>  2 files changed, 10 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/target/s390x/cpu-qom.h b/target/s390x/cpu-qom.h
>>> index 6f0a12042e..dbe5346ec9 100644
>>> --- a/target/s390x/cpu-qom.h
>>> +++ b/target/s390x/cpu-qom.h
>>> @@ -37,6 +37,7 @@ typedef struct S390CPUDef S390CPUDef;
>>>  typedef enum cpu_reset_type {
>>>  S390_CPU_RESET_NORMAL,
>>>  S390_CPU_RESET_INITIAL,
>>> +S390_CPU_RESET_CLEAR,
>>>  } cpu_reset_type;
>>>  
>>>  /**
>>> diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
>>> index 1f423fb676..017181fe4a 100644
>>> --- a/target/s390x/cpu.c
>>> +++ b/target/s390x/cpu.c
>>> @@ -94,6 +94,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type 
>>> type)
>>>  s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>>>  
>>>  switch (type) {
>>> +case S390_CPU_RESET_CLEAR:
>>> +memset(env, 0, offsetof(CPUS390XState, 
>>> start_initial_reset_fields));
>>
>> I think the preferred term in QEMU is "fall through".
>>
>>> +/* Fallthrough */
>>>  case S390_CPU_RESET_INITIAL:
>>>  /* initial reset does not clear everything! */
>>>  memset(>start_initial_reset_fields, 0,
>>> @@ -121,46 +124,6 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type 
>>> type)
>>>  }
>>>  }
>>>  
>>> -/* CPUClass:reset() */
>>> -static void s390_cpu_full_reset(CPUState *s)
>>> -{
>>> -S390CPU *cpu = S390_CPU(s);
>>> -S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
>>> -CPUS390XState *env = >env;
>>> -
>>> -scc->parent_reset(s);
>>> -cpu->env.sigp_order = 0;
>>> -s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>>> -
>>> -memset(env, 0, offsetof(CPUS390XState, end_reset_fields));
>>> -
>>> -/* architectured initial values for CR 0 and 14 */
>>> -env->cregs[0] = CR0_RESET;
>>> -env->cregs[14] = CR14_RESET;
>>> -
>>> -#if defined(CONFIG_USER_ONLY)
>>> -/* user mode should always be allowed to use the full FPU */
>>> -env->cregs[0] |= CR0_AFP;
>>> -if (s390_has_feat(S390_FEAT_VECTOR)) {
>>> -env->cregs[0] |= CR0_VECTOR;
>>> -}
>>> -#endif
>>
>> Huh, what happened to that change?
> 
> Btw., wouldn't we need that for both initial and clear reset?

user-only only does a CPU reset when starting up, to initialize the CPU.
No other resets will be triggered.
-- 

Thanks,

David / dhildenb
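
For readers following the reset discussion, here is a minimal sketch of the tiered-reset pattern the patch uses: a clear reset falls through to an initial reset, which falls through to a normal reset, and each tier zeroes a region of the state selected with offsetof(). The DemoCPUState layout, the demo_ names and the DEMO_CR0_RESET value are assumptions made up for illustration; they are not the real CPUS390XState fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU state; the field order decides what each tier clears. */
typedef struct DemoCPUState {
    int regs[4];      /* cleared by a clear reset only */
    int cregs[2];     /* cleared by clear and initial resets */
    int pending_irq;  /* cleared by every reset */
    int cpu_id;       /* survives all resets */
} DemoCPUState;

typedef enum {
    DEMO_RESET_NORMAL,
    DEMO_RESET_INITIAL,
    DEMO_RESET_CLEAR,
} demo_reset_type;

#define DEMO_CR0_RESET 0xE0 /* made-up architected reset value */

/* Zero the bytes of *env in the half-open offset range [from, to). */
static void demo_clear_range(DemoCPUState *env, size_t from, size_t to)
{
    assert(from <= to && to <= sizeof(*env));
    memset((char *)env + from, 0, to - from);
}

static void demo_cpu_reset(DemoCPUState *env, demo_reset_type type)
{
    switch (type) {
    case DEMO_RESET_CLEAR:
        demo_clear_range(env, 0, offsetof(DemoCPUState, cregs));
        /* fall through */
    case DEMO_RESET_INITIAL:
        demo_clear_range(env, offsetof(DemoCPUState, cregs),
                         offsetof(DemoCPUState, pending_irq));
        env->cregs[0] = DEMO_CR0_RESET;
        /* fall through */
    case DEMO_RESET_NORMAL:
        demo_clear_range(env, offsetof(DemoCPUState, pending_irq),
                         offsetof(DemoCPUState, cpu_id));
        break;
    }
}
```

With this shape, a stronger reset always implies the weaker ones, which is why one switch can replace a separate full-reset function.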

Re: [PATCH for-5.0 v5 00/23] ppc/pnv: add XIVE support for KVM guests

2019-11-22 Thread Cédric Le Goater

On 15/11/2019 17:24, Cédric Le Goater wrote:
> Hello,
> 
> The QEMU PowerNV machine emulates a baremetal OpenPOWER system and
> acts as a hypervisor (L0). Supporting emulation of KVM to run guests
> (L1) requires a few more extensions, among which guest support for the
> XIVE interrupt controller on POWER9 processor.
> 
> The following changes extend the XIVE models with the new XiveFabric
> and XivePresenter interfaces to provide support for XIVE escalations
> and interrupt resend. This mechanism is used by XIVE to notify the
> hypervisor that a vCPU is not dispatched on a HW thread. Tested on a
> QEMU PowerNV machine and a simple QEMU pseries guest doing network on
> a local bridge.
> 
> The XIVE interrupt controller offers a way to increase the XIVE
> resources per chip by configuring multiple XIVE blocks on a chip. This
> is not currently supported by the model. However, some configurations,
> such as OPAL/skiboot, use one block-per-chip configuration with some
> optimizations. One of them is to override the hardwired chip ID by the
> block id in the PowerBUS operations and for CAM line compares. This
> patchset improves the support for this setup. Tested with 4 chips.
> 
> A series from Suraj adding guest support in the Radix MMU model of the
> QEMU PowerNV machine is still required and will be sent later. The
> whole patchset can be found under :
> 
>   https://github.com/legoater/qemu/tree/powernv-4.2


 [ ... ]

> Cédric Le Goater (23):
>   ppc/xive: Record the IPB in the associated NVT
>   ppc/xive: Introduce helpers for the NVT id
>   ppc/pnv: Remove pnv_xive_vst_size() routine
>   ppc/pnv: Dump the XIVE NVT table
>   ppc/pnv: Quiesce some XIVE errors
>   ppc/xive: Introduce OS CAM line helpers
>   ppc/xive: Check V bit in TM_PULL_POOL_CTX
>   ppc/xive: Introduce a XivePresenter interface
>   ppc/xive: Implement the XivePresenter interface

David,

I have reworked the following patches to address Greg's comments 
and your comment on "ppc/pnv: Dump the XIVE NVT table". 

Shall I wait for some feedback from you or just resend ? 

Thanks,

C.

>   ppc/pnv: Loop on the threads of the chip to find a matching NVT
>   ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper
>   ppc/xive: Introduce a XiveFabric interface
>   ppc/pnv: Implement the XiveFabric interface
>   ppc/spapr: Implement the XiveFabric interface
>   ppc/xive: Use the XiveFabric and XivePresenter interfaces
>   ppc/xive: Extend the TIMA operation with a XivePresenter parameter
>   ppc/pnv: Clarify how the TIMA is accessed on a multichip system
>   ppc/xive: Move the TIMA operations to the controller model
>   ppc/xive: Remove the get_tctx() XiveRouter handler
>   ppc/xive: Introduce a xive_tctx_ipb_update() helper
>   ppc/xive: Synthesize interrupt from the saved IPB in the NVT
>   ppc/pnv: Introduce a pnv_xive_block_id() helper
>   ppc/pnv: Extend XiveRouter with a get_block_id() handler
> 
>  include/hw/ppc/pnv.h   |  15 ++
>  include/hw/ppc/pnv_xive.h  |   3 -
>  include/hw/ppc/xive.h  |  72 ++--
>  include/hw/ppc/xive_regs.h |  24 +++
>  hw/intc/pnv_xive.c | 360 -
>  hw/intc/spapr_xive.c   |  88 -
>  hw/intc/xive.c | 350 
>  hw/ppc/pnv.c   |  37 +++-
>  hw/ppc/spapr.c |  36 
>  9 files changed, 691 insertions(+), 294 deletions(-)
>

Re: [PATCH 2/4] virtiofd: Create a notification queue

2019-11-22 Thread Dr. David Alan Gilbert

* Vivek Goyal (vgo...@redhat.com) wrote:
> On Fri, Nov 22, 2019 at 10:19:03AM +0000, Stefan Hajnoczi wrote:
> > On Fri, Nov 15, 2019 at 03:55:41PM -0500, Vivek Goyal wrote:
> > >  /* Callback from libvhost-user */
> > >  static void fv_set_features(VuDev *dev, uint64_t features)
> > >  {
> > > +struct fv_VuDev *vud = container_of(dev, struct fv_VuDev, dev);
> > > +struct fuse_session *se = vud->se;
> > > +
> > > +if ((1 << VIRTIO_FS_F_NOTIFICATION) & features) {
> > 
> > For consistency 1ull should be used.  That way the reader does not have
> > to check the bit position to verify that the bitmap isn't truncated at
> > 32 bits.
> 
> Ok, will do.
> 
> > 
> > > +vud->notify_enabled = true;
> > > +se->notify_enabled = true;
> > 
> > Only one copy of this field is needed.  vud has a pointer to se.
> 
> I need to access ->notify_enabled in passthrough_ll.c to determine if
> notification queue is enabled or not. That determines if async locks are
> supported or not.  And based on that either -EOPNOTSUPP is returned or
> a response to wait is returned.
> 
> I did not see passthrough_ll.c accessing vud. I did see it having access
> to session object though. So I created a copy there.
> 
> But I am open to suggestions on what's the best way to access this
> information in passthrough_ll.c
> 
> > 
> > > +}
> > >  }
> > >  
> > >  /*
> > > @@ -662,6 +671,65 @@ static void fv_queue_worker(gpointer data, gpointer 
> > > user_data)
> > >  free(req);
> > >  }
> > >  
> > > +static void *fv_queue_notify_thread(void *opaque)
> > > +{
> > > +struct fv_QueueInfo *qi = opaque;
> > > +
> > > +fuse_log(FUSE_LOG_INFO, "%s: Start for queue %d kick_fd %d\n", 
> > > __func__,
> > > + qi->qidx, qi->kick_fd);
> > > +
> > > +while (1) {
> > > +struct pollfd pf[2];
> > > +
> > > +pf[0].fd = qi->kick_fd;
> > > +pf[0].events = POLLIN;
> > > +pf[0].revents = 0;
> > > +pf[1].fd = qi->kill_fd;
> > > +pf[1].events = POLLIN;
> > > +pf[1].revents = 0;
> > > +
> > > +fuse_log(FUSE_LOG_DEBUG, "%s: Waiting for Queue %d event\n", 
> > > __func__,
> > > + qi->qidx);
> > > +int poll_res = ppoll(pf, 2, NULL, NULL);
> > > +
> > > +if (poll_res == -1) {
> > > +if (errno == EINTR) {
> > > +fuse_log(FUSE_LOG_INFO, "%s: ppoll interrupted, going 
> > > around\n",
> > > + __func__);
> > > +continue;
> > > +}
> > > +fuse_log(FUSE_LOG_ERR, "fv_queue_thread ppoll: %m\n");
> > > +break;
> > > +}
> > > +assert(poll_res >= 1);
> > > +if (pf[0].revents & (POLLERR | POLLHUP | POLLNVAL)) {
> > > +fuse_log(FUSE_LOG_ERR, "%s: Unexpected poll revents %x Queue 
> > > %d\n",
> > > + __func__, pf[0].revents, qi->qidx);
> > > + break;
> > > +}
> > > +if (pf[1].revents & (POLLERR | POLLHUP | POLLNVAL)) {
> > > +fuse_log(FUSE_LOG_ERR, "%s: Unexpected poll revents %x Queue 
> > > %d"
> > > + "killfd\n", __func__, pf[1].revents, qi->qidx);
> > > +break;
> > > +}
> > > +if (pf[1].revents) {
> > > +fuse_log(FUSE_LOG_INFO, "%s: kill event on queue %d - 
> > > quitting\n",
> > > + __func__, qi->qidx);
> > > +break;
> > > +}
> > > +assert(pf[0].revents & POLLIN);
> > > +fuse_log(FUSE_LOG_DEBUG, "%s: Got queue event on Queue %d\n", 
> > > __func__,
> > > + qi->qidx);
> > > +
> > > +eventfd_t evalue;
> > > > +if (eventfd_read(qi->kick_fd, &evalue)) {
> > > +fuse_log(FUSE_LOG_ERR, "Eventfd_read for queue: %m\n");
> > > +break;
> > > +}
> > > +}
> > > +return NULL;
> > > +}
> > 
> > It's difficult to review this function without any actual functionality using
> > the virtqueue.  I'm not sure a thread is even needed since the device
> > only needs to get a buffer when it has a notification for the driver.
> > I'll have to wait for the following patches to see what happens here...
> 
> This might very well be redundant. I am not sure. Can get rid of
> this thread if not needed at all. So we don't need to monitor even
> kill_fd and take any special action?

The kill_fd is internal to virtiofsd; it's only used as a way for the
main thread to cause the queue thread to exit;  if you've not got the
thread, you don't need the kill_fd.

Dave
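
A small sketch of the kick/kill pattern described above may help. It waits on two descriptors and reports which one fired, checking the kill descriptor first as the quoted loop does. To keep it portable it uses plain pipes and poll() rather than eventfds and ppoll(); that substitution, and all demo_ names, are assumptions for illustration and not virtiofsd code.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

typedef enum { DEMO_EV_KICK, DEMO_EV_KILL, DEMO_EV_ERROR } demo_event;

/*
 * Block until the kick or the kill descriptor is readable.
 * A kill event wins over a pending kick, so the worker exits promptly.
 */
static demo_event demo_wait_event(int kick_fd, int kill_fd)
{
    struct pollfd pf[2];
    char byte;
    int res;

    pf[0].fd = kick_fd;
    pf[0].events = POLLIN;
    pf[0].revents = 0;
    pf[1].fd = kill_fd;
    pf[1].events = POLLIN;
    pf[1].revents = 0;

    /* Retry when a signal interrupts the wait. */
    do {
        res = poll(pf, 2, -1);
    } while (res == -1 && errno == EINTR);

    if (res < 1) {
        return DEMO_EV_ERROR;
    }
    if (pf[1].revents) {
        return DEMO_EV_KILL;
    }
    /* Consume one kick so the descriptor does not stay readable. */
    if ((pf[0].revents & POLLIN) && read(kick_fd, &byte, 1) == 1) {
        return DEMO_EV_KICK;
    }
    return DEMO_EV_ERROR;
}
```

A queue thread would call demo_wait_event() in a loop and leave the loop on DEMO_EV_KILL. Without such a thread there is nothing to wake up, so the kill descriptor has no purpose.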

> > 
> > > @@ -378,12 +382,23 @@ static void vuf_set_status(VirtIODevice *vdev, 
> > > uint8_t status)
> > >  }
> > >  }
> > >  
> > > -static uint64_t vuf_get_features(VirtIODevice *vdev,
> > > -  uint64_t requested_features,
> > > -  Error **errp)
> > > +static uint64_t vuf_get_features(VirtIODevice *vdev, uint64_t features,
> > > +

[PATCH] linux-user: fix translation of statx structures

2019-11-22 Thread Ariadne Conill

All timestamps were copied to atime instead of to their respective
fields.

Signed-off-by: Ariadne Conill 
---
 linux-user/syscall.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ce399a55f0..171c0caef3 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6743,12 +6743,12 @@ static inline abi_long host_to_target_statx(struct 
target_statx *host_stx,
 __put_user(host_stx->stx_attributes_mask, &target_stx->stx_attributes_mask);
 __put_user(host_stx->stx_atime.tv_sec, &target_stx->stx_atime.tv_sec);
 __put_user(host_stx->stx_atime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_atime.tv_nsec);
-__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_atime.tv_sec);
-__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_atime.tv_nsec);
+__put_user(host_stx->stx_btime.tv_sec, &target_stx->stx_btime.tv_sec);
+__put_user(host_stx->stx_btime.tv_nsec, &target_stx->stx_btime.tv_nsec);
+__put_user(host_stx->stx_ctime.tv_sec, &target_stx->stx_ctime.tv_sec);
+__put_user(host_stx->stx_ctime.tv_nsec, &target_stx->stx_ctime.tv_nsec);
+__put_user(host_stx->stx_mtime.tv_sec, &target_stx->stx_mtime.tv_sec);
+__put_user(host_stx->stx_mtime.tv_nsec, &target_stx->stx_mtime.tv_nsec);
 __put_user(host_stx->stx_rdev_major, &target_stx->stx_rdev_major);
 __put_user(host_stx->stx_rdev_minor, &target_stx->stx_rdev_minor);
 __put_user(host_stx->stx_dev_major, &target_stx->stx_dev_major);
-- 
2.24.0
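
The bug class fixed here (a copy-pasted destination field) can be avoided by naming each field only once. The sketch below is one way to do that, using hypothetical demo_ structures rather than the real target_statx layout or __put_user().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the statx timestamp fields. */
typedef struct {
    int64_t tv_sec;
    uint32_t tv_nsec;
} demo_timestamp;

typedef struct {
    demo_timestamp atime;
    demo_timestamp btime;
    demo_timestamp ctime;
    demo_timestamp mtime;
} demo_statx;

/* The field name appears once, so source and destination cannot diverge. */
#define DEMO_COPY_TS(dst, src, field)               \
    do {                                            \
        (dst)->field.tv_sec = (src)->field.tv_sec;  \
        (dst)->field.tv_nsec = (src)->field.tv_nsec; \
    } while (0)

static void demo_copy_timestamps(const demo_statx *host, demo_statx *target)
{
    DEMO_COPY_TS(target, host, atime);
    DEMO_COPY_TS(target, host, btime);
    DEMO_COPY_TS(target, host, ctime);
    DEMO_COPY_TS(target, host, mtime);
}
```

Whether such a helper macro would fit the style of linux-user/syscall.c is a separate question; the patch above keeps the explicit per-field calls.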

[RFC PATCH-for-5.0] hw/pci-host: Add Kconfig selector for IGD PCIe pass-through

2019-11-22 Thread Philippe Mathieu-Daudé

Introduce a kconfig selector to allow builds without Intel
Integrated Graphics Device GPU PCIe passthrough.
We keep the default as enabled.

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC because to be able to use the Kconfig-generated
"config-devices.h" header we have to move this device
out of $common-obj and build i440fx.o on a per-target
basis, which is not optimal...
---
 hw/pci-host/i440fx.c  | 9 -
 hw/vfio/pci-quirks.c  | 6 ++
 hw/pci-host/Kconfig   | 5 +
 hw/pci-host/Makefile.objs | 2 +-
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
index f27131102d..41e93581f4 100644
--- a/hw/pci-host/i440fx.c
+++ b/hw/pci-host/i440fx.c
@@ -34,6 +34,7 @@
 #include "hw/pci-host/pam.h"
 #include "qapi/visitor.h"
 #include "qemu/error-report.h"
+#include "config-devices.h"
 
 /*
  * I440FX chipset data sheet.
@@ -386,6 +387,8 @@ static const TypeInfo i440fx_info = {
 },
 };
 
+#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
+
 /* IGD Passthrough Host Bridge. */
 typedef struct {
 uint8_t offset;
@@ -470,6 +473,8 @@ static const TypeInfo igd_passthrough_i440fx_info = {
 .class_init= igd_passthrough_i440fx_class_init,
 };
 
+#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
+
 static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
 PCIBus *rootbus)
 {
@@ -514,8 +519,10 @@ static const TypeInfo i440fx_pcihost_info = {
 static void i440fx_register_types(void)
 {
 type_register_static(&i440fx_info);
-type_register_static(&igd_passthrough_i440fx_info);
 type_register_static(&i440fx_pcihost_info);
+#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
+type_register_static(&igd_passthrough_i440fx_info);
+#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
 }
 
 type_init(i440fx_register_types)
diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 136f3a9ad6..858148fa39 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1166,6 +1166,8 @@ static void vfio_probe_rtl8168_bar2_quirk(VFIOPCIDevice 
*vdev, int nr)
 trace_vfio_quirk_rtl8168_probe(vdev->vbasedev.name);
 }
 
+#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
+
 /*
  * Intel IGD support
  *
@@ -1811,6 +1813,8 @@ out:
 g_free(lpc);
 }
 
+#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
+
 /*
  * Common quirk probe entry points.
  */
@@ -1860,7 +1864,9 @@ void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr)
 vfio_probe_nvidia_bar5_quirk(vdev, nr);
 vfio_probe_nvidia_bar0_quirk(vdev, nr);
 vfio_probe_rtl8168_bar2_quirk(vdev, nr);
+#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
 vfio_probe_igd_bar4_quirk(vdev, nr);
+#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
 }
 
 void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig
index b0aa8351c4..0b7539765a 100644
--- a/hw/pci-host/Kconfig
+++ b/hw/pci-host/Kconfig
@@ -1,6 +1,10 @@
 config PAM
 bool
 
+config INTEL_IGD_PASSTHROUGH
+default y
+bool
+
 config PREP_PCI
 bool
 select PCI
@@ -32,6 +36,7 @@ config PCI_I440FX
 bool
 select PCI
 select PAM
+imply INTEL_IGD_PASSTHROUGH
 
 config PCI_EXPRESS_Q35
 bool
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index efd752b766..3c925192dd 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -13,7 +13,7 @@ common-obj-$(CONFIG_VERSATILE_PCI) += versatile.o
 
 common-obj-$(CONFIG_PCI_SABRE) += sabre.o
 common-obj-$(CONFIG_FULONG) += bonito.o
-common-obj-$(CONFIG_PCI_I440FX) += i440fx.o
+obj-$(CONFIG_PCI_I440FX) += i440fx.o
 common-obj-$(CONFIG_PCI_EXPRESS_Q35) += q35.o
 common-obj-$(CONFIG_PCI_EXPRESS_GENERIC_BRIDGE) += gpex.o
 common-obj-$(CONFIG_PCI_EXPRESS_XILINX) += xilinx-pcie.o
-- 
2.21.0
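
As a rough illustration of what the Kconfig switch does at the C level, the sketch below registers an optional type only when a configuration macro is defined. The macro is defined inline so the sketch compiles on its own; in the patch it would come from the generated config-devices.h. The registry and all demo_ names are invented for this example and are not QOM APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a symbol that a generated config header would provide. */
#define DEMO_CONFIG_IGD_PASSTHROUGH 1

#define DEMO_MAX_TYPES 8
static const char *demo_types[DEMO_MAX_TYPES];
static size_t demo_type_count;

/* Append a type name to the toy registry. */
static void demo_type_register(const char *name)
{
    assert(demo_type_count < DEMO_MAX_TYPES);
    demo_types[demo_type_count++] = name;
}

/* Return 1 if the name was registered, 0 otherwise. */
static int demo_type_is_registered(const char *name)
{
    for (size_t i = 0; i < demo_type_count; i++) {
        if (strcmp(demo_types[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

static void demo_register_types(void)
{
    demo_type_register("demo-i440fx");
    demo_type_register("demo-i440fx-pcihost");
#ifdef DEMO_CONFIG_IGD_PASSTHROUGH
    /* Optional device: compiled in only when the selector is enabled. */
    demo_type_register("demo-igd-passthrough-i440fx");
#endif
}
```

Removing the #define drops both the type definition and its registration from the build, which is the effect the patch aims for.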

Re: [PATCH v3 4/8] block: truncate: Don't make backing file data visible

2019-11-22 Thread Eric Blake


On 11/22/19 10:05 AM, Kevin Wolf wrote:

When extending the size of an image that has a backing file larger than
its old size, make sure that the backing file data doesn't become
visible in the guest, but the added area is properly zeroed out.

Consider the following scenario where the overlay is shorter than its
backing file:

 base.qcow2:     AAAAAAAA
 overlay.qcow2:  BBBB

When resizing (extending) overlay.qcow2, the new blocks should not stay
unallocated and make the additional As from base.qcow2 visible like
before this patch, but zeros should be read.

A similar case happens with the various variants of a commit job when an
intermediate file is short (- for unallocated):

 base.qcow2: A-A-AAAA
 mid.qcow2:  BB-B
 top.qcow2:  C--C--C-

After commit top.qcow2 to mid.qcow2, the following happens:

 mid.qcow2:  CB-C00C0 (correct result)
 mid.qcow2:  CB-C--C- (before this fix)

Without the fix, blocks that previously read as zeros on top.qcow2
suddenly turn into A.

Signed-off-by: Kevin Wolf 
---
  block/io.c | 33 +
  1 file changed, 33 insertions(+)


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
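
To make the commit message concrete, here is a toy model of reading through an overlay into its backing file. One character stands for one block, '-' means unallocated, and reads past the end of a layer return '0'. The model shows how growing an overlay without zeroing the new area exposes backing data. DemoImage and the demo_ functions are assumptions for illustration, and the sketch always zeroes the whole new area, which is simpler than what block/io.c needs to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_UNALLOCATED '-'

/* Toy image layer: one character per block. */
typedef struct DemoImage {
    char blocks[16];
    size_t len;                       /* image size in blocks */
    const struct DemoImage *backing;  /* NULL for the bottom layer */
} DemoImage;

/* Read block i as the guest would see it through the chain. */
static char demo_read_block(const DemoImage *img, size_t i)
{
    for (; img; img = img->backing) {
        if (i >= img->len) {
            return '0'; /* beyond this layer's end: reads as zero */
        }
        if (img->blocks[i] != DEMO_UNALLOCATED) {
            return img->blocks[i];
        }
    }
    return '0';
}

/* Grow an image; optionally write explicit zeros into the new area. */
static void demo_truncate(DemoImage *img, size_t new_len, int zero_new_area)
{
    assert(new_len <= sizeof(img->blocks));
    for (size_t i = img->len; i < new_len; i++) {
        img->blocks[i] = zero_new_area ? '0' : DEMO_UNALLOCATED;
    }
    img->len = new_len;
}
```

With an 8-block base full of 'A' and a 4-block overlay full of 'B', block 4 reads as zero before the resize. After growing the overlay without zeroing, the same block reads 'A'; with zeroing it stays '0'.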

Re: [PATCH v2 1/5] s390x: Don't do a normal reset on the initial cpu

2019-11-22 Thread Janosch Frank

On 11/22/19 5:17 PM, Cornelia Huck wrote:
> On Fri, 22 Nov 2019 08:59:58 -0500
> Janosch Frank  wrote:
> 
>> The initiating cpu needs to be reset with an initial reset. While
>> doing a normal reset followed by a initial reset is not wron per-se,
> 
> s/wron per-se/wrong per se/

Ups

> 
>> the Ultravisor will only allow the correct reset to be performed.
> 
> So... the uv has stricter rules than the architecture has in that
> respect?

Yeah, the architecture only cares about the state that the cpu will be
in after the reset. So we can do as many changes to vcpu run as we like,
if at the end we are in the reset state we intended to be in.

The UV guards all resets including the sigp initiated ones.

> 
>>
>> Signed-off-by: Janosch Frank 
>> Reviewed-by: David Hildenbrand 
>> ---
>>  hw/s390x/s390-virtio-ccw.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index d3edeef0ad..c1d1440272 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -348,6 +348,9 @@ static void s390_machine_reset(MachineState *machine)
>>  break;
>>  case S390_RESET_LOAD_NORMAL:
>>  CPU_FOREACH(t) {
>> +if (t == cs) {
>> +continue;
>> +}
>>  run_on_cpu(t, s390_do_cpu_reset, RUN_ON_CPU_NULL);
>>  }
>>  subsystem_reset();
> 
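
A minimal sketch of the control flow in the quoted hunk: every CPU except the initiating one gets a normal reset, and the initiating CPU then gets only the initial reset the commit message calls for. The counters and demo_ names are hypothetical; the real code dispatches the resets with run_on_cpu(), and the exact follow-up steps are not shown in the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Toy CPU that only counts the resets it receives. */
typedef struct {
    int normal_resets;
    int initial_resets;
} DemoCPU;

static void demo_reset_load_normal(DemoCPU *cpus, size_t n, size_t initiator)
{
    assert(initiator < n);
    for (size_t i = 0; i < n; i++) {
        if (i == initiator) {
            continue; /* the initiator must not see a normal reset first */
        }
        cpus[i].normal_resets++;
    }
    /* Assumed follow-up step: the initiator gets the initial reset. */
    cpus[initiator].initial_resets++;
}
```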





[PATCH v6] migration: Support QLIST migration

2019-11-22 Thread Eric Auger

Support QLIST migration using the same principle as QTAILQ:
94869d5c52 ("migration: migrate QTAILQ").

The VMSTATE_QLIST_V macro has the same proto as VMSTATE_QTAILQ_V.
The change mainly resides in QLIST RAW macros: QLIST_RAW_INSERT_HEAD
and QLIST_RAW_REVERSE.

Tests are also provided.

Signed-off-by: Eric Auger 

---

v5 - v6:
- by doing more advanced testing with virtio-iommu migration
  I noticed this was broken. "prev" field was not set properly.
  I improved the tests to manipulate both the next and prev
  fields.
- Removed Peter and Juan's R-b
---
 include/migration/vmstate.h |  21 +
 include/qemu/queue.h|  39 +
 migration/trace-events  |   5 ++
 migration/vmstate-types.c   |  70 +++
 tests/test-vmstate.c| 170 
 5 files changed, 305 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index ac4f46a67d..08683d93c6 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -227,6 +227,7 @@ extern const VMStateInfo vmstate_info_tmp;
 extern const VMStateInfo vmstate_info_bitmap;
 extern const VMStateInfo vmstate_info_qtailq;
 extern const VMStateInfo vmstate_info_gtree;
+extern const VMStateInfo vmstate_info_qlist;
 
 #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0)
 /*
@@ -796,6 +797,26 @@ extern const VMStateInfo vmstate_info_gtree;
 .offset   = offsetof(_state, _field),  
\
 }
 
+/*
+ * For migrating a QLIST
+ * Target QLIST needs be properly initialized.
+ * _type: type of QLIST element
+ * _next: name of QLIST_ENTRY entry field in QLIST element
+ * _vmsd: VMSD for QLIST element
+ * size: size of QLIST element
+ * start: offset of QLIST_ENTRY in QTAILQ element
+ */
+#define VMSTATE_QLIST_V(_field, _state, _version, _vmsd, _type, _next)  \
+{\
+.name = (stringify(_field)), \
+.version_id   = (_version),  \
+.vmsd = &(_vmsd),\
+.size = sizeof(_type),   \
+.info = &vmstate_info_qlist, \
+.offset   = offsetof(_state, _field),\
+.start= offsetof(_type, _next),  \
+}
+
 /* _f : field name
_f_n : num of elements field_name
_n : num of elements
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 4764d93ea3..4d4554a7ce 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -501,4 +501,43 @@ union {
 \
 QTAILQ_RAW_TQH_CIRC(head)->tql_prev = QTAILQ_RAW_TQE_CIRC(elm, entry); 
 \
 } while (/*CONSTCOND*/0)
 
+#define QLIST_RAW_FIRST(head)  
\
+field_at_offset(head, 0, void *)
+
+#define QLIST_RAW_NEXT(elm, entry) 
\
+field_at_offset(elm, entry, void *)
+
+#define QLIST_RAW_PREVIOUS(elm, entry) 
\
+field_at_offset(elm, entry + sizeof(void *), void *)
+
+#define QLIST_RAW_FOREACH(elm, head, entry)
\
+for ((elm) = *QLIST_RAW_FIRST(head);   
\
+ (elm);
\
+ (elm) = *QLIST_RAW_NEXT(elm, entry))
+
+#define QLIST_RAW_INSERT_HEAD(head, elm, entry) do {   
\
+void *first = *QLIST_RAW_FIRST(head);  
\
+*QLIST_RAW_FIRST(head) = elm;  
\
+*QLIST_RAW_PREVIOUS(elm, entry) = QLIST_RAW_FIRST(head);   
\
+if (first) {   
\
+*QLIST_RAW_NEXT(elm, entry) = first;   
\
+*QLIST_RAW_PREVIOUS(first, entry) = QLIST_RAW_NEXT(elm, entry);
\
+} else {   
\
+*QLIST_RAW_NEXT(elm, entry) = NULL;
\
+}  
\
+} while (0)
+
+#define QLIST_RAW_REVERSE(head, elm, entry) do {   
\
+void *iter = *QLIST_RAW_FIRST(head), *prev = NULL, *next;  
\
+while (iter) { 
\
+next = *QLIST_RAW_NEXT(iter, entry);   
\
+*QLIST_RAW_PREVIOUS(iter, entry) = QLIST_RAW_NEXT(next, entry);
\
+*QLIST_RAW_NEXT(iter, entry) = prev;   
\
+prev = iter;
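
The archived patch is cut off above. As a typed companion to the raw macros, the sketch below shows the two operations the changelog mentions: inserting at the head of a list whose elements carry a pointer-to-pointer "prev" link, and reversing the list while keeping both next and prev consistent. DemoNode, DemoList and the demo_ functions are assumptions for illustration and do not reproduce the QLIST_RAW_* macros.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoNode {
    int value;
    struct DemoNode *next;
    struct DemoNode **prev; /* address of the pointer that points at us */
} DemoNode;

typedef struct {
    DemoNode *first;
} DemoList;

static void demo_insert_head(DemoList *l, DemoNode *n)
{
    n->next = l->first;
    if (l->first) {
        l->first->prev = &n->next;
    }
    l->first = n;
    n->prev = &l->first;
}

/* Reverse in place, fixing up both link directions. */
static void demo_reverse(DemoList *l)
{
    DemoNode *iter = l->first, *prev = NULL, *next;

    while (iter) {
        next = iter->next;
        iter->next = prev;
        /* The old successor becomes the predecessor after reversal. */
        iter->prev = next ? &next->next : &l->first;
        prev = iter;
        iter = next;
    }
    l->first = prev;
}

/* Check that every prev link points at the pointer that references it. */
static int demo_list_is_consistent(DemoList *l)
{
    DemoNode **link = &l->first;

    for (DemoNode *n = l->first; n; n = n->next) {
        if (n->prev != link) {
            return 0;
        }
        link = &n->next;
    }
    return 1;
}
```

Loading elements by head insertion yields them in reverse order, so a final reversal restores the order in which they were saved. The consistency check is the kind of test that would catch an unset prev field.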

Re: [PATCH v35 00/13] QEMU AVR 8 bit cores

2019-11-22 Thread Aleksandar Markovic

On Tue, Oct 29, 2019 at 10:25 PM Michael Rolnik  wrote:
>
> This series of patches adds 8bit AVR cores to QEMU.
> All instructions, except BREAK/DES/SPM/SPMX, are implemented. Not fully tested
> yet.
> However, I was able to execute simple code with functions, e.g. a Fibonacci
> calculation.
> This series of patches includes a non-real, sample board.
> No fuses support yet. PC is set to 0 at reset.
>
> the patches include the following
> 1. just a basic 8bit AVR CPU, without instruction decoding or translation
> 2. CPU features which allow defining the following 8bit AVR cores
>  avr1
>  avr2 avr25
>  avr3 avr31 avr35
>  avr4
>  avr5 avr51
>  avr6
>  xmega2 xmega4 xmega5 xmega6 xmega7
> 3. a definition of sample machine with SRAM, FLASH and CPU which allows to 
> execute simple code
> 4. encoding for all AVR instructions
> 5. interrupt handling
> 6. helpers for IN, OUT, SLEEP, WBR & unsupported instructions
> 7. a decoder which, given an opcode, decides what instruction it is
> 8. translation of AVR instruction into TCG
> 9. all features together
>

Hello Michael.

I noticed an imperfection in your patches: if a patch contains
changes in, let's say, cpu.h and cpu.c, your diff orders the
chunks like this: first the changes to cpu.c, and after that, cpu.h (so,
alphabetically), making the review of the patch a little unergonomic.
Could you please configure scripts/git.orderfile in your dev
directory, so that your git diffs are organized better?

Thanks,
Aleksandar

P.S. The entire script content:

#
# order file for git, to produce patches which are easier to review
# by diffing the important stuff like interface changes first.
#
# one-off usage:
#   git diff -O scripts/git.orderfile ...
#
# add to git config:
#   git config diff.orderFile scripts/git.orderfile
#

# Documentation
docs/*
*.texi

# build system
configure
Makefile*
*.mak

# qapi schema
qapi/*.json
qga/*.json

# headers
*.h

# code
*.c


> changes since v3
> 1. rampD/X/Y/Z registers are encoded as 0x00ff0000 (instead of 0x000000ff)
> for faster address manipulation
> 2. ffs changed to ctz32
> 3. duplicate code removed at avr_cpu_do_interrupt
> 4. using andc instead of not + and
> 5. fixing V flag calculation in various instructions
> 6. freeing local variables in PUSH
> 7. tcg_const_local_i32 -> tcg_const_i32
> 8. using sextract32 instead of my implementation
> 9. fixing BLD instruction
> 10.xor(r) instead of 0xff - r at COM
> 11.fixing MULS/MULSU not to modify inputs' content
> 12.using SUB for NEG
> 13.fixing tcg_gen_qemu_ld/st call in XCH
>
> changes since v4
> 1. target is now defined as big endian in order to optimize push_ret/pop_ret
> 2. all style warnings are fixed
> 3. adding cpu_set/get_sreg functions
> 4. simplifying gen_goto_tb as there is no real paging
> 5. env->pc -> env->pc_w
> 6. making flag dump more compact
> 7. more spacing
> 8. renaming CODE/DATA_INDEX -> MMU_CODE/DATA_IDX
> 9. removing avr_set_feature
> 10. SPL/SPH set bug fix
> 11. switching stb_phys to cpu_stb_data
> 12. cleaning up avr_decode
> 13. saving sreg, rampD/X/Y/Z, eind in HW format (savevm)
> 14. saving CPU features (savevm)
>
> changes since v5
> 1. BLD bug fix
> 2. decoder generator is added
>
> changes since v6
> 1. using cpu_get_sreg/cpu_set_sreg in 
> avr_cpu_gdb_read_register/avr_cpu_gdb_write_register
> 2. configure the target as little endian because otherwise GDB does not work
> 3. fixing and testing gen_push_ret/gen_pop_ret
>
> changes since v7
> 1. folding back v6
> 2. logging in helper_outb and helper_inb is done only for registers that are
> not yet supported
> 3. MAINTAINERS updated
>
> changes since v8
> 1. removing hw/avr from hw/Makefile.obj as it should not be built for all
> 2. making linux compilable
> 3. testing on
> a. Mac, Apple LLVM version 7.0.0
> b. Ubuntu 12.04, gcc 4.9.2
> c. Fedora 23, gcc 5.3.1
> 4. folding back some patches
> 5. translation bug fixes for ORI, CPI, XOR instructions
> 6. proper handling of cpu register writes through memory
>
> changes since v9
> 1. removing forward declarations of static functions
> 2. disabling debug prints
> 3. switching to case range instead of if else if ...
> 4. LD/ST IN/OUT accessing CPU-maintained registers are not routed to any
> device
> 5. comments about sample board and sample IO device added
> 6. sample board description is more descriptive now
> 7. memory_region_allocate_system_memory is used to create RAM
> 8. now there are helper_fullrd & helper_fullwr when LD/ST try to access 
> registers
>
> changes since v10
> 1. moving back fullwr & fullrd into the commit where outb and inb were 
> introduced
> 2. changing tlb_fill function signature
> 3. adding empty line between functions
> 4. adding newline on the last line of the file
> 5. using tb->flags to generate full access ST/LD instructions
> 6. fixing SBRC bug
> 7. folding back 10th commit
> 8. whenever a new file is introduced it's added to Makefile.objs
>
> changes since v11
> 1. updating to

Re: [PATCH v3 3/8] qcow2: Declare BDRV_REQ_NO_FALLBACK supported

2019-11-22 Thread Eric Blake


On 11/22/19 10:05 AM, Kevin Wolf wrote:

In the common case, qcow2_co_pwrite_zeroes() already only modifies
metadata, so we're fine with or without BDRV_REQ_NO_FALLBACK set.

The only exception is when using an external data file, where the
request is passed down to the block driver of the external data file. We
are forwarding the BDRV_REQ_NO_FALLBACK flag there, though, so this is
fine, too.

Declare the flag supported therefore.

Signed-off-by: Kevin Wolf 
---
  block/qcow2.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Eric Blake 



diff --git a/block/qcow2.c b/block/qcow2.c
index b201383c3d..3fa10bf807 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1722,7 +1722,8 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
  }
  }
  
-bs->supported_zero_flags = header.version >= 3 ? BDRV_REQ_MAY_UNMAP : 0;

+bs->supported_zero_flags = header.version >= 3 ?
+   BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK : 0;
  
  /* Repair image if dirty */

  if (!(flags & (BDRV_O_CHECK | BDRV_O_INACTIVE)) && !bs->read_only &&



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
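
The following sketch illustrates what "declaring a flag supported" means for a caller: a driver advertises a mask of zero-write flags, and a request carrying a no-fallback flag is rejected when the driver has not advertised it. The flag values, the DemoDriver type and the -ENOTSUP policy are assumptions for illustration; the real generic block layer logic is not part of this patch and may differ.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values; only the names echo the patch. */
#define DEMO_REQ_MAY_UNMAP   (1u << 0)
#define DEMO_REQ_NO_FALLBACK (1u << 1)

typedef struct {
    int version;
    unsigned int supported_zero_flags;
} DemoDriver;

/* Mirror the patched assignment: v3+ images advertise both flags. */
static void demo_open(DemoDriver *d, int version)
{
    d->version = version;
    d->supported_zero_flags = version >= 3 ?
        DEMO_REQ_MAY_UNMAP | DEMO_REQ_NO_FALLBACK : 0;
}

/* Refuse a no-fallback request when the driver cannot honour it. */
static int demo_write_zeroes(const DemoDriver *d, unsigned int flags)
{
    if ((flags & DEMO_REQ_NO_FALLBACK) &&
        !(d->supported_zero_flags & DEMO_REQ_NO_FALLBACK)) {
        return -ENOTSUP;
    }
    return 0; /* pretend the metadata-only zeroing succeeded */
}
```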

Re: [PATCH v35 10/13] target/avr: Add limited support for USART and 16 bit timer peripherals

2019-11-22 Thread Aleksandar Markovic

On Tue, Oct 29, 2019 at 10:25 PM Michael Rolnik  wrote:
>
> From: Sarah Harris 
>
> These were designed to facilitate testing but should provide enough function 
> to be useful in other contexts.
> Only a subset of the functions of each peripheral is implemented, mainly due 
> to the lack of a standard way to handle electrical connections (like GPIO 
> pins).
>
> Signed-off-by: Sarah Harris 
> ---
>  hw/char/Kconfig|   3 +
>  hw/char/Makefile.objs  |   1 +
>  hw/char/avr_usart.c| 324 ++
>  hw/misc/Kconfig|   3 +
>  hw/misc/Makefile.objs  |   2 +
>  hw/misc/avr_mask.c | 112 ++
>  hw/timer/Kconfig   |   3 +
>  hw/timer/Makefile.objs |   2 +
>  hw/timer/avr_timer16.c | 605 +
>  include/hw/char/avr_usart.h|  97 ++
>  include/hw/misc/avr_mask.h |  47 +++
>  include/hw/timer/avr_timer16.h |  97 ++
>  12 files changed, 1296 insertions(+)
>  create mode 100644 hw/char/avr_usart.c
>  create mode 100644 hw/misc/avr_mask.c
>  create mode 100644 hw/timer/avr_timer16.c
>  create mode 100644 include/hw/char/avr_usart.h
>  create mode 100644 include/hw/misc/avr_mask.h
>  create mode 100644 include/hw/timer/avr_timer16.h
>
> diff --git a/hw/char/Kconfig b/hw/char/Kconfig
> index 40e7a8b8bb..331b20983f 100644
> --- a/hw/char/Kconfig
> +++ b/hw/char/Kconfig
> @@ -46,3 +46,6 @@ config SCLPCONSOLE
>
>  config TERMINAL3270
>  bool
> +
> +config AVR_USART
> +bool
> diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
> index 02d8a66925..f05c1f5667 100644
> --- a/hw/char/Makefile.objs
> +++ b/hw/char/Makefile.objs
> @@ -21,6 +21,7 @@ obj-$(CONFIG_PSERIES) += spapr_vty.o
>  obj-$(CONFIG_DIGIC) += digic-uart.o
>  obj-$(CONFIG_STM32F2XX_USART) += stm32f2xx_usart.o
>  obj-$(CONFIG_RASPI) += bcm2835_aux.o
> +common-obj-$(CONFIG_AVR_USART) += avr_usart.o
>
>  common-obj-$(CONFIG_CMSDK_APB_UART) += cmsdk-apb-uart.o
>  common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
> diff --git a/hw/char/avr_usart.c b/hw/char/avr_usart.c
> new file mode 100644
> index 0000000000..9ca3c2a1cd
> --- /dev/null
> +++ b/hw/char/avr_usart.c
> @@ -0,0 +1,324 @@
> +/*
> + * AVR USART
> + *
> + * Copyright (c) 2018 University of Kent
> + * Author: Sarah Harris
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/char/avr_usart.h"
> +#include "qemu/log.h"
> +#include "hw/irq.h"
> +#include "hw/qdev-properties.h"
> +
> +static int avr_usart_can_receive(void *opaque)
> +{
> +AVRUsartState *usart = opaque;
> +
> +if (usart->data_valid || !(usart->csrb & USART_CSRB_RXEN)) {
> +return 0;
> +}
> +return 1;
> +}
> +
> +static void avr_usart_receive(void *opaque, const uint8_t *buffer, int size)
> +{
> +AVRUsartState *usart = opaque;
> +assert(size == 1);
> +assert(!usart->data_valid);
> +usart->data = buffer[0];
> +usart->data_valid = true;
> +usart->csra |= USART_CSRA_RXC;
> +if (usart->csrb & USART_CSRB_RXCIE) {
> +qemu_set_irq(usart->rxc_irq, 1);
> +}
> +}
> +
> +static void update_char_mask(AVRUsartState *usart)
> +{
> +uint8_t mode = ((usart->csrc & USART_CSRC_CSZ0) ? 1 : 0) |
> +((usart->csrc & USART_CSRC_CSZ1) ? 2 : 0) |
> +((usart->csrb & USART_CSRB_CSZ2) ? 4 : 0);
> +switch (mode) {
> +case 0:
> +usart->char_mask = 0b11111;
> +break;
> +case 1:
> +usart->char_mask = 0b111111;
> +break;
> +case 2:
> +usart->char_mask = 0b1111111;
> +break;
> +case 3:
> +usart->char_mask = 0b11111111;
> +break;
> +case 4:
> +/* Fallthrough. */
> +case 5:
> +/* Fallthrough. */
> +case 6:
> +qemu_log_mask(
> +LOG_GUEST_ERROR,
> +"%s: Reserved character size
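
The archived message is cut off above. As a side note on the character-size switch, the four supported modes map to 5, 6, 7 and 8 data bits, so the mask can also be computed rather than listed. The helper below is a sketch under that assumption; demo_char_mask is a made-up name, and returning 0 for every mode above 3 is a simplification rather than the behaviour of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a character-size selector (0..3) to a data mask of 5..8 bits.
 * Any other selector yields 0 in this sketch.
 */
static uint8_t demo_char_mask(unsigned int mode)
{
    if (mode > 3) {
        return 0;
    }
    return (uint8_t)((1u << (mode + 5)) - 1);
}
```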

Re: [RFC PATCH-for-5.0] hw/pci-host: Add Kconfig selector for IGD PCIe pass-through

2019-11-22 Thread Alex Williamson

On Fri, 22 Nov 2019 18:22:01 +0100
Philippe Mathieu-Daudé  wrote:

> Introduce a kconfig selector to allow builds without Intel
> Integrated Graphics Device GPU PCIe passthrough.
> We keep the default as enabled.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because to be able to use the Kconfig-generated
> "config-devices.h" header we have to move this device
> out of $common-obj and build i440fx.o on a per-target
> basis, which is not optimal...
> ---
>  hw/pci-host/i440fx.c  | 9 -
>  hw/vfio/pci-quirks.c  | 6 ++
>  hw/pci-host/Kconfig   | 5 +
>  hw/pci-host/Makefile.objs | 2 +-
>  4 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
> index f27131102d..41e93581f4 100644
> --- a/hw/pci-host/i440fx.c
> +++ b/hw/pci-host/i440fx.c
> @@ -34,6 +34,7 @@
>  #include "hw/pci-host/pam.h"
>  #include "qapi/visitor.h"
>  #include "qemu/error-report.h"
> +#include "config-devices.h"
>  
>  /*
>   * I440FX chipset data sheet.
> @@ -386,6 +387,8 @@ static const TypeInfo i440fx_info = {
>  },
>  };
>  
> +#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
> +
>  /* IGD Passthrough Host Bridge. */
>  typedef struct {
>  uint8_t offset;
> @@ -470,6 +473,8 @@ static const TypeInfo igd_passthrough_i440fx_info = {
>  .class_init= igd_passthrough_i440fx_class_init,
>  };
>  
> +#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
> +
>  static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
>  PCIBus *rootbus)
>  {
> @@ -514,8 +519,10 @@ static const TypeInfo i440fx_pcihost_info = {
>  static void i440fx_register_types(void)
>  {
>  type_register_static(&i440fx_info);
> -type_register_static(&igd_passthrough_i440fx_info);
>  type_register_static(&i440fx_pcihost_info);
> +#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
> +type_register_static(&igd_passthrough_i440fx_info);
> +#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
>  }
>  

Note that this IGD thing has nothing to do with the one below in vfio
code.  AIUI, the one above is specific to Xen and very unfortunately
named and placed to seem more generic than it is.  vfio IGD
*assignment* (not passthrough) has no dependency on this, so please
don't link them together.  Thanks,

Alex

>  type_init(i440fx_register_types)
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 136f3a9ad6..858148fa39 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1166,6 +1166,8 @@ static void vfio_probe_rtl8168_bar2_quirk(VFIOPCIDevice 
> *vdev, int nr)
>  trace_vfio_quirk_rtl8168_probe(vdev->vbasedev.name);
>  }
>  
> +#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
> +
>  /*
>   * Intel IGD support
>   *
> @@ -1811,6 +1813,8 @@ out:
>  g_free(lpc);
>  }
>  
> +#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
> +
>  /*
>   * Common quirk probe entry points.
>   */
> @@ -1860,7 +1864,9 @@ void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr)
>  vfio_probe_nvidia_bar5_quirk(vdev, nr);
>  vfio_probe_nvidia_bar0_quirk(vdev, nr);
>  vfio_probe_rtl8168_bar2_quirk(vdev, nr);
> +#ifdef CONFIG_INTEL_IGD_PASSTHROUGH
>  vfio_probe_igd_bar4_quirk(vdev, nr);
> +#endif /* CONFIG_INTEL_IGD_PASSTHROUGH */
>  }
>  
>  void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
> diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig
> index b0aa8351c4..0b7539765a 100644
> --- a/hw/pci-host/Kconfig
> +++ b/hw/pci-host/Kconfig
> @@ -1,6 +1,10 @@
>  config PAM
>  bool
>  
> +config INTEL_IGD_PASSTHROUGH
> +default y
> +bool
> +
>  config PREP_PCI
>  bool
>  select PCI
> @@ -32,6 +36,7 @@ config PCI_I440FX
>  bool
>  select PCI
>  select PAM
> +imply INTEL_IGD_PASSTHROUGH
>  
>  config PCI_EXPRESS_Q35
>  bool
> diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
> index efd752b766..3c925192dd 100644
> --- a/hw/pci-host/Makefile.objs
> +++ b/hw/pci-host/Makefile.objs
> @@ -13,7 +13,7 @@ common-obj-$(CONFIG_VERSATILE_PCI) += versatile.o
>  
>  common-obj-$(CONFIG_PCI_SABRE) += sabre.o
>  common-obj-$(CONFIG_FULONG) += bonito.o
> -common-obj-$(CONFIG_PCI_I440FX) += i440fx.o
> +obj-$(CONFIG_PCI_I440FX) += i440fx.o
>  common-obj-$(CONFIG_PCI_EXPRESS_Q35) += q35.o
>  common-obj-$(CONFIG_PCI_EXPRESS_GENERIC_BRIDGE) += gpex.o
>  common-obj-$(CONFIG_PCI_EXPRESS_XILINX) += xilinx-pcie.o

Re: [PATCH v35 01/13] target/avr: Add outward facing interfaces and core CPU logic

2019-11-22 Thread Aleksandar Markovic

> +/* Number of CPU registers */
> +#define NO_CPU_REGISTERS 32
> +/* Number of IO registers accessible by ld/st/in/out */
> +#define NO_IO_REGISTERS 64

Hi again, Michael. :)

May I ask you to do a global rename of these two constants to
CPU_REGISTERS_COUNT / IO_REGISTERS_COUNT, or NUMBER_OF_CPU_REGISTERS
/ NUMBER_OF_IO_REGISTERS, or whatever else you find suitable? The
reason is that "NO_" is visually very confusing: many readers would
get the first impression that it means a negative ("no"), not
"number of" as you surely intend.

Thanks,
Aleksandar

Re: [PATCH 4/4] virtiofsd: Implement blocking posix locks

2019-11-22 Thread Dr. David Alan Gilbert

* Vivek Goyal (vgo...@redhat.com) wrote:
> As of now we don't support fcntl(F_SETLKW) and if we see one, we return
> -EOPNOTSUPP.
> 
> Change that by accepting these requests and returning a reply immediately
> asking caller to wait. Once lock is available, send a notification to
> the waiter indicating lock is available.
> 
> Signed-off-by: Vivek Goyal 
> ---
>  contrib/virtiofsd/fuse_kernel.h|  7 +++
>  contrib/virtiofsd/fuse_lowlevel.c  | 23 +++-
>  contrib/virtiofsd/fuse_lowlevel.h  | 25 
>  contrib/virtiofsd/fuse_virtio.c| 94 --
>  contrib/virtiofsd/passthrough_ll.c | 49 +---
>  5 files changed, 182 insertions(+), 16 deletions(-)
> 
> diff --git a/contrib/virtiofsd/fuse_kernel.h b/contrib/virtiofsd/fuse_kernel.h
> index 2bdc8b1c88..d4d65c5414 100644
> --- a/contrib/virtiofsd/fuse_kernel.h
> +++ b/contrib/virtiofsd/fuse_kernel.h
> @@ -444,6 +444,7 @@ enum fuse_notify_code {
>   FUSE_NOTIFY_STORE = 4,
>   FUSE_NOTIFY_RETRIEVE = 5,
>   FUSE_NOTIFY_DELETE = 6,
> + FUSE_NOTIFY_LOCK = 7,
>   FUSE_NOTIFY_CODE_MAX,
>  };
>  
> @@ -836,6 +837,12 @@ struct fuse_notify_retrieve_in {
>   uint64_t dummy4;
>  };
>  
> +struct fuse_notify_lock_out {
> + uint64_t id;
> + int32_t error;
> + int32_t padding;
> +};
> +
>  /* Device ioctls: */
>  #define FUSE_DEV_IOC_CLONE   _IOR(229, 0, uint32_t)
>  
> diff --git a/contrib/virtiofsd/fuse_lowlevel.c 
> b/contrib/virtiofsd/fuse_lowlevel.c
> index d4a42d9804..f706e440bf 100644
> --- a/contrib/virtiofsd/fuse_lowlevel.c
> +++ b/contrib/virtiofsd/fuse_lowlevel.c
> @@ -183,7 +183,8 @@ int fuse_send_reply_iov_nofree(fuse_req_t req, int error, 
> struct iovec *iov,
>  {
>   struct fuse_out_header out;
>  
> - if (error <= -1000 || error > 0) {
> + /* error = 1 has been used to signal client to wait for notification */
> + if (error <= -1000 || error > 1) {
>   fuse_log(FUSE_LOG_ERR, "fuse: bad error value: %i\n",   error);
>   error = -ERANGE;
>   }
> @@ -291,6 +292,12 @@ int fuse_reply_err(fuse_req_t req, int err)
>   return send_reply(req, -err, NULL, 0);
>  }
>  
> +int fuse_reply_wait(fuse_req_t req)
> +{
> + /* TODO: This is a hack. Fix it */
> + return send_reply(req, 1, NULL, 0);
> +}
> +
>  void fuse_reply_none(fuse_req_t req)
>  {
>   fuse_free_req(req);
> @@ -2207,6 +2214,20 @@ static int send_notify_iov(struct fuse_session *se, 
> int notify_code,
>   return fuse_send_msg(se, NULL, iov, count);
>  }
>  
> +int fuse_lowlevel_notify_lock(struct fuse_session *se, uint64_t req_id,
> +   int32_t error)
> +{
> + struct fuse_notify_lock_out outarg;
> + struct iovec iov[2];
> +
> + outarg.id = req_id;
> + outarg.error = -error;
> +
> + iov[1].iov_base = &outarg;
> + iov[1].iov_len = sizeof(outarg);
> + return send_notify_iov(se, FUSE_NOTIFY_LOCK, iov, 2);
> +}
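
The two-phase flow from the commit message (reply "wait" immediately, notify later) can be sketched as below. This is a minimal illustration, not the virtiofsd code: the struct copies the layout from the fuse_kernel.h hunk, while the function names, the REPLY_WAIT constant and the convention of passing a positive errno are assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Same layout as fuse_notify_lock_out in the quoted hunk. */
struct notify_lock_out {
    uint64_t id;
    int32_t error;
    int32_t padding;
};

/* Hypothetical name for the "please wait" reply value (1 in the patch). */
enum { REPLY_WAIT = 1 };

/* Phase 1: answer a blocking lock request without blocking the server. */
static int reply_to_setlkw(int lock_is_free)
{
    return lock_is_free ? 0 : REPLY_WAIT;
}

/*
 * Phase 2: build the notification sent once the lock attempt finished.
 * Assumption: err is a positive errno value (0 for success) and the wire
 * format carries it negated, as in "outarg.error = -error".
 */
static struct notify_lock_out make_lock_notification(uint64_t req_id, int err)
{
    struct notify_lock_out out;

    memset(&out, 0, sizeof(out)); /* also clears the padding field */
    out.id = req_id;
    out.error = -err;
    return out;
}
```

The request id is what lets the waiter match the later notification to its original request, which is why it travels in both phases.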
> +
>  int fuse_lowlevel_notify_poll(struct fuse_pollhandle *ph)
>  {
>   if (ph != NULL) {
> diff --git a/contrib/virtiofsd/fuse_lowlevel.h 
> b/contrib/virtiofsd/fuse_lowlevel.h
> index e664d2d12d..f0a94683b5 100644
> --- a/contrib/virtiofsd/fuse_lowlevel.h
> +++ b/contrib/virtiofsd/fuse_lowlevel.h
> @@ -1251,6 +1251,22 @@ struct fuse_lowlevel_ops {
>   */
>  int fuse_reply_err(fuse_req_t req, int err);
>  
> +/**
> + * Ask caller to wait for lock.
> + *
> + * Possible requests:
> + *   setlkw
> + *
> + * If caller sends a blocking lock request (setlkw), then reply to caller
> + * that wait for lock to be available. Once lock is available caller will
> + * receive a notification with request's unique id. Notification will
> + * carry info whether lock was successfully obtained or not.
> + *
> + * @param req request handle
> + * @return zero for success, -errno for failure to send reply
> + */
> +int fuse_reply_wait(fuse_req_t req);
> +
>  /**
>   * Don't send reply
>   *
> @@ -1704,6 +1720,15 @@ int fuse_lowlevel_notify_delete(struct fuse_session 
> *se,
>  int fuse_lowlevel_notify_store(struct fuse_session *se, fuse_ino_t ino,
>  off_t offset, struct fuse_bufvec *bufv,
>  enum fuse_buf_copy_flags flags);
> +/**
> + * Notify event related to previous lock request
> + *
> + * @param se the session object
> + * @param req_id the id of the request which requested setlkw
> + * @param error zero for success, -errno for the failure
> + */
> +int fuse_lowlevel_notify_lock(struct fuse_session *se, uint64_t req_id,
> +   int32_t error);
>  
>  /* --- *
>   * Utility functions*
> diff --git a/contrib/virtiofsd/fuse_virtio.c b/contrib/virtiofsd/fuse_virtio.c
> index 982b6ad0bd..98d27e7642 100644
> --- a/contrib/virtiofsd/fuse_virtio.c
> +++ b/contrib/virtiofsd/fuse_virtio.c
> @@ -215,6 +215,81 @@ static void

[PATCH v2 7/9] monitor/hmp: move remaining hmp_block* functions to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 63 +
 monitor/hmp-cmds.c | 64 --
 2 files changed, 63 insertions(+), 64 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index f3d22c7dd3..76951352b1 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -337,3 +337,66 @@ void hmp_snapshot_delete_blkdev_internal(Monitor *mon, 
const QDict *qdict)
true, name, &err);
 hmp_handle_error(mon, &err);
 }
+
+void hmp_block_resize(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+int64_t size = qdict_get_int(qdict, "size");
+Error *err = NULL;
+
+qmp_block_resize(true, device, false, NULL, size, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_block_stream(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+const char *base = qdict_get_try_str(qdict, "base");
+int64_t speed = qdict_get_try_int(qdict, "speed", 0);
+
+qmp_block_stream(true, device, device, base != NULL, base, false, NULL,
+ false, NULL, qdict_haskey(qdict, "speed"), speed, true,
+ BLOCKDEV_ON_ERROR_REPORT, false, false, false, false,
+ &error);
+
+hmp_handle_error(mon, &error);
+}
+
+void hmp_block_passwd(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *password = qdict_get_str(qdict, "password");
+Error *err = NULL;
+
+qmp_block_passwd(true, device, false, NULL, password, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+char *device = (char *) qdict_get_str(qdict, "device");
+BlockIOThrottle throttle = {
+.bps = qdict_get_int(qdict, "bps"),
+.bps_rd = qdict_get_int(qdict, "bps_rd"),
+.bps_wr = qdict_get_int(qdict, "bps_wr"),
+.iops = qdict_get_int(qdict, "iops"),
+.iops_rd = qdict_get_int(qdict, "iops_rd"),
+.iops_wr = qdict_get_int(qdict, "iops_wr"),
+};
+
+/* qmp_block_set_io_throttle has separate parameters for the
+ * (deprecated) block device name and the qdev ID but the HMP
+ * version has only one, so we must decide which one to pass. */
+if (blk_by_name(device)) {
+throttle.has_device = true;
+throttle.device = device;
+} else {
+throttle.has_id = true;
+throttle.id = device;
+}
+
+qmp_block_set_io_throttle(&throttle, &err);
+hmp_handle_error(mon, &err);
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 2acdcd6e1e..8be48e0af6 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1309,16 +1309,6 @@ void hmp_set_link(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
-void hmp_block_passwd(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-const char *password = qdict_get_str(qdict, "password");
-Error *err = NULL;
-
-qmp_block_passwd(true, device, false, NULL, password, &err);
-hmp_handle_error(mon, &err);
-}
-
 void hmp_balloon(Monitor *mon, const QDict *qdict)
 {
 int64_t value = qdict_get_int(qdict, "value");
@@ -1328,17 +1318,6 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
-void hmp_block_resize(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-int64_t size = qdict_get_int(qdict, "size");
-Error *err = NULL;
-
-qmp_block_resize(true, device, false, NULL, size, &err);
-hmp_handle_error(mon, &err);
-}
-
-
 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
 int saved_vm_running  = runstate_is_running();
@@ -1887,49 +1866,6 @@ void hmp_change(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
-void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
-{
-Error *err = NULL;
-char *device = (char *) qdict_get_str(qdict, "device");
-BlockIOThrottle throttle = {
-.bps = qdict_get_int(qdict, "bps"),
-.bps_rd = qdict_get_int(qdict, "bps_rd"),
-.bps_wr = qdict_get_int(qdict, "bps_wr"),
-.iops = qdict_get_int(qdict, "iops"),
-.iops_rd = qdict_get_int(qdict, "iops_rd"),
-.iops_wr = qdict_get_int(qdict, "iops_wr"),
-};
-
-/* qmp_block_set_io_throttle has separate parameters for the
- * (deprecated) block device name and the qdev ID but the HMP
- * version has only one, so we must decide which one to pass. */
-if (blk_by_name(device)) {
-throttle.has_device = true;
-throttle.device = device;
-} else {
-throttle.has_id = true;
-throttle.id = device;
-}
-
-qmp_block_set_io_throttle(&throttle, &err);
-hmp_handle_error(mon, &err);
-}
-
-void hmp_block_stream(Monitor *mon, const

Re: [PATCH v35 01/13] target/avr: Add outward facing interfaces and core CPU logic

2019-11-22 Thread Aleksandar Markovic

> +#ifndef CONFIG_USER_ONLY
> +/* Set the number of interrupts supported by the CPU. */
> +qdev_init_gpio_in(DEVICE(cpu), avr_cpu_set_int, 57);
> +#endif

Can you please, Michael, explain to me the origin of number "57" here?

Thanks, Aleksandar

[PATCH v2 8/9] monitor/hmp: move hmp_info_block* to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 247 +
 monitor/hmp-cmds.c | 245 
 2 files changed, 247 insertions(+), 245 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 76951352b1..c943dccd03 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -33,6 +33,7 @@
 #include "sysemu/sysemu.h"
 #include "monitor/monitor.h"
 #include "block/block_int.h"
+#include "block/qapi.h"
 #include "qapi/qapi-commands-block.h"
 #include "qapi/qmp/qerror.h"
 #include "monitor/hmp.h"
@@ -400,3 +401,249 @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict 
*qdict)
 qmp_block_set_io_throttle(&throttle, &err);
 hmp_handle_error(mon, &err);
 }
+
+static void print_block_info(Monitor *mon, BlockInfo *info,
+ BlockDeviceInfo *inserted, bool verbose)
+{
+ImageInfo *image_info;
+
+assert(!info || !info->has_inserted || info->inserted == inserted);
+
+if (info && *info->device) {
+monitor_printf(mon, "%s", info->device);
+if (inserted && inserted->has_node_name) {
+monitor_printf(mon, " (%s)", inserted->node_name);
+}
+} else {
+assert(info || inserted);
+monitor_printf(mon, "%s",
+   inserted && inserted->has_node_name ? 
inserted->node_name
+   : info && info->has_qdev ? info->qdev
+   : "");
+}
+
+if (inserted) {
+monitor_printf(mon, ": %s (%s%s%s)\n",
+   inserted->file,
+   inserted->drv,
+   inserted->ro ? ", read-only" : "",
+   inserted->encrypted ? ", encrypted" : "");
+} else {
+monitor_printf(mon, ": [not inserted]\n");
+}
+
+if (info) {
+if (info->has_qdev) {
+monitor_printf(mon, "Attached to:  %s\n", info->qdev);
+}
+if (info->has_io_status && info->io_status != 
BLOCK_DEVICE_IO_STATUS_OK) {
+monitor_printf(mon, "I/O status:   %s\n",
+   BlockDeviceIoStatus_str(info->io_status));
+}
+
+if (info->removable) {
+monitor_printf(mon, "Removable device: %slocked, tray %s\n",
+   info->locked ? "" : "not ",
+   info->tray_open ? "open" : "closed");
+}
+}
+
+
+if (!inserted) {
+return;
+}
+
+monitor_printf(mon, "Cache mode:   %s%s%s\n",
+   inserted->cache->writeback ? "writeback" : "writethrough",
+   inserted->cache->direct ? ", direct" : "",
+   inserted->cache->no_flush ? ", ignore flushes" : "");
+
+if (inserted->has_backing_file) {
+monitor_printf(mon,
+   "Backing file: %s "
+   "(chain depth: %" PRId64 ")\n",
+   inserted->backing_file,
+   inserted->backing_file_depth);
+}
+
+if (inserted->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF) {
+monitor_printf(mon, "Detect zeroes:%s\n",
+BlockdevDetectZeroesOptions_str(inserted->detect_zeroes));
+}
+
+if (inserted->bps  || inserted->bps_rd  || inserted->bps_wr  ||
+inserted->iops || inserted->iops_rd || inserted->iops_wr)
+{
+monitor_printf(mon, "I/O throttling:   bps=%" PRId64
+" bps_rd=%" PRId64  " bps_wr=%" PRId64
+" bps_max=%" PRId64
+" bps_rd_max=%" PRId64
+" bps_wr_max=%" PRId64
+" iops=%" PRId64 " iops_rd=%" PRId64
+" iops_wr=%" PRId64
+" iops_max=%" PRId64
+" iops_rd_max=%" PRId64
+" iops_wr_max=%" PRId64
+" iops_size=%" PRId64
+" group=%s\n",
+inserted->bps,
+inserted->bps_rd,
+inserted->bps_wr,
+inserted->bps_max,
+inserted->bps_rd_max,
+inserted->bps_wr_max,
+inserted->iops,
+inserted->iops_rd,
+inserted->iops_wr,
+inserted->iops_max,
+inserted->iops_rd_max,
+inserted->iops_wr_max,
+inserted->iops_size,
+inserted->group);
+}
+
+if (verbose) {
+monitor_printf(mon, "\nImages:\n");
+image_info = inserted->image;
+while (1) {
+bdrv_image_info_dump(image_info);
+if (image_info->has_backing_image) {
+image_info = image_info->backing_image;
+

[PATCH v2 9/9] monitor/hmp: Prefer to use hmp_handle_error for error reporting in block hmp commands

2019-11-22 Thread Maxim Levitsky

This way they will all be prefixed with 'Error:', which some parsers
(e.g. libvirt) need.

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 35 --
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index c943dccd03..197994716f 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -59,7 +59,6 @@ void hmp_drive_add(Monitor *mon, const QDict *qdict)
 mc = MACHINE_GET_CLASS(current_machine);
 dinfo = drive_new(opts, mc->block_default_type, &err);
 if (err) {
-error_report_err(err);
 qemu_opts_del(opts);
 goto err;
 }
@@ -73,7 +72,7 @@ void hmp_drive_add(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "OK\n");
 break;
 default:
-monitor_printf(mon, "Can't hot-add drive to type %d\n", dinfo->type);
+error_setg(&err, "Can't hot-add drive to type %d", dinfo->type);
 goto err;
 }
 return;
@@ -84,6 +83,7 @@ err:
 monitor_remove_blk(blk);
 blk_unref(blk);
 }
+hmp_handle_error(mon, &err);
 }
 
 void hmp_drive_del(Monitor *mon, const QDict *qdict)
@@ -105,14 +105,14 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
 
 blk = blk_by_name(id);
 if (!blk) {
-error_report("Device '%s' not found", id);
-return;
+error_setg(&local_err, "Device '%s' not found", id);
+goto err;
 }
 
 if (!blk_legacy_dinfo(blk)) {
-error_report("Deleting device added with blockdev-add"
- " is not supported");
-return;
+error_setg(&local_err,
+   "Deleting device added with blockdev-add is not supported");
+goto err;
 }
 
 aio_context = blk_get_aio_context(blk);
@@ -121,9 +121,8 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
 bs = blk_bs(blk);
 if (bs) {
 if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, &local_err)) {
-error_report_err(local_err);
 aio_context_release(aio_context);
-return;
+goto err;
 }
 
 blk_remove_bs(blk);
@@ -144,12 +143,15 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
 }
 
 aio_context_release(aio_context);
+err:
+hmp_handle_error(mon, &local_err);
 }
 
 void hmp_commit(Monitor *mon, const QDict *qdict)
 {
 const char *device = qdict_get_str(qdict, "device");
 BlockBackend *blk;
+Error *local_err = NULL;
 int ret;
 
 if (!strcmp(device, "all")) {
@@ -160,12 +162,12 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
 
 blk = blk_by_name(device);
 if (!blk) {
-error_report("Device '%s' not found", device);
-return;
+error_setg(&local_err, "Device '%s' not found", device);
+goto err;
 }
 if (!blk_is_available(blk)) {
-error_report("Device '%s' has no medium", device);
-return;
+error_setg(&local_err, "Device '%s' has no medium", device);
+goto err;
 }
 
 bs = blk_bs(blk);
@@ -177,8 +179,13 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
 aio_context_release(aio_context);
 }
 if (ret < 0) {
-error_report("'commit' error for '%s': %s", device, strerror(-ret));
+error_setg(&local_err,
+   "'commit' error for '%s': %s", device, strerror(-ret));
+goto err;
 }
+return;
+err:
+hmp_handle_error(mon, &local_err);
 }
 
 void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
-- 
2.17.2
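
The pattern this patch moves towards (record the failure in an error object, then report it at a single exit point that adds the 'Error:' prefix) can be sketched in isolation. This is a simplified stand-in and not QEMU's Error API: SketchError, sketch_error_set() and sketch_handle_error() are hypothetical names, and the handler writes into a buffer instead of a monitor.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, much simplified error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Record a formatted error unless one is already pending. */
static void sketch_error_set(SketchError **errp, const char *fmt, ...)
{
    va_list ap;
    SketchError *e;

    if (!errp || *errp) {
        return;
    }
    e = calloc(1, sizeof(*e));
    if (!e) {
        return;
    }
    va_start(ap, fmt);
    vsnprintf(e->msg, sizeof(e->msg), fmt, ap);
    va_end(ap);
    *errp = e;
}

/*
 * Single reporting point: formats "Error: <msg>\n" into out, then frees and
 * clears the pending error. Returns 1 if an error was reported, 0 otherwise.
 */
static int sketch_handle_error(char *out, size_t outlen, SketchError **errp)
{
    if (!errp || !*errp) {
        if (outlen) {
            out[0] = '\0';
        }
        return 0;
    }
    snprintf(out, outlen, "Error: %s\n", (*errp)->msg);
    free(*errp);
    *errp = NULL;
    return 1;
}
```

Because every failure path ends in the same handler, the prefix is applied uniformly and a parser only has to recognise one output shape.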

Re: [PATCH v2 4/5] s390x: Move clear reset

2019-11-22 Thread Janosch Frank

On 11/22/19 3:30 PM, David Hildenbrand wrote:
> On 22.11.19 15:00, Janosch Frank wrote:
>> Let's also move the clear reset function into the reset handler.
>>
>> Signed-off-by: Janosch Frank 
>> ---
>>  target/s390x/cpu-qom.h |  1 +
>>  target/s390x/cpu.c | 50 --
>>  2 files changed, 10 insertions(+), 41 deletions(-)
>>
>> diff --git a/target/s390x/cpu-qom.h b/target/s390x/cpu-qom.h
>> index 6f0a12042e..dbe5346ec9 100644
>> --- a/target/s390x/cpu-qom.h
>> +++ b/target/s390x/cpu-qom.h
>> @@ -37,6 +37,7 @@ typedef struct S390CPUDef S390CPUDef;
>>  typedef enum cpu_reset_type {
>>  S390_CPU_RESET_NORMAL,
>>  S390_CPU_RESET_INITIAL,
>> +S390_CPU_RESET_CLEAR,
>>  } cpu_reset_type;
>>  
>>  /**
>> diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
>> index 1f423fb676..017181fe4a 100644
>> --- a/target/s390x/cpu.c
>> +++ b/target/s390x/cpu.c
>> @@ -94,6 +94,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type 
>> type)
>>  s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>>  
>>  switch (type) {
>> +case S390_CPU_RESET_CLEAR:
>> +memset(env, 0, offsetof(CPUS390XState, start_initial_reset_fields));
> 
> I think the preferred term in QEMU is "fall through".
> 
>> +/* Fallthrough */
>>  case S390_CPU_RESET_INITIAL:
>>  /* initial reset does not clear everything! */
>>  memset(&env->start_initial_reset_fields, 0,
>> @@ -121,46 +124,6 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type 
>> type)
>>  }
>>  }
>>  
>> -/* CPUClass:reset() */
>> -static void s390_cpu_full_reset(CPUState *s)
>> -{
>> -S390CPU *cpu = S390_CPU(s);
>> -S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
>> -CPUS390XState *env = &cpu->env;
>> -
>> -scc->parent_reset(s);
>> -cpu->env.sigp_order = 0;
>> -s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>> -
>> -memset(env, 0, offsetof(CPUS390XState, end_reset_fields));
>> -
>> -/* architectured initial values for CR 0 and 14 */
>> -env->cregs[0] = CR0_RESET;
>> -env->cregs[14] = CR14_RESET;
>> -
>> -#if defined(CONFIG_USER_ONLY)
>> -/* user mode should always be allowed to use the full FPU */
>> -env->cregs[0] |= CR0_AFP;
>> -if (s390_has_feat(S390_FEAT_VECTOR)) {
>> -env->cregs[0] |= CR0_VECTOR;
>> -}
>> -#endif
> 
> Huh, what happened to that change?

Btw., wouldn't we need that for both initial and clear reset?

> 
> Note that we now also do "env->bpbc = false" - is that ok?
> 
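
For readers following the fall-through discussion, the layering of the three reset types can be sketched on a toy structure. This is only an illustration: SketchState, its field names and the enum are hypothetical, and the real CPUS390XState has far more fields and extra per-type initialisation.

```c
#include <stddef.h>
#include <string.h>

/* Toy state: marker fields split the struct into reset regions. */
typedef struct SketchState {
    int clear_a;              /* zeroed only by a clear reset */
    int start_initial_fields; /* first field zeroed by initial and clear */
    int initial_b;
    int start_normal_fields;  /* first field zeroed by every reset */
    int normal_c;
    int end_reset_fields;     /* from here on, no reset touches anything */
    int preserved;
} SketchState;

enum sketch_reset { SK_RESET_NORMAL, SK_RESET_INITIAL, SK_RESET_CLEAR };

/* Zero the bytes in [from, to), given as offsets into the struct. */
static void sketch_zero_range(SketchState *s, size_t from, size_t to)
{
    memset((char *)s + from, 0, to - from);
}

static void sketch_reset(SketchState *s, enum sketch_reset type)
{
    switch (type) {
    case SK_RESET_CLEAR:
        sketch_zero_range(s, 0, offsetof(SketchState, start_initial_fields));
        /* fall through */
    case SK_RESET_INITIAL:
        sketch_zero_range(s, offsetof(SketchState, start_initial_fields),
                          offsetof(SketchState, start_normal_fields));
        /* fall through */
    case SK_RESET_NORMAL:
        sketch_zero_range(s, offsetof(SketchState, start_normal_fields),
                          offsetof(SketchState, end_reset_fields));
        break;
    }
}
```

Each stronger reset does its own extra clearing and then falls into the weaker one, so anything added to the initial case automatically applies to clear reset as well.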





Re: [PATCH v3 2/8] block: Add no_fallback parameter to bdrv_co_truncate()

2019-11-22 Thread Eric Blake


On 11/22/19 10:05 AM, Kevin Wolf wrote:

This adds a no_fallback parameter to bdrv_co_truncate(), bdrv_truncate()
and blk_truncate() in preparation for a fix that potentially needs to
zero-write the new area. no_fallback will use BDRV_REQ_NO_FALLBACK for
this operation and lets the truncate fail if an efficient zero write
isn't possible.

Only qmp_block_resize() passes true for this parameter because it is a
blocking monitor command, so we don't want to add more potentially slow
I/O operations to it than we already have.

All other users will accept even a slow fallback to avoid failure.

Signed-off-by: Kevin Wolf 
---



+++ b/include/block/block.h
@@ -347,9 +347,10 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState 
*bs,
  void bdrv_refresh_filename(BlockDriverState *bs);
  
  int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,

-  PreallocMode prealloc, Error **errp);
+  PreallocMode prealloc, bool no_fallback,
+  Error **errp);
  int bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
-  PreallocMode prealloc, Error **errp);
+  PreallocMode prealloc, bool no_fallback, Error **errp);
  


New signature, most of the changes are mechanical to pass the new 
parameter...



+++ b/block/io.c
@@ -3313,9 +3313,15 @@ static void bdrv_parent_cb_resize(BlockDriverState *bs)
   * If 'exact' is true, the file must be resized to exactly the given
   * 'offset'.  Otherwise, it is sufficient for the node to be at least
   * 'offset' bytes in length.
+ *
+ * If 'no_fallback' is true, a possibly needed writte_zeroes operation to avoid


write


+ * making a longer backing file visible will use BDRV_REQ_NO_FALLBACK. If the
+ * zero write is necessary and this flag is set, bdrv_co_truncate() will fail
+ * if efficient zero writes cannot be provided.
   */



+++ b/qemu-img.c
@@ -3836,7 +3836,7 @@ static int img_resize(int argc, char **argv)
   * resizing, so pass @exact=true.  It is of no use to report
   * success when the image has not actually been resized.
   */
-ret = blk_truncate(blk, total_size, true, prealloc, &err);
+ret = blk_truncate(blk, total_size, true, prealloc, false, &err);
  if (!ret) {
  qprintf(quiet, "Image resized.\n");
  } else {


Hmm - thought for a future patch (not this one): are there situations 
where it may be faster to perform bulk pre-zeroing of the tail of a file 
by performing two truncates (smaller and then larger) because we know 
that just-added bytes from a truncate will read as zero?  This may be 
true for some file systems (but is not true for block devices, nor for 
things like NBD that lack resize).  Anyway, unrelated to this patch.
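
The shrink-then-grow idea can be tried on a plain POSIX file, where ftruncate() makes the extended area appear zero-filled. The sketch below is only a demonstration of that property on a temporary regular file; the function name is hypothetical, nothing here uses QEMU's block layer, and it says nothing about whether this would be faster in practice.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write 64 non-zero bytes, truncate to 16, grow back to 64, then check that
 * the first 16 bytes survived and the re-added tail reads as zero.
 * Returns 1 if so, 0 if not, -1 on an I/O error.
 */
static int tail_is_zero_after_retruncate(void)
{
    unsigned char buf[64];
    size_t i;
    int fd, ok = 1;
    FILE *f = tmpfile();

    if (!f) {
        return -1;
    }
    fd = fileno(f);
    memset(buf, 0xAA, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf) ||
        ftruncate(fd, 16) != 0 ||
        ftruncate(fd, 64) != 0 ||
        pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        fclose(f);
        return -1;
    }
    for (i = 0; i < 16; i++) {
        if (buf[i] != 0xAA) {
            ok = 0;
        }
    }
    for (i = 16; i < sizeof(buf); i++) {
        if (buf[i] != 0) {
            ok = 0;
        }
    }
    fclose(f);
    return ok;
}
```

As noted above, this guarantee is specific to regular files; block devices and protocols without resize give no such promise.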


With the typo fixed,

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[PATCH v2 3/9] monitor/hmp: move hmp_drive_del and hmp_commit to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 97 +-
 blockdev.c | 95 -
 2 files changed, 96 insertions(+), 96 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 21ff6fa9a9..8884618238 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -33,7 +33,7 @@
 #include "sysemu/sysemu.h"
 #include "monitor/monitor.h"
 #include "block/block_int.h"
-
+#include "qapi/qapi-commands-block.h"
 
 void hmp_drive_add(Monitor *mon, const QDict *qdict)
 {
@@ -82,3 +82,98 @@ err:
 blk_unref(blk);
 }
 }
+
+void hmp_drive_del(Monitor *mon, const QDict *qdict)
+{
+const char *id = qdict_get_str(qdict, "id");
+BlockBackend *blk;
+BlockDriverState *bs;
+AioContext *aio_context;
+Error *local_err = NULL;
+
+bs = bdrv_find_node(id);
+if (bs) {
+qmp_blockdev_del(id, &local_err);
+if (local_err) {
+error_report_err(local_err);
+}
+return;
+}
+
+blk = blk_by_name(id);
+if (!blk) {
+error_report("Device '%s' not found", id);
+return;
+}
+
+if (!blk_legacy_dinfo(blk)) {
+error_report("Deleting device added with blockdev-add"
+ " is not supported");
+return;
+}
+
+aio_context = blk_get_aio_context(blk);
+aio_context_acquire(aio_context);
+
+bs = blk_bs(blk);
+if (bs) {
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, &local_err)) {
+error_report_err(local_err);
+aio_context_release(aio_context);
+return;
+}
+
+blk_remove_bs(blk);
+}
+
+/* Make the BlockBackend and the attached BlockDriverState anonymous */
+monitor_remove_blk(blk);
+
+/* If this BlockBackend has a device attached to it, its refcount will be
+ * decremented when the device is removed; otherwise we have to do so here.
+ */
+if (blk_get_attached_dev(blk)) {
+/* Further I/O must not pause the guest */
+blk_set_on_error(blk, BLOCKDEV_ON_ERROR_REPORT,
+ BLOCKDEV_ON_ERROR_REPORT);
+} else {
+blk_unref(blk);
+}
+
+aio_context_release(aio_context);
+}
+
+void hmp_commit(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+BlockBackend *blk;
+int ret;
+
+if (!strcmp(device, "all")) {
+ret = blk_commit_all();
+} else {
+BlockDriverState *bs;
+AioContext *aio_context;
+
+blk = blk_by_name(device);
+if (!blk) {
+error_report("Device '%s' not found", device);
+return;
+}
+if (!blk_is_available(blk)) {
+error_report("Device '%s' has no medium", device);
+return;
+}
+
+bs = blk_bs(blk);
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+
+ret = bdrv_commit(bs);
+
+aio_context_release(aio_context);
+}
+if (ret < 0) {
+error_report("'commit' error for '%s': %s", device, strerror(-ret));
+}
+}
diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..df43e0aaef 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1074,41 +1074,6 @@ static BlockBackend *qmp_get_blk(const char *blk_name, 
const char *qdev_id,
 return blk;
 }
 
-void hmp_commit(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-BlockBackend *blk;
-int ret;
-
-if (!strcmp(device, "all")) {
-ret = blk_commit_all();
-} else {
-BlockDriverState *bs;
-AioContext *aio_context;
-
-blk = blk_by_name(device);
-if (!blk) {
-error_report("Device '%s' not found", device);
-return;
-}
-if (!blk_is_available(blk)) {
-error_report("Device '%s' has no medium", device);
-return;
-}
-
-bs = blk_bs(blk);
-aio_context = bdrv_get_aio_context(bs);
-aio_context_acquire(aio_context);
-
-ret = bdrv_commit(bs);
-
-aio_context_release(aio_context);
-}
-if (ret < 0) {
-error_report("'commit' error for '%s': %s", device, strerror(-ret));
-}
-}
-
 static void blockdev_do_action(TransactionAction *action, Error **errp)
 {
 TransactionActionList list;
@@ -3101,66 +3066,6 @@ BlockDirtyBitmapSha256 
*qmp_x_debug_block_dirty_bitmap_sha256(const char *node,
 return ret;
 }
 
-void hmp_drive_del(Monitor *mon, const QDict *qdict)
-{
-const char *id = qdict_get_str(qdict, "id");
-BlockBackend *blk;
-BlockDriverState *bs;
-AioContext *aio_context;
-Error *local_err = NULL;
-
-bs = bdrv_find_node(id);
-if (bs) {
-qmp_blockdev_del(id, &local_err);
-if (local_err) {
-error_report_err(local_err);
-}
-return;
-}
-
-blk =

Re: [PATCH v35 01/13] target/avr: Add outward facing interfaces and core CPU logic

2019-11-22 Thread Aleksandar Markovic

> +
> +static void avr_avr1_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +}
> +
> +static void avr_avr2_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +}
> +
> +static void avr_avr25_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_LPMX);
> +avr_set_feature(env, AVR_FEATURE_MOVW);
> +}
> +
> +static void avr_avr3_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_JMP_CALL);
> +}
> +
> +static void avr_avr31_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_RAMPZ);
> +avr_set_feature(env, AVR_FEATURE_ELPM);
> +avr_set_feature(env, AVR_FEATURE_JMP_CALL);
> +}
> +
> +static void avr_avr35_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_JMP_CALL);
> +avr_set_feature(env, AVR_FEATURE_LPMX);
> +avr_set_feature(env, AVR_FEATURE_MOVW);
> +}
> +
> +static void avr_avr4_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_LPMX);
> +avr_set_feature(env, AVR_FEATURE_MOVW);
> +avr_set_feature(env, AVR_FEATURE_MUL);
> +}
> +
> +static void avr_avr5_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_JMP_CALL);
> +avr_set_feature(env, AVR_FEATURE_LPMX);
> +avr_set_feature(env, AVR_FEATURE_MOVW);
> +avr_set_feature(env, AVR_FEATURE_MUL);
> +}
> +
> +static void avr_avr51_initfn(Object *obj)
> +{
> +AVRCPU *cpu = AVR_CPU(obj);
> +CPUAVRState *env = &cpu->env;
> +
> +avr_set_feature(env, AVR_FEATURE_LPM);
> +avr_set_feature(env, AVR_FEATURE_IJMP_ICALL);
> +avr_set_feature(env, AVR_FEATURE_ADIW_SBIW);
> +avr_set_feature(env, AVR_FEATURE_SRAM);
> +avr_set_feature(env, AVR_FEATURE_BREAK);
> +
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_PC);
> +avr_set_feature(env, AVR_FEATURE_2_BYTE_SP);
> +avr_set_feature(env, AVR_FEATURE_RAMPZ);
> +avr_set_feature(env,

[PATCH v2 4/9] monitor/hmp: move hmp_drive_mirror and hmp_drive_backup to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 61 ++
 monitor/hmp-cmds.c | 58 
 2 files changed, 61 insertions(+), 58 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 8884618238..5ae899a324 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -34,6 +34,8 @@
 #include "monitor/monitor.h"
 #include "block/block_int.h"
 #include "qapi/qapi-commands-block.h"
+#include "qapi/qmp/qerror.h"
+#include "monitor/hmp.h"
 
 void hmp_drive_add(Monitor *mon, const QDict *qdict)
 {
@@ -177,3 +179,62 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
 error_report("'commit' error for '%s': %s", device, strerror(-ret));
 }
 }
+
+void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
+{
+const char *filename = qdict_get_str(qdict, "target");
+const char *format = qdict_get_try_str(qdict, "format");
+bool reuse = qdict_get_try_bool(qdict, "reuse", false);
+bool full = qdict_get_try_bool(qdict, "full", false);
+Error *err = NULL;
+DriveMirror mirror = {
+.device = (char *)qdict_get_str(qdict, "device"),
+.target = (char *)filename,
+.has_format = !!format,
+.format = (char *)format,
+.sync = full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
+.has_mode = true,
+.mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS,
+.unmap = true,
+};
+
+if (!filename) {
+error_setg(&err, QERR_MISSING_PARAMETER, "target");
+hmp_handle_error(mon, &err);
+return;
+}
+qmp_drive_mirror(&mirror, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_drive_backup(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *filename = qdict_get_str(qdict, "target");
+const char *format = qdict_get_try_str(qdict, "format");
+bool reuse = qdict_get_try_bool(qdict, "reuse", false);
+bool full = qdict_get_try_bool(qdict, "full", false);
+bool compress = qdict_get_try_bool(qdict, "compress", false);
+Error *err = NULL;
+DriveBackup backup = {
+.device = (char *)device,
+.target = (char *)filename,
+.has_format = !!format,
+.format = (char *)format,
+.sync = full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
+.has_mode = true,
+.mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS,
+.has_compress = !!compress,
+.compress = compress,
+};
+
+if (!filename) {
+error_setg(&err, QERR_MISSING_PARAMETER, "target");
+hmp_handle_error(mon, &err);
+return;
+}
+
+qmp_drive_backup(&backup, &err);
+hmp_handle_error(mon, &err);
+}
+
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index b2551c16d1..aa94a15d74 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1338,64 +1338,6 @@ void hmp_block_resize(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
-void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
-{
-const char *filename = qdict_get_str(qdict, "target");
-const char *format = qdict_get_try_str(qdict, "format");
-bool reuse = qdict_get_try_bool(qdict, "reuse", false);
-bool full = qdict_get_try_bool(qdict, "full", false);
-Error *err = NULL;
-DriveMirror mirror = {
-.device = (char *)qdict_get_str(qdict, "device"),
-.target = (char *)filename,
-.has_format = !!format,
-.format = (char *)format,
-.sync = full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
-.has_mode = true,
-.mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS,
-.unmap = true,
-};
-
-if (!filename) {
-error_setg(&err, QERR_MISSING_PARAMETER, "target");
-hmp_handle_error(mon, &err);
-return;
-}
-qmp_drive_mirror(&mirror, &err);
-hmp_handle_error(mon, &err);
-}
-
-void hmp_drive_backup(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-const char *filename = qdict_get_str(qdict, "target");
-const char *format = qdict_get_try_str(qdict, "format");
-bool reuse = qdict_get_try_bool(qdict, "reuse", false);
-bool full = qdict_get_try_bool(qdict, "full", false);
-bool compress = qdict_get_try_bool(qdict, "compress", false);
-Error *err = NULL;
-DriveBackup backup = {
-.device = (char *)device,
-.target = (char *)filename,
-.has_format = !!format,
-.format = (char *)format,
-.sync = full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP,
-.has_mode = true,
-.mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS,
-.has_compress = !!compress,
-.compress = compress,
-};
-
-if (!filename) {
-error_setg(&err, QERR_MISSING_PARAMETER, "target");
-hmp_handle_error(mon, &err);
-

[PATCH v2 5/9] monitor/hmp: move hmp_block_job* to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 52 ++
 monitor/hmp-cmds.c | 52 --
 2 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 5ae899a324..e333de27b1 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -238,3 +238,55 @@ void hmp_drive_backup(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
+
+void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+int64_t value = qdict_get_int(qdict, "speed");
+
+qmp_block_job_set_speed(device, value, &error);
+
+hmp_handle_error(mon, &error);
+}
+
+void hmp_block_job_cancel(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+bool force = qdict_get_try_bool(qdict, "force", false);
+
+qmp_block_job_cancel(device, true, force, &error);
+
+hmp_handle_error(mon, &error);
+}
+
+void hmp_block_job_pause(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+
+qmp_block_job_pause(device, &error);
+
+hmp_handle_error(mon, &error);
+}
+
+void hmp_block_job_resume(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+
+qmp_block_job_resume(device, &error);
+
+hmp_handle_error(mon, &error);
+}
+
+void hmp_block_job_complete(Monitor *mon, const QDict *qdict)
+{
+Error *error = NULL;
+const char *device = qdict_get_str(qdict, "device");
+
+qmp_block_job_complete(device, &error);
+
+hmp_handle_error(mon, &error);
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index aa94a15d74..326276cced 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1976,58 +1976,6 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &error);
 }
 
-void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict)
-{
-Error *error = NULL;
-const char *device = qdict_get_str(qdict, "device");
-int64_t value = qdict_get_int(qdict, "speed");
-
-qmp_block_job_set_speed(device, value, &error);
-
-hmp_handle_error(mon, &error);
-}
-
-void hmp_block_job_cancel(Monitor *mon, const QDict *qdict)
-{
-Error *error = NULL;
-const char *device = qdict_get_str(qdict, "device");
-bool force = qdict_get_try_bool(qdict, "force", false);
-
-qmp_block_job_cancel(device, true, force, &error);
-
-hmp_handle_error(mon, &error);
-}
-
-void hmp_block_job_pause(Monitor *mon, const QDict *qdict)
-{
-Error *error = NULL;
-const char *device = qdict_get_str(qdict, "device");
-
-qmp_block_job_pause(device, &error);
-
-hmp_handle_error(mon, &error);
-}
-
-void hmp_block_job_resume(Monitor *mon, const QDict *qdict)
-{
-Error *error = NULL;
-const char *device = qdict_get_str(qdict, "device");
-
-qmp_block_job_resume(device, &error);
-
-hmp_handle_error(mon, &error);
-}
-
-void hmp_block_job_complete(Monitor *mon, const QDict *qdict)
-{
-Error *error = NULL;
-const char *device = qdict_get_str(qdict, "device");
-
-qmp_block_job_complete(device, &error);
-
-hmp_handle_error(mon, &error);
-}
-
 typedef struct HMPMigrationStatus
 {
 QEMUTimer *timer;
-- 
2.17.2

[PATCH v2 0/9] RFC: [for 5.0]: HMP monitor handlers cleanups

2019-11-22 Thread Maxim Levitsky

This patch series is a bunch of cleanups
to the HMP monitor code.

This series only touches blockdev-related HMP handlers.

No functional changes are expected other than
light error message changes by the last patch.

This was inspired by this bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1719169

Basically some users still parse hmp error messages,
and they would like to have them prefixed with 'Error:'

In commit 66363e9a43f649360a3f74d2805c9f864da027eb we added
hmp_handle_error, which does exactly that, but some hmp handlers
don't use it.

In this patch series, I moved all the block related hmp handlers
into block/monitor/block-hmp-cmds.c, and then made them use this function
to report the errors.

I hope I didn't change too much code, I just felt that if
I touch this code, I can also make it easier to find these
handlers, that were scattered over 3 different files.
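
To illustrate the pattern the series converges on, here is a minimal,
self-contained sketch of a handler that funnels every error through one
reporting helper which adds the 'Error:' prefix. It is not QEMU code:
DemoError, demo_error_set, demo_handle_error and demo_cmd are hypothetical
stand-ins for Error, error_setg, hmp_handle_error and an hmp_* handler, and
the helper writes into a buffer rather than to the monitor so the result is
easy to inspect.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Record an error message, playing the role of error_setg(). */
static void demo_error_set(DemoError **errp, const char *msg)
{
    DemoError *err = calloc(1, sizeof(*err));

    if (!err) {
        return;
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/*
 * Single reporting helper: if an error is set, format it with a fixed
 * "Error: " prefix into 'out', free it and reset the pointer.
 * Returns 1 if an error was reported, 0 otherwise.
 */
static int demo_handle_error(char *out, size_t out_len, DemoError **errp)
{
    if (!*errp) {
        if (out_len) {
            out[0] = '\0';
        }
        return 0;
    }
    snprintf(out, out_len, "Error: %s", (*errp)->msg);
    free(*errp);
    *errp = NULL;
    return 1;
}

/* Handler in the style of the hmp_* commands: validate, then report once. */
static int demo_cmd(const char *target, char *out, size_t out_len)
{
    DemoError *err = NULL;

    if (!target) {
        demo_error_set(&err, "Parameter 'target' is missing");
    }
    return demo_handle_error(out, out_len, &err);
}
```

Because every handler ends in the same helper call, a tool that parses the
output only has to look for one prefix.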

Changes from V1:
   * move the handlers to block/monitor/block-hmp-cmds.c
   * tiny cleanup for the commit messages

Best regards,
Maxim Levitsky

Maxim Levitsky (9):
  monitor/hmp: uninline add_init_drive
  monitor/hmp: rename device-hotplug.c to block/monitor/block-hmp-cmds.c
  monitor/hmp: move hmp_drive_del and hmp_commit to block-hmp-cmds.c
  monitor/hmp: move hmp_drive_mirror and hmp_drive_backup to
block-hmp-cmds.c
  monitor/hmp: move hmp_block_job* to block-hmp-cmds.c
  monitor/hmp: move hmp_snapshot_* to block-hmp-cmds.c
  monitor/hmp: move remaining hmp_block* functions to block-hmp-cmds.c
  monitor/hmp: move hmp_info_block* to block-hmp-cmds.c
  monitor/hmp: Prefer to use hmp_handle_error for error reporting in
block hmp commands

 MAINTAINERS|   1 +
 Makefile.objs  |   2 +-
 block/Makefile.objs|   1 +
 block/monitor/Makefile.objs|   1 +
 block/monitor/block-hmp-cmds.c | 656 +
 blockdev.c |  95 -
 device-hotplug.c   |  91 -
 monitor/hmp-cmds.c | 465 ---
 8 files changed, 660 insertions(+), 652 deletions(-)
 create mode 100644 block/monitor/Makefile.objs
 create mode 100644 block/monitor/block-hmp-cmds.c
 delete mode 100644 device-hotplug.c

-- 
2.17.2

[PATCH v2 2/9] monitor/hmp: rename device-hotplug.c to block/monitor/block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

These days device-hotplug.c only contains hmp_drive_add.
In the next patch, the rest of the hmp_drive* functions will be moved
there.

Signed-off-by: Maxim Levitsky 
---
 MAINTAINERS| 1 +
 Makefile.objs  | 2 +-
 block/Makefile.objs| 1 +
 block/monitor/Makefile.objs| 1 +
 device-hotplug.c => block/monitor/block-hmp-cmds.c | 2 +-
 5 files changed, 5 insertions(+), 2 deletions(-)
 create mode 100644 block/monitor/Makefile.objs
 rename device-hotplug.c => block/monitor/block-hmp-cmds.c (98%)

diff --git a/MAINTAINERS b/MAINTAINERS
index dfb7932608..658c38edf4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1855,6 +1855,7 @@ Block QAPI, monitor, command line
 M: Markus Armbruster 
 S: Supported
 F: blockdev.c
+F: blockdev-hmp-cmds.c
 F: block/qapi.c
 F: qapi/block*.json
 F: qapi/transaction.json
diff --git a/Makefile.objs b/Makefile.objs
index 11ba1a36bd..e83962db96 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -49,7 +49,7 @@ common-obj-y += dump/
 common-obj-y += job-qmp.o
 common-obj-y += monitor/
 common-obj-y += net/
-common-obj-y += qdev-monitor.o device-hotplug.o
+common-obj-y += qdev-monitor.o
 common-obj-$(CONFIG_WIN32) += os-win32.o
 common-obj-$(CONFIG_POSIX) += os-posix.o
 
diff --git a/block/Makefile.objs b/block/Makefile.objs
index e394fe0b6c..c9e35ab66a 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -43,6 +43,7 @@ block-obj-y += crypto.o
 
 block-obj-y += aio_task.o
 block-obj-y += backup-top.o
+common-obj-y += monitor/
 
 common-obj-y += stream.o
 
diff --git a/block/monitor/Makefile.objs b/block/monitor/Makefile.objs
new file mode 100644
index 0000000000..0a74f9a8b5
--- /dev/null
+++ b/block/monitor/Makefile.objs
@@ -0,0 +1 @@
+common-obj-y += block-hmp-cmds.o
diff --git a/device-hotplug.c b/block/monitor/block-hmp-cmds.c
similarity index 98%
rename from device-hotplug.c
rename to block/monitor/block-hmp-cmds.c
index 5ce73f0cff..21ff6fa9a9 100644
--- a/device-hotplug.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1,5 +1,5 @@
 /*
- * QEMU device hotplug helpers
+ * Blockdev HMP commands
  *
  * Copyright (c) 2004 Fabrice Bellard
  *
-- 
2.17.2

Re: [PATCH v2 4/5] s390x: Move clear reset

2019-11-22 Thread Janosch Frank

On 11/22/19 3:30 PM, David Hildenbrand wrote:
> On 22.11.19 15:00, Janosch Frank wrote:
>> Let's also move the clear reset function into the reset handler.
>>
>> Signed-off-by: Janosch Frank 
>> ---
>>  target/s390x/cpu-qom.h |  1 +
>>  target/s390x/cpu.c | 50 --
>>  2 files changed, 10 insertions(+), 41 deletions(-)
>>
>> diff --git a/target/s390x/cpu-qom.h b/target/s390x/cpu-qom.h
>> index 6f0a12042e..dbe5346ec9 100644
>> --- a/target/s390x/cpu-qom.h
>> +++ b/target/s390x/cpu-qom.h
>> @@ -37,6 +37,7 @@ typedef struct S390CPUDef S390CPUDef;
>>  typedef enum cpu_reset_type {
>>  S390_CPU_RESET_NORMAL,
>>  S390_CPU_RESET_INITIAL,
>> +S390_CPU_RESET_CLEAR,
>>  } cpu_reset_type;
>>  
>>  /**
>> diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
>> index 1f423fb676..017181fe4a 100644
>> --- a/target/s390x/cpu.c
>> +++ b/target/s390x/cpu.c
>> @@ -94,6 +94,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type type)
>>  s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>>  
>>  switch (type) {
>> +case S390_CPU_RESET_CLEAR:
>> +memset(env, 0, offsetof(CPUS390XState, start_initial_reset_fields));
> 
> I think the preferred term in QEMU is "fall through".
> 
>> +/* Fallthrough */
>>  case S390_CPU_RESET_INITIAL:
>>  /* initial reset does not clear everything! */
>>  memset(&env->start_initial_reset_fields, 0,
>> @@ -121,46 +124,6 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type type)
>>  }
>>  }
>>  
>> -/* CPUClass:reset() */
>> -static void s390_cpu_full_reset(CPUState *s)
>> -{
>> -S390CPU *cpu = S390_CPU(s);
>> -S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
>> -CPUS390XState *env = &cpu->env;
>> -
>> -scc->parent_reset(s);
>> -cpu->env.sigp_order = 0;
>> -s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
>> -
>> -memset(env, 0, offsetof(CPUS390XState, end_reset_fields));
>> -
>> -/* architectured initial values for CR 0 and 14 */
>> -env->cregs[0] = CR0_RESET;
>> -env->cregs[14] = CR14_RESET;
>> -
>> -#if defined(CONFIG_USER_ONLY)
>> -/* user mode should always be allowed to use the full FPU */
>> -env->cregs[0] |= CR0_AFP;
>> -if (s390_has_feat(S390_FEAT_VECTOR)) {
>> -env->cregs[0] |= CR0_VECTOR;
>> -}
>> -#endif
> 
> Huh, what happened to that change?

Seems like I missed it

> 
> Note that we now also do "env->bpbc = false" - is that ok?

That's OK: clear and initial reset already memset bpbc, but since normal
reset doesn't, we need to set it explicitly there.
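
For readers following along, a small sketch of the cascading reset being
discussed: each stronger reset type clears its own region and then falls
through to the weaker ones. This is only an illustration under assumptions.
DemoState, its field layout, DEMO_CR0_RESET and demo_reset are made up for
the example and do not match CPUS390XState.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    DEMO_RESET_NORMAL,
    DEMO_RESET_INITIAL,
    DEMO_RESET_CLEAR,
} demo_reset_type;

/* Hypothetical architected reset value for the first control register. */
#define DEMO_CR0_RESET 0xE0

/* Field order matters: the memset ranges are derived from offsetof(). */
typedef struct DemoState {
    int gprs[4];   /* cleared by clear reset only */
    int cregs[2];  /* first field of the initial-reset area */
    int bpbc;      /* inside the initial-reset area; normal reset sets it by hand */
    int cpu_num;   /* first field past the reset area: survives every reset */
} DemoState;

static void demo_reset(DemoState *s, demo_reset_type type)
{
    switch (type) {
    case DEMO_RESET_CLEAR:
        /* Clear everything in front of the initial-reset area. */
        memset(s, 0, offsetof(DemoState, cregs));
        /* fall through */
    case DEMO_RESET_INITIAL:
        /* Clear the initial-reset area, then restore architected values. */
        memset((char *)s + offsetof(DemoState, cregs), 0,
               offsetof(DemoState, cpu_num) - offsetof(DemoState, cregs));
        s->cregs[0] = DEMO_CR0_RESET;
        /* fall through */
    case DEMO_RESET_NORMAL:
        /* Normal reset does no memset, so this field is reset explicitly. */
        s->bpbc = 0;
        break;
    }
}
```

The explicit assignment in the last case is redundant for the two stronger
resets, which is exactly the point raised about bpbc above.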





Re: [PATCH 0/5] ppc/pnv: fix Homer/Occ mappings on multichip systems

2019-11-22 Thread Balamuruhan S

On Thu, Nov 21, 2019 at 11:00:12AM +0100, Cédric Le Goater wrote:
> On 21/11/2019 10:11, Balamuruhan S wrote:
> > On Wed, Nov 20, 2019 at 08:46:30AM +0100, Cédric Le Goater wrote:
> >> Hello,
> >>
> >> On 19/11/2019 18:50, Balamuruhan S wrote:
> >>> Hi All,
> >>>
> >>> PowerNV fails to boot in multichip systems due to some misinterpretation
> >>> and mapping in Homer/Occ device models, this patchset fixes the
> >>> following,
> >>>
> >>>  - Homer size is 4MB per chip and Occ common area size is 8MB
> >>>  - Bar masks are used to calculate sizes of Homer/Occ in skiboot so
> >>>return appropriate value
> >>>  - Occ common area is in BAR 3 on Power8 but wrongly mapped to BAR 2
> >>>currently
> >>>  - OCC common area is shared across chips and should be mapped only once
> >>>for multichip systems
> >>
> >> The first thing to address is the HOMER XSCOM region. 
> >>
> >> Introduce an empty skeleton for P8 and P9 with different mem ops handlers
> >> because the registers have a different layout. From there, add the support
> >> for the different PBA* regs and move them out from the default XSCOM
> >> handlers. That should fix most of the current problems and it will provide 
> >> a nice framework for future extensions.
> > 
> > sure, I will work on it.
> > 
> >>
> >> Why not add the associated HOMER MMIO region while we are at it? The PBA
> >> registers have all the definitions we need and it will give us access
> >> to the pstate table.
> > 
> > so, the idea is to have HOMER MMIO for accessing the pstate table / data
> > and HOMER XSCOM for the homer-associated xscom access to the PBA* registers
> > on P8 and P9 respectively.
> 
> yes. 
> 
> >> Second is the OCC region. Do we need a XSCOM *or* a MMIO region ? This is 
> >> not clear. Please check skiboot. I think a MMIO region should be enough
> >> because this is how sensor data from the OCC is accessed.
> > 
> > Okay, I will do the change for OCC to use MMIO, and will check skiboot
> > for making it better.
> > 
> >>
> >> On that topic, we could define properties on the PnvOCC model for each 
> >> sensor and tune the value from the QEMU monitor. It really shouldn't be
> >> too complex.
> > 
> > How can we tune value from QEMU monitor ? This is new to me and will
> > need to check it. I remember you have advised this with the error
> > injection framework patches and Rashmica's patch that provides way to
> > use Qemu monitor to feed data, but I need to do some study.
> 
> 
> See Joel's patch which has a simple example :  
> 
>patchwork.ozlabs.org/patch/1196519
> 
> It simply generates object properties : 
> 
> 
> +for (led = 0; led < s->nr_leds; led++) {
> +char *name;
> +
> +name = g_strdup_printf("led%d", led);
> +object_property_add(obj, name, "bool", pca9552_get_led, pca9552_set_led,
> +NULL, NULL, NULL);
> +}
> 
> with defined get and set accessors. 
> 
> We could do the same for the OCC sensors with a table describing the 
> sensor layout. Accessors would just simply update the table. we could
> even trigger the OCC interrupt if needed.
> 
> This is the initial table :
> 
>   https://github.com/open-power/occ/blob/master/src/occ_405/sensor/sensor_info.c
> 
> Linux should be able to grab the values through hwmon just as on real HW.
> This is the case today for the DTS.

cool...

> 
> >>
> >> Also the same address is used, we should only map it once but we need 
> >> to invent something to know from which chip it is accessed.
> > 
> > This is something need to check how real hardware handles it while
> > accessing shared occ region from different chip and think how to make it
> > for us.
> 
> Yes. I suppose there is some chip id in the powerbus message.

:+1:

> 
> C.
> 
>   
> > 
> > Thanks a lot!
> > 
> > -- Bala
> > 
> >>
> >>
> >> C.
> >>
> >>
> >>>
> >>> Request for your review and suggestions to make it better. I would like to
> >>> thank Cedric for his time and help to figure out the issues.
> >>>
> >>> Balamuruhan S (5):
> >>>   hw/ppc/pnv: incorrect homer and occ common area size
> >>>   hw/ppc/pnv_xscom: PBA bar mask values are incorrect with homer/occ
> >>> sizes
> >>>   hw/ppc/pnv_xscom: Power8 occ common area is in PBA BAR 3
> >>>   hw/ppc/pnv_xscom: occ common area to be mapped only once
> >>>   hw/ppc/pnv_xscom: add PBA BARs for Power8 slw image
> >>>
> >>>  hw/ppc/pnv_occ.c |  2 +-
> >>>  hw/ppc/pnv_xscom.c   | 37 +++--
> >>>  include/hw/ppc/pnv.h | 12 
> >>>  3 files changed, 36 insertions(+), 15 deletions(-)
> >>>
> >>
> > 
>

[PATCH v2 6/9] monitor/hmp: move hmp_snapshot_* to block-hmp-cmds.c

2019-11-22 Thread Maxim Levitsky

Signed-off-by: Maxim Levitsky 
---
 block/monitor/block-hmp-cmds.c | 47 ++
 monitor/hmp-cmds.c | 46 -
 2 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index e333de27b1..f3d22c7dd3 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -290,3 +290,50 @@ void hmp_block_job_complete(Monitor *mon, const QDict *qdict)
 
 hmp_handle_error(mon, &error);
 }
+
+void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *filename = qdict_get_try_str(qdict, "snapshot-file");
+const char *format = qdict_get_try_str(qdict, "format");
+bool reuse = qdict_get_try_bool(qdict, "reuse", false);
+enum NewImageMode mode;
+Error *err = NULL;
+
+if (!filename) {
+/* In the future, if 'snapshot-file' is not specified, the snapshot
+   will be taken internally. Today it's actually required. */
+error_setg(&err, QERR_MISSING_PARAMETER, "snapshot-file");
+hmp_handle_error(mon, &err);
+return;
+}
+
+mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+qmp_blockdev_snapshot_sync(true, device, false, NULL,
+   filename, false, NULL,
+   !!format, format,
+   true, mode, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_snapshot_blkdev_internal(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *name = qdict_get_str(qdict, "name");
+Error *err = NULL;
+
+qmp_blockdev_snapshot_internal_sync(device, name, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_snapshot_delete_blkdev_internal(Monitor *mon, const QDict *qdict)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *name = qdict_get_str(qdict, "name");
+const char *id = qdict_get_try_str(qdict, "id");
+Error *err = NULL;
+
+qmp_blockdev_snapshot_delete_internal_sync(device, !!id, id,
+   true, name, &err);
+hmp_handle_error(mon, &err);
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 326276cced..2acdcd6e1e 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1338,52 +1338,6 @@ void hmp_block_resize(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
-void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-const char *filename = qdict_get_try_str(qdict, "snapshot-file");
-const char *format = qdict_get_try_str(qdict, "format");
-bool reuse = qdict_get_try_bool(qdict, "reuse", false);
-enum NewImageMode mode;
-Error *err = NULL;
-
-if (!filename) {
-/* In the future, if 'snapshot-file' is not specified, the snapshot
-   will be taken internally. Today it's actually required. */
-error_setg(&err, QERR_MISSING_PARAMETER, "snapshot-file");
-hmp_handle_error(mon, &err);
-return;
-}
-
-mode = reuse ? NEW_IMAGE_MODE_EXISTING : NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-qmp_blockdev_snapshot_sync(true, device, false, NULL,
-   filename, false, NULL,
-   !!format, format,
-   true, mode, &err);
-hmp_handle_error(mon, &err);
-}
-
-void hmp_snapshot_blkdev_internal(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-const char *name = qdict_get_str(qdict, "name");
-Error *err = NULL;
-
-qmp_blockdev_snapshot_internal_sync(device, name, &err);
-hmp_handle_error(mon, &err);
-}
-
-void hmp_snapshot_delete_blkdev_internal(Monitor *mon, const QDict *qdict)
-{
-const char *device = qdict_get_str(qdict, "device");
-const char *name = qdict_get_str(qdict, "name");
-const char *id = qdict_get_try_str(qdict, "id");
-Error *err = NULL;
-
-qmp_blockdev_snapshot_delete_internal_sync(device, !!id, id,
-   true, name, &err);
-hmp_handle_error(mon, &err);
-}
 
 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
-- 
2.17.2

[PATCH v2 1/9] monitor/hmp: uninline add_init_drive

2019-11-22 Thread Maxim Levitsky

This is only used by hmp_drive_add.
The code is just a bit shorter this way.

No functional changes

Signed-off-by: Maxim Levitsky 
---
 device-hotplug.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/device-hotplug.c b/device-hotplug.c
index f01d53774b..5ce73f0cff 100644
--- a/device-hotplug.c
+++ b/device-hotplug.c
@@ -34,42 +34,35 @@
 #include "monitor/monitor.h"
 #include "block/block_int.h"
 
-static DriveInfo *add_init_drive(const char *optstr)
+
+void hmp_drive_add(Monitor *mon, const QDict *qdict)
 {
 Error *err = NULL;
-DriveInfo *dinfo;
+DriveInfo *dinfo = NULL;
 QemuOpts *opts;
 MachineClass *mc;
+const char *optstr = qdict_get_str(qdict, "opts");
+bool node = qdict_get_try_bool(qdict, "node", false);
+
+if (node) {
+hmp_drive_add_node(mon, optstr);
+return;
+}
 
 opts = drive_def(optstr);
 if (!opts)
-return NULL;
+return;
 
 mc = MACHINE_GET_CLASS(current_machine);
 dinfo = drive_new(opts, mc->block_default_type, &err);
 if (err) {
 error_report_err(err);
 qemu_opts_del(opts);
-return NULL;
-}
-
-return dinfo;
-}
-
-void hmp_drive_add(Monitor *mon, const QDict *qdict)
-{
-DriveInfo *dinfo = NULL;
-const char *opts = qdict_get_str(qdict, "opts");
-bool node = qdict_get_try_bool(qdict, "node", false);
-
-if (node) {
-hmp_drive_add_node(mon, opts);
-return;
+goto err;
 }
 
-dinfo = add_init_drive(opts);
 if (!dinfo) {
-goto err;
+return;
 }
 
 switch (dinfo->type) {
-- 
2.17.2

Re: [PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays

2019-11-22 Thread Eric Blake


On 11/22/19 10:17 AM, Peter Maydell wrote:
> On Fri, 22 Nov 2019 at 16:08, Kevin Wolf  wrote:
>> See patch 4 for the description of the bug fixed.
>
> I guess my questions for trying to answer the "for-4.2?"
> question in the subject are:
>   1) is this a security (leaking data into the guest) bug ?
>   2) is this a regression?
>   3) is this something a lot of people are likely to run into?

My thoughts (although Kevin's may be more definitive):

1) yes, there is a security aspect: certain resize or commit actions can 
result in the guest seeing a revival of stale data that the guest may 
have thought that it previously scrubbed.  Similarly, the tail end of 
the series proves via iotests that we have an actual case of data 
corruption after a block commit without this patch.


2) no, this is a long-standing bug, we've only recently noticed it

3) no, it is uncommon to have an overlay with a size shorter than its 
backing file (it's not even all that common to have an overlay longer 
than the backing file), so this is a corner case not many people will 
hit.  It's even less common to have the difference in overlay sizes also 
coincide with formats that introduce the speed penalty of a longer 
blocking due to the added zeroing.
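
To make point 1 concrete, here is a toy model of the stale-data problem,
written as a hedged sketch rather than as anything resembling the block
layer. DemoImage, demo_read and demo_resize are hypothetical names; clusters
are single bytes, and the zero_new_area flag stands for the approach of
zeroing the newly exposed range when a short overlay is extended.

```c
#include <stdbool.h>

#define DEMO_MAX 8  /* toy limit on image size, in clusters */

typedef struct DemoImage {
    int size;                        /* virtual size in clusters */
    bool allocated[DEMO_MAX];        /* does this layer own the cluster? */
    char data[DEMO_MAX];             /* one byte of payload per cluster */
    const struct DemoImage *backing; /* next layer down, or NULL */
} DemoImage;

/* Read one cluster as the guest would see it through this layer. */
static char demo_read(const DemoImage *img, int cluster)
{
    if (cluster < 0 || cluster >= img->size) {
        return 0;                    /* past the end of this layer: zeros */
    }
    if (img->allocated[cluster]) {
        return img->data[cluster];
    }
    return img->backing ? demo_read(img->backing, cluster) : 0;
}

/*
 * Grow (or shrink) the layer. With zero_new_area set, the newly exposed
 * clusters are explicitly written as zeros in this layer so that older
 * data in the backing file cannot show through.
 */
static void demo_resize(DemoImage *img, int new_size, bool zero_new_area)
{
    if (new_size > DEMO_MAX) {
        new_size = DEMO_MAX;
    }
    if (zero_new_area) {
        for (int c = img->size; c < new_size; c++) {
            img->allocated[c] = true;
            img->data[c] = 0;
        }
    }
    img->size = new_size;
}
```

With a 4-cluster backing file holding data in cluster 3 and a 2-cluster
overlay, growing the overlay to 4 clusters without zeroing makes the old
backing data visible again; zeroing the new area avoids that at the cost of
extra writes.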




> Eyeballing of the diffstat plus the fact we're on v4 of
> the patchset already makes me a little uneasy about
> putting it into rc3, but if the bug we're fixing matters
> enough we can do it.


In terms of diffstat, the v3 series was much smaller in impact.  Both 
versions add robustness, where the difference between v3 and v4 is 
whether we introduce a speed penalty on an unlikely setup (v3) or reject 
any operation where it would require a speed penalty to avoid data 
problems (v4).  I think all the patches in v3 were reviewed, but I'll go 
ahead and review v4 as well.


Because of point 1, I am leaning towards some version of this patch 
series (whether 3 or 4) making -rc3; but point 2 (it is not a 4.2 
regression) also seems to be a reasonable justification for slipping 
this to 5.0.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH] target/arm: Fix ISR_EL1 tracking when executing at EL2

2019-11-22 Thread Quentin Perret

On Friday 22 Nov 2019 at 13:58:33 (+), Marc Zyngier wrote:
> The ARMv8 ARM states that when executing at EL2, EL3 or Secure EL1,
> ISR_EL1 shows the pending status of the physical IRQ, FIQ, or
> SError interrupts.
> 
> Unfortunately, QEMU's implementation only considers the HCR_EL2
> bits, and ignores the current exception level. This means a hypervisor
> trying to look at its own interrupt state actually sees the guest
> state, which is unexpected and breaks KVM as of Linux 5.3.
> 
> Instead, check for the running EL and return the physical bits
> if not running in a virtualized context.
> 
> Fixes: 636540e9c40b
> Reported-by: Quentin Perret 

And FWIW, Tested-by: Quentin Perret 

Thanks Marc :)
Quentin
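
As a rough illustration of the behaviour described in the commit message,
the sketch below models only the IRQ bit of an ISR read. It is not the QEMU
helper: DemoCpu and demo_isr_read are hypothetical, and the el2_enabled flag
is a simplification standing in for the security-state check in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ISR_I   (1u << 7)  /* IRQ pending bit in the ISR value */
#define DEMO_HCR_IMO (1u << 4)  /* HCR_EL2.IMO: route IRQs to EL2 */

typedef struct DemoCpu {
    int current_el;         /* exception level doing the read */
    bool el2_enabled;       /* simplification: EL2 usable in this state */
    uint32_t hcr_el2;
    bool phys_irq_pending;
    bool virt_irq_pending;
} DemoCpu;

/* Return the IRQ bit as seen by a read at the current exception level. */
static uint32_t demo_isr_read(const DemoCpu *cpu)
{
    /* Only an EL1 guest running under EL2 may see the virtual state. */
    bool allow_virt = cpu->current_el == 1 && cpu->el2_enabled;
    uint32_t ret = 0;

    if (allow_virt && (cpu->hcr_el2 & DEMO_HCR_IMO)) {
        if (cpu->virt_irq_pending) {
            ret |= DEMO_ISR_I;
        }
    } else {
        if (cpu->phys_irq_pending) {
            ret |= DEMO_ISR_I;
        }
    }
    return ret;
}
```

A hypervisor reading at EL2 with IMO set then sees the physical pending
state, while the same read from EL1 sees the virtual one.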

> Signed-off-by: Marc Zyngier 
> ---
>  target/arm/helper.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index a089fb5a69..027fffbff6 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -1934,8 +1934,11 @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
>  CPUState *cs = env_cpu(env);
>  uint64_t hcr_el2 = arm_hcr_el2_eff(env);
>  uint64_t ret = 0;
> +bool allow_virt = (arm_current_el(env) == 1 &&
> +   (!arm_is_secure_below_el3(env) ||
> +(env->cp15.scr_el3 & SCR_EEL2)));
>  
> -if (hcr_el2 & HCR_IMO) {
> +if (allow_virt && (hcr_el2 & HCR_IMO)) {
>  if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
>  ret |= CPSR_I;
>  }
> @@ -1945,7 +1948,7 @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
>  }
>  }
>  
> -if (hcr_el2 & HCR_FMO) {
> +if (allow_virt && (hcr_el2 & HCR_FMO)) {
>  if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
>  ret |= CPSR_F;
>  }
> -- 
> 2.17.1
>

Re: [PATCH 4/6] tests/test-util-filemonitor: Skip test on non-x86 Travis containers

2019-11-22 Thread Alex Bennée



Thomas Huth  writes:

> test-util-filemonitor fails in restricted non-x86 Travis containers
> since they apparently blacklisted some required system calls there.
> Let's simply skip the test if we detect such an environment.
>
> Signed-off-by: Thomas Huth 

Reviewed-by: Alex Bennée 

> ---
>  tests/test-util-filemonitor.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/tests/test-util-filemonitor.c b/tests/test-util-filemonitor.c
> index 301cd2db61..45009c69f4 100644
> --- a/tests/test-util-filemonitor.c
> +++ b/tests/test-util-filemonitor.c
> @@ -406,10 +406,21 @@ test_file_monitor_events(void)
>  char *pathdst = NULL;
>  QFileMonitorTestData data;
>  GHashTable *ids = g_hash_table_new(g_int64_hash, g_int64_equal);
> +char *travis_arch;
>  
>  qemu_mutex_init();
>  data.records = NULL;
>  
> +/*
> + * This test does not work on Travis LXD containers since some
> + * syscalls are blocked in that environment.
> + */
> +travis_arch = getenv("TRAVIS_ARCH");
> +if (travis_arch && !g_str_equal(travis_arch, "x86_64")) {
> +g_test_skip("Test does not work on non-x86 Travis containers.");
> +return;
> +}
> +
>  /*
>   * The file monitor needs the main loop running in
>   * order to receive events from inotify. We must


-- 
Alex Bennée

Re: [PATCH 3/6] tests/hd-geo-test: Skip test when images can not be created

2019-11-22 Thread Alex Bennée



Thomas Huth  writes:

> In certain environments like restricted containers, we can not create
> huge test images. To be able to use "make check" in such container
> environments, too, let's skip the hd-geo-test instead of failing when
> the test images could not be created.
>
> Signed-off-by: Thomas Huth 
> ---
>  tests/hd-geo-test.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c
> index 7e86c5416c..a249800544 100644
> --- a/tests/hd-geo-test.c
> +++ b/tests/hd-geo-test.c
> @@ -34,8 +34,13 @@ static char *create_test_img(int secs)
>  fd = mkstemp(template);
>  g_assert(fd >= 0);
>  ret = ftruncate(fd, (off_t)secs * 512);
> -g_assert(ret == 0);
>  close(fd);
> +
> +if (ret) {
> +free(template);
> +template = NULL;
> +}
> +
>  return template;
>  }
>  
> @@ -934,6 +939,10 @@ int main(int argc, char **argv)
>  for (i = 0; i < backend_last; i++) {
>  if (img_secs[i] >= 0) {
>  img_file_name[i] = create_test_img(img_secs[i]);
> +if (!img_file_name[i]) {
> +g_test_message("Could not create test images.");
> +goto test_add_done;
> +}
>  } else {
>  img_file_name[i] = NULL;
>  }
> @@ -965,6 +974,7 @@ int main(int argc, char **argv)
> "skipping hd-geo/override/* tests");
>  }
>  
> +test_add_done:
>  ret = g_test_run();

It does seem a bit odd to call g_test_run if we have explicitly not set
any up. Personally I'd hoist all the test creation into a new function
so you could do:

  if (setup_images()) {
 setup_tests();
 ret = run_tests();
  } else {
 ret = 0; /* pass if we have no images */
  }

  cleanup_images();

but that's just me going above and beyond to avoid goto's ;-)
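
To make the suggested shape concrete, here is a minimal standalone sketch in C. It is not the hd-geo-test code: N_IMAGES, fake_create_img(), run_tests() and run_suite() are hypothetical stand-ins (for the image list, create_test_img(), g_test_run() and main() respectively), and the simulate_failure flag exists only so that both paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define N_IMAGES 3  /* hypothetical number of test images */

static char *images[N_IMAGES];

/* Hypothetical stand-in for create_test_img(): returns NULL on failure. */
static char *fake_create_img(int idx, bool fail)
{
    char *name;

    if (fail) {
        return NULL;
    }
    name = malloc(32);
    assert(name != NULL);
    snprintf(name, 32, "img-%d", idx);
    return name;
}

/* Create all images; report failure instead of asserting. */
static bool setup_images(bool simulate_failure)
{
    for (int i = 0; i < N_IMAGES; i++) {
        images[i] = fake_create_img(i, simulate_failure && i == 1);
        if (!images[i]) {
            return false;
        }
    }
    return true;
}

/* Safe to call after a partial setup: free(NULL) is a no-op. */
static void cleanup_images(void)
{
    for (int i = 0; i < N_IMAGES; i++) {
        free(images[i]);
        images[i] = NULL;
    }
}

/* Hypothetical stand-in for g_test_run(). */
static int run_tests(void)
{
    return 0;
}

/* The goto-free control flow: tests run only when setup succeeded. */
static int run_suite(bool simulate_failure, bool *ran)
{
    int ret;

    *ran = false;
    if (setup_images(simulate_failure)) {
        ret = run_tests();
        *ran = true;
    } else {
        ret = 0; /* pass if we have no images */
    }
    cleanup_images();
    return ret;
}
```

Cleanup runs on both paths, so a partially created image set is still released when setup fails halfway.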

Reviewed-by: Alex Bennée 

>  
>  for (i = 0; i < backend_last; i++) {


-- 
Alex Bennée

Re: [PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays

2019-11-22 Thread Peter Maydell

On Fri, 22 Nov 2019 at 16:08, Kevin Wolf  wrote:
>
> See patch 4 for the description of the bug fixed.

I guess my questions for trying to answer the "for-4.2?"
question in the subject are:
 1) is this a security (leaking data into the guest) bug ?
 2) is this a regression?
 3) is this something a lot of people are likely to run into?

Eyeballing of the diffstat plus the fact we're on v4 of
the patchset already makes me a little uneasy about
putting it into rc3, but if the bug we're fixing matters
enough we can do it.

thanks
-- PMM

Re: [PATCH] target/arm: Fix ISR_EL1 tracking when executing at EL2

2019-11-22 Thread Richard Henderson

On 11/22/19 2:16 PM, Peter Maydell wrote:
> RTH: vaguely wondering if this might be related to the
> bug you ran into trying to test your VHE emulation
> patchset...

Thanks for the thought.  It might be related, but it isn't the final cause:
the inner guest still does not succeed even with this patch applied.

r~

Re: [PATCH v2 1/5] s390x: Don't do a normal reset on the initial cpu

2019-11-22 Thread Cornelia Huck

On Fri, 22 Nov 2019 08:59:58 -0500
Janosch Frank  wrote:

> The initiating cpu needs to be reset with an initial reset. While
> doing a normal reset followed by a initial reset is not wron per-se,

s/wron per-se/wrong per se/

> the Ultravisor will only allow the correct reset to be performed.

So... the uv has stricter rules than the architecture has in that
respect?

> 
> Signed-off-by: Janosch Frank 
> Reviewed-by: David Hildenbrand 
> ---
>  hw/s390x/s390-virtio-ccw.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index d3edeef0ad..c1d1440272 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -348,6 +348,9 @@ static void s390_machine_reset(MachineState *machine)
>  break;
>  case S390_RESET_LOAD_NORMAL:
>  CPU_FOREACH(t) {
> +if (t == cs) {
> +continue;
> +}
>  run_on_cpu(t, s390_do_cpu_reset, RUN_ON_CPU_NULL);
>  }
>  subsystem_reset();

[PATCH v3 5/8] iotests: Add qemu_io_log()

2019-11-22 Thread Kevin Wolf

Add a function that runs qemu-io and logs the output with the
appropriate filters applied.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/iotests.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index df0708923d..fc78852ae5 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -162,6 +162,11 @@ def qemu_io(*args):
 sys.stderr.write('qemu-io received signal %i: %s\n' % (-exitcode, ' 
'.join(args)))
 return subp.communicate()[0]
 
+def qemu_io_log(*args):
+result = qemu_io(*args)
+log(result, filters=[filter_testfiles, filter_qemu_io])
+return result
+
 def qemu_io_silent(*args):
 '''Run qemu-io and return the exit code, suppressing stdout'''
 args = qemu_io_args + list(args)
-- 
2.20.1

[PATCH v3 7/8] iotests: Support job-complete in run_job()

2019-11-22 Thread Kevin Wolf

Automatically complete jobs that have a 'ready' state and need an
explicit job-complete. Without this, run_job() would hang for such
jobs.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/iotests.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 0ac3ad4b04..b46d298766 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -622,6 +622,8 @@ class VM(qtest.QEMUQtestMachine):
 error = j['error']
 if use_log:
 log('Job failed: %s' % (j['error']))
+elif status == 'ready':
+self.qmp_log('job-complete', id=job)
 elif status == 'pending' and not auto_finalize:
 if pre_finalize:
 pre_finalize()
-- 
2.20.1

Re: [Qemu-devel] [PATCH v4 03/14] qapi: Introduce default values for struct members

2019-11-22 Thread Kevin Wolf

Am 22.11.2019 um 15:40 hat Markus Armbruster geschrieben:
> Kevin Wolf  writes:
> 
> > Am 22.11.2019 um 08:29 hat Markus Armbruster geschrieben:
> >> > At any rate, your counterpoint is taken - whatever we pick, we'll want
> >> > to make sure that introspection can expose semantics, and whether we
> >> > can make '*' redundant with some other form of longhand in the qapi
> >> > language is in part determined by whether we also reflect that through
> >> > introspection.
> >> 
> >> Introspection has the true member name, without the '*' prefix.
> >> 
> >> We'll also want to avoid unnecessary compromises on QAPI schema
> >> expressiveness.  If we use null to mean "schema does not specify
> >> behavior when member is absent", we can't use it to mean "absent member
> >> behaves like the value null".  A bit of a blemish, but I think it's a
> >> tolerable one.
> >
> > If you want an example for an option that defaults to null, take the
> > backing option of BlockdevOptionsGenericCOWFormat.
> >
> > What is the reason for even considering limiting the expressiveness? Do
> > you think that an additional 'optional' bool, at least for those options
> > that don't have a default, would be so bad in the longhand form? Or
> > keeping '*' even in the longhand form, as suggested below.
> 
> Well, one reason is this:
> 
> ##
> # @SchemaInfoObjectMember:
> #
> # An object member.
> #
> # @name: the member's name, as defined in the QAPI schema.
> #
> # @type: the name of the member's type.
> #
> # @default: default when used as command parameter.
> #   If absent, the parameter is mandatory.
> #   If present, the value must be null.  The parameter is
> #   optional, and behavior when it's missing is not specified
> #   here.
> #   Future extension: if present and non-null, the parameter
> #   is optional, and defaults to this value.
> #
> # Since: 2.5
> ##
> 
> If we want to be able to express the difference between "behavior when
> absent is not specified here" and "absent behaves like value null", then
> we need to somehow add that bit of information here.
> 
> Could use a feature.  Features are not yet implemented for members, but
> we need them anyway.

That definition wasn't a great idea, I'm afraid. :-(

But "default is QNull" is still acceptable behaviour for "not
specified" if the client doesn't need to know what the default is.

> >> > If that means that keeping '*' in the longhand form of
> >> > optional members (whether or not those members have a default value),
> >> > then so be it.
> >> 
> >> I believe both
> >> 
> >> '*KEY': { 'type': ARG': 'default': null }
> >> 
> >> and
> >> 
> >> 'KEY': { 'type': ARG': 'default': null }
> >> 
> >> are viable longhand forms for '*KEY': 'ARG'.
> >> 
> >> I prefer the latter, but I'm open to arguments.
> >
> > If you go for the former, then you certainly want to use absent
> > 'default' to indicate no default, and allow a QNull default with
> > 'default': null.
> >
> > The only reason to abuse 'default': null for no default is that you
> > can't distinguish optional and non-optional if you use 'KEY' for both
> > instead of 'KEY' for mandatory and '*KEY' for optional.
> >
> > So while I understand and to some degree share your dislike for the '*'
> > prefix, I think I cast my pragmatic vote for:
> >
> > mandatory:   'KEY':  { 'type': 'ARG' }
> > optional without a default:  '*KEY': { 'type': 'ARG' }
> > optional with QNull default: '*KEY': { 'type': 'ARG', 'default': null }
> 
> The last one could also be 'KEY': { 'type': 'ARG', 'default': null }
> without loss of expressiveness.
> 
> Differently ugly.

Not loss of expressiveness, but loss of consistency.

> Here's yet another idea.  For the "absent is not specified here" case,
> use
> 
> 'KEY': { 'type': 'ARG', optional: true }
> '*KEY': 'ARG'
> 
> For the "absent defaults to DEFVAL" case, use
> 
> 'KEY': { 'type': 'ARG', optional: true, 'default': DEFVAL }
> 'KEY': { 'type': 'ARG', 'default': DEFVAL }

I assume this means: 'optional' defaults to true if 'default' is
present, and to false if 'default' is absent. (It's an example of a
default that can't be expressed in the schema.)

Works for me.
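
A minimal sketch of that rule in C, purely to pin down the semantics. This is not QAPI generator code; struct member_def, enum opt_state and member_is_optional() are hypothetical names, and the treatment of an explicit "optional: false" together with a default is an assumption, since the thread does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Whether the 'optional' key was given in the longhand form. */
enum opt_state { OPT_UNSET = -1, OPT_NO = 0, OPT_YES = 1 };

/* Hypothetical in-memory form of 'KEY': { 'type': ..., ... } */
struct member_def {
    const char *type;          /* value of 'type' */
    enum opt_state optional;   /* OPT_UNSET when the key is missing */
    bool has_default;          /* true when 'default' is present */
    const char *default_json;  /* raw default value, or NULL */
};

/*
 * 'optional' defaults to true if 'default' is present and to false
 * if 'default' is absent. An explicit 'optional' wins (assumption).
 */
static bool member_is_optional(const struct member_def *m)
{
    assert(m != NULL);
    if (m->optional != OPT_UNSET) {
        return m->optional == OPT_YES;
    }
    return m->has_default;
}
```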

Kevin

[PATCH v3 4/8] block: truncate: Don't make backing file data visible

2019-11-22 Thread Kevin Wolf

When extending the size of an image that has a backing file larger than
its old size, make sure that the backing file data doesn't become
visible in the guest, but the added area is properly zeroed out.

Consider the following scenario where the overlay is shorter than its
backing file:

base.qcow2: 
overlay.qcow2:  

When resizing (extending) overlay.qcow2, the new blocks should not stay
unallocated and make the additional As from base.qcow2 visible like
before this patch, but zeros should be read.

A similar case happens with the various variants of a commit job when an
intermediate file is short (- for unallocated):

base.qcow2: A-A-
mid.qcow2:  BB-B
top.qcow2:  C--C--C-

After commit top.qcow2 to mid.qcow2, the following happens:

mid.qcow2:  CB-C00C0 (correct result)
mid.qcow2:  CB-C--C- (before this fix)

Without the fix, blocks that previously read as zeros on top.qcow2
suddenly turn into A.
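
The resize case can be modelled with a small standalone sketch in C. This is a toy model, not QEMU code: struct image, image_read() and image_extend() are hypothetical, clusters are single ints, and the concrete image contents used with it are chosen only for illustration. The zero_fill flag switches between the old behaviour (new area left unallocated) and the fixed one (overlap with the backing file zeroed).

```c
#include <assert.h>
#include <stddef.h>

#define UNALLOC (-1)     /* cluster not allocated in this layer */
#define MAX_CLUSTERS 8   /* toy limit for this model */

struct image {
    int len;                      /* virtual size in clusters */
    int data[MAX_CLUSTERS];       /* UNALLOC or the cluster payload */
    const struct image *backing;  /* NULL when there is no backing file */
};

/* Read one cluster, falling through to the backing chain if needed. */
static int image_read(const struct image *img, int idx)
{
    assert(idx >= 0 && idx < MAX_CLUSTERS);
    if (idx >= img->len) {
        return 0; /* past the end of a (short) layer: reads as zeros */
    }
    if (img->data[idx] != UNALLOC) {
        return img->data[idx];
    }
    return img->backing ? image_read(img->backing, idx) : 0;
}

/*
 * Grow the image. With zero_fill set, the part of the new area that
 * the backing file would otherwise provide data for is zeroed, i.e.
 * MIN(new_len, backing_len) - old_len clusters starting at old_len.
 */
static void image_extend(struct image *img, int new_len, int zero_fill)
{
    int backing_len = img->backing ? img->backing->len : 0;

    assert(new_len >= img->len && new_len <= MAX_CLUSTERS);
    for (int i = img->len; i < new_len; i++) {
        img->data[i] = (zero_fill && i < backing_len) ? 0 : UNALLOC;
    }
    img->len = new_len;
}
```

With an 8-cluster base full of 'A' and a 4-cluster overlay, extending the overlay to 8 clusters without zero_fill makes cluster 5 read as 'A'; with zero_fill it reads as 0.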

Signed-off-by: Kevin Wolf 
---
 block/io.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/block/io.c b/block/io.c
index 42e7558954..61a63d9dc2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3392,12 +3392,45 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset, bool exact,
 ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not refresh total sector count");
+goto fail_refresh_total_sectors;
 } else {
 offset = bs->total_sectors * BDRV_SECTOR_SIZE;
 }
+
+/*
+ * If the image has a backing file that is large enough that it would
+ * provide data for the new area, we cannot leave it unallocated because
+ * then the backing file content would become visible. Instead, zero-fill
+ * the area where backing file and new area overlap.
+ *
+ * Note that if the image has a backing file, but was opened without the
+ * backing file, taking care of keeping things consistent with that backing
+ * file is the user's responsibility.
+ */
+if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
+int64_t backing_len;
+
+backing_len = bdrv_getlength(backing_bs(bs));
+if (backing_len < 0) {
+ret = backing_len;
+goto fail_refresh_total_sectors;
+}
+
+if (backing_len > old_size) {
+ret = bdrv_co_do_pwrite_zeroes(
+bs, old_size, MIN(new_bytes, backing_len - old_size),
+BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP |
+(no_fallback ? BDRV_REQ_NO_FALLBACK : 0));
+if (ret < 0) {
+goto fail_refresh_total_sectors;
+}
+}
+}
+
 /* It's possible that truncation succeeded but refresh_total_sectors
  * failed, but the latter doesn't affect how we should finish the request.
  * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */
+fail_refresh_total_sectors:
 bdrv_co_write_req_finish(child, offset - new_bytes, new_bytes, &req, 0);
 
 out:
-- 
2.20.1

[PATCH v3 6/8] iotests: Fix timeout in run_job()

2019-11-22 Thread Kevin Wolf

run_job() accepts a wait parameter for a timeout, but it doesn't
actually use it. The only thing that is missing is passing it to
events_wait(), so do that now.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/iotests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index fc78852ae5..0ac3ad4b04 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -609,7 +609,7 @@ class VM(qtest.QEMUQtestMachine):
 ]
 error = None
 while True:
-ev = filter_qmp_event(self.events_wait(events))
+ev = filter_qmp_event(self.events_wait(events, timeout=wait))
 if ev['event'] != 'JOB_STATUS_CHANGE':
 if use_log:
 log(ev)
-- 
2.20.1

[PATCH v3 8/8] iotests: Test committing to short backing file

2019-11-22 Thread Kevin Wolf

Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/274| 152 +
 tests/qemu-iotests/274.out| 203 ++
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |   2 +-
 4 files changed, 357 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/274
 create mode 100644 tests/qemu-iotests/274.out

diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
new file mode 100755
index 00..7b238f41da
--- /dev/null
+++ b/tests/qemu-iotests/274
@@ -0,0 +1,152 @@
+#!/usr/bin/env python
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+# Creator/Owner: Kevin Wolf 
+#
+# Some tests for short backing files and short overlays
+
+import iotests
+import os
+
+iotests.verify_image_format(supported_fmts=['qcow2'])
+iotests.verify_platform(['linux'])
+
+size_short = 1 * 1024 * 1024
+size_long = 2 * 1024 * 1024
+size_diff = size_long - size_short
+
+def create_chain():
+iotests.qemu_img_log('create', '-f', iotests.imgfmt, base,
+ str(size_long))
+iotests.qemu_img_log('create', '-f', iotests.imgfmt, '-b', base, mid,
+ str(size_short))
+iotests.qemu_img_log('create', '-f', iotests.imgfmt, '-b', mid, top,
+ str(size_long))
+
+iotests.qemu_io_log('-c', 'write -P 1 0 %d' % size_long, base)
+
+def create_vm():
+vm = iotests.VM()
+vm.add_blockdev('file,filename=%s,node-name=base-file' % (base))
+vm.add_blockdev('%s,file=base-file,node-name=base' % (iotests.imgfmt))
+vm.add_blockdev('file,filename=%s,node-name=mid-file' % (mid))
+vm.add_blockdev('%s,file=mid-file,node-name=mid,backing=base' % 
(iotests.imgfmt))
+vm.add_drive(top, 'backing=mid,node-name=top')
+return vm
+
+with iotests.FilePath('base') as base, \
+ iotests.FilePath('mid') as mid, \
+ iotests.FilePath('top') as top:
+
+iotests.log('== Commit tests ==')
+
+create_chain()
+
+iotests.log('=== Check visible data ===')
+
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, top)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), top)
+
+iotests.log('=== Checking allocation status ===')
+
+iotests.qemu_io_log('-c', 'alloc 0 %d' % size_short,
+'-c', 'alloc %d %d' % (size_short, size_diff),
+base)
+
+iotests.qemu_io_log('-c', 'alloc 0 %d' % size_short,
+'-c', 'alloc %d %d' % (size_short, size_diff),
+mid)
+
+iotests.qemu_io_log('-c', 'alloc 0 %d' % size_short,
+'-c', 'alloc %d %d' % (size_short, size_diff),
+top)
+
+iotests.log('=== Checking map ===')
+
+iotests.qemu_img_log('map', '--output=json', base)
+iotests.qemu_img_log('map', '--output=human', base)
+iotests.qemu_img_log('map', '--output=json', mid)
+iotests.qemu_img_log('map', '--output=human', mid)
+iotests.qemu_img_log('map', '--output=json', top)
+iotests.qemu_img_log('map', '--output=human', top)
+
+iotests.log('=== Testing qemu-img commit (top -> mid) ===')
+
+iotests.qemu_img_log('commit', top)
+iotests.img_info_log(mid)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
+
+iotests.log('=== Testing HMP commit (top -> mid) ===')
+
+create_chain()
+with create_vm() as vm:
+vm.launch()
+vm.qmp_log('human-monitor-command', command_line='commit drive0')
+
+iotests.img_info_log(mid)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
+
+iotests.log('=== Testing QMP active commit (top -> mid) ===')
+
+create_chain()
+with create_vm() as vm:
+vm.launch()
+vm.qmp_log('block-commit', device='top', base_node='mid',
+   job_id='job0', auto_dismiss=False)
+vm.run_job('job0', wait=5)
+
+iotests.img_info_log(mid)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
+
+
+iotests.log('== Resize tests ==')
+
+# Use different sizes for different allocation modes:

[PATCH v3 3/8] qcow2: Declare BDRV_REQ_NO_FALLBACK supported

2019-11-22 Thread Kevin Wolf

In the common case, qcow2_co_pwrite_zeroes() already only modifies
metadata, so we're fine with or without BDRV_REQ_NO_FALLBACK set.

The only exception is when using an external data file, where the
request is passed down to the block driver of the external data file. We
are forwarding the BDRV_REQ_NO_FALLBACK flag there, though, so this is
fine, too.

Therefore, declare the flag as supported.

Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b201383c3d..3fa10bf807 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1722,7 +1722,8 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 }
 }
 
-bs->supported_zero_flags = header.version >= 3 ? BDRV_REQ_MAY_UNMAP : 0;
+bs->supported_zero_flags = header.version >= 3 ?
+   BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK : 0;
 
 /* Repair image if dirty */
 if (!(flags & (BDRV_O_CHECK | BDRV_O_INACTIVE)) && !bs->read_only &&
-- 
2.20.1

[PATCH v3 2/8] block: Add no_fallback parameter to bdrv_co_truncate()

2019-11-22 Thread Kevin Wolf

This adds a no_fallback parameter to bdrv_co_truncate(), bdrv_truncate()
and blk_truncate() in preparation for a fix that potentially needs to
zero-write the new area. no_fallback will use BDRV_REQ_NO_FALLBACK for
this operation and lets the truncate fail if an efficient zero write
isn't possible.

Only qmp_block_resize() passes true for this parameter because it is a
blocking monitor command, so we don't want to add more potentially slow
I/O operations to it than we already have.

All other users will accept even a slow fallback to avoid failure.

Signed-off-by: Kevin Wolf 
---
 include/block/block.h  |  5 +++--
 include/sysemu/block-backend.h |  2 +-
 block/block-backend.c  |  4 ++--
 block/commit.c |  4 ++--
 block/crypto.c |  4 ++--
 block/io.c | 16 
 block/mirror.c |  2 +-
 block/parallels.c  |  6 +++---
 block/qcow.c   |  4 ++--
 block/qcow2-refcount.c |  2 +-
 block/qcow2.c  | 19 +++
 block/qed.c|  2 +-
 block/raw-format.c |  2 +-
 block/vdi.c|  2 +-
 block/vhdx-log.c   |  2 +-
 block/vhdx.c   |  6 +++---
 block/vmdk.c   | 10 ++
 block/vpc.c|  2 +-
 blockdev.c |  2 +-
 qemu-img.c |  2 +-
 qemu-io-cmds.c |  2 +-
 tests/test-block-iothread.c|  6 +++---
 22 files changed, 60 insertions(+), 46 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 1df9848e74..3e44677905 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -347,9 +347,10 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState 
*bs,
 void bdrv_refresh_filename(BlockDriverState *bs);
 
 int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
-  PreallocMode prealloc, Error **errp);
+  PreallocMode prealloc, bool no_fallback,
+  Error **errp);
 int bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
-  PreallocMode prealloc, Error **errp);
+  PreallocMode prealloc, bool no_fallback, Error **errp);
 
 int64_t bdrv_nb_sectors(BlockDriverState *bs);
 int64_t bdrv_getlength(BlockDriverState *bs);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b198deca0b..487b29d13e 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -238,7 +238,7 @@ int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, 
int64_t offset,
 int blk_pwrite_compressed(BlockBackend *blk, int64_t offset, const void *buf,
   int bytes);
 int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
- PreallocMode prealloc, Error **errp);
+ PreallocMode prealloc, bool no_fallback, Error **errp);
 int blk_pdiscard(BlockBackend *blk, int64_t offset, int bytes);
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
  int64_t pos, int size);
diff --git a/block/block-backend.c b/block/block-backend.c
index 8b8f2a80a0..fcc9d60cdb 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2073,14 +2073,14 @@ int blk_pwrite_compressed(BlockBackend *blk, int64_t 
offset, const void *buf,
 }
 
 int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
- PreallocMode prealloc, Error **errp)
+ PreallocMode prealloc, bool no_fallback, Error **errp)
 {
 if (!blk_is_available(blk)) {
 error_setg(errp, "No medium inserted");
 return -ENOMEDIUM;
 }
 
-return bdrv_truncate(blk->root, offset, exact, prealloc, errp);
+return bdrv_truncate(blk->root, offset, exact, prealloc, no_fallback, 
errp);
 }
 
 static void blk_pdiscard_entry(void *opaque)
diff --git a/block/commit.c b/block/commit.c
index 23c90b3b91..f074181d83 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -155,7 +155,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
 }
 
 if (base_len < len) {
-ret = blk_truncate(s->base, len, false, PREALLOC_MODE_OFF, NULL);
+ret = blk_truncate(s->base, len, false, PREALLOC_MODE_OFF, false, 
NULL);
 if (ret) {
 goto out;
 }
@@ -472,7 +472,7 @@ int bdrv_commit(BlockDriverState *bs)
  * we must return an error */
 if (length > backing_length) {
 ret = blk_truncate(backing, length, false, PREALLOC_MODE_OFF,
-   &local_err);
+   false, &local_err);
 if (ret < 0) {
 error_report_err(local_err);
 goto ro_cleanup;
diff --git a/block/crypto.c b/block/crypto.c
index 24823835c1..0f28e8b4e1 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -114,7 +114,7 @@ static ssize_t

[PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays

2019-11-22 Thread Kevin Wolf

See patch 4 for the description of the bug fixed.

v3:
- Don't allow blocking the monitor for a zero write in block_resize
  (even though we can already block for other reasons there). This is
  mainly responsible for the increased complexity compared to v2.
  Personally, I think this is not an improvement over v2, but if this is
  what it takes to fix a corruption issue in 4.2... [Max]
- Don't use huge image files in the test case [Vladimir]

v2:
- Switched order of bs->total_sectors update and zero write [Vladimir]
- Fixed coding style [Vladimir]
- Changed the commit message to contain what was in the cover letter
- Test all preallocation modes
- Test allocation status with qemu-io 'map' [Vladimir]

Kevin Wolf (8):
  block: bdrv_co_do_pwrite_zeroes: 64 bit 'bytes' parameter
  block: Add no_fallback parameter to bdrv_co_truncate()
  qcow2: Declare BDRV_REQ_NO_FALLBACK supported
  block: truncate: Don't make backing file data visible
  iotests: Add qemu_io_log()
  iotests: Fix timeout in run_job()
  iotests: Support job-complete in run_job()
  iotests: Test committing to short backing file

 include/block/block.h  |   5 +-
 include/sysemu/block-backend.h |   2 +-
 block/block-backend.c  |   4 +-
 block/commit.c |   4 +-
 block/crypto.c |   4 +-
 block/io.c |  55 +++--
 block/mirror.c |   2 +-
 block/parallels.c  |   6 +-
 block/qcow.c   |   4 +-
 block/qcow2-refcount.c |   2 +-
 block/qcow2.c  |  22 ++--
 block/qed.c|   2 +-
 block/raw-format.c |   2 +-
 block/vdi.c|   2 +-
 block/vhdx-log.c   |   2 +-
 block/vhdx.c   |   6 +-
 block/vmdk.c   |  10 +-
 block/vpc.c|   2 +-
 blockdev.c |   2 +-
 qemu-img.c |   2 +-
 qemu-io-cmds.c |   2 +-
 tests/test-block-iothread.c|   6 +-
 tests/qemu-iotests/274 | 152 
 tests/qemu-iotests/274.out | 203 +
 tests/qemu-iotests/group   |   1 +
 tests/qemu-iotests/iotests.py  |  11 +-
 26 files changed, 463 insertions(+), 52 deletions(-)
 create mode 100755 tests/qemu-iotests/274
 create mode 100644 tests/qemu-iotests/274.out

-- 
2.20.1

Re: [QUESTION] Usage of '0b' as a prefix for numerical constants?

2019-11-22 Thread Peter Maydell

On Fri, 22 Nov 2019 at 14:57, Aleksandar Markovic
 wrote:
> I remember a while ago, something stopped me from using '0b' as a
> prefix in my own code (was it checkpatch.pl, or perhaps some statement
> on coding style, or a compiler, or something else - I don't really
> remember), so I didn't use it, and used '0x' (hexadecimal constant).
>
> What is really the view of the community on usage of '0b'?

I used to be somewhat against it/uncertain, as I wasn't sure how
widely portable it was (as Eric says, it's a gccism, which isn't
inherently a problem for QEMU code but it makes it a
bit less certain in the general case whether all the versions
of gcc and clang we care about have it). But that was some
time ago, and for 0b... we have plenty of existing use in the
tree so we can be confident that it's portable-enough for us.

I agree with Philippe that whether to prefer a hex constant
or a binary one (or a decimal one, for that matter) is basically
a situational question -- aim for whichever seems to make the
intention clear, be most readable, and match up with whatever
notation the official specification uses, if applicable.

PS: for expressions like
 (((inst >> 22) & 0b111000) | ((inst >> 12) & 0b000111))
it may be preferable to use something like
 (extract32(inst, 25, 3) << 3) | extract32(inst, 12, 3)
rather than raw bitfield manipulation; again this is a
judgement call based on what seems more readable
and perhaps how a specification chooses to phrase it.
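
As a rough standalone check that the two spellings agree, here is a sketch in C. QEMU's real extract32() lives in include/qemu/bitops.h; the version below is a local reimplementation so that the snippet compiles on its own, and field_raw() and field_extract() are hypothetical names. The 0b literals assume GCC or Clang (or a C23 compiler).

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's extract32(value, start, length). */
static uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Raw bitfield manipulation: bits 27:25 land in 5:3, bits 14:12 in 2:0. */
static uint32_t field_raw(uint32_t inst)
{
    return ((inst >> 22) & 0b111000) | ((inst >> 12) & 0b000111);
}

/* The same field, spelled as (start, length) pairs. */
static uint32_t field_extract(uint32_t inst)
{
    return (extract32(inst, 25, 3) << 3) | extract32(inst, 12, 3);
}
```

The second form states each source field's position and width directly, which is easier to compare against an encoding diagram in a specification.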

thanks
-- PMM

[PATCH v3 1/8] block: bdrv_co_do_pwrite_zeroes: 64 bit 'bytes' parameter

2019-11-22 Thread Kevin Wolf

bdrv_co_do_pwrite_zeroes() can already cope with maximum request sizes
by calling the driver in a loop until everything is done. Make the small
remaining change that is necessary to let it accept a 64 bit byte count.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
---
 block/io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index f75777f5ea..003f4ea38c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -42,7 +42,7 @@
 
 static void bdrv_parent_cb_resize(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
-int64_t offset, int bytes, BdrvRequestFlags flags);
+int64_t offset, int64_t bytes, BdrvRequestFlags flags);
 
 static void bdrv_parent_drained_begin(BlockDriverState *bs, BdrvChild *ignore,
   bool ignore_bds_parents)
@@ -1730,7 +1730,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
 }
 
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
-int64_t offset, int bytes, BdrvRequestFlags flags)
+int64_t offset, int64_t bytes, BdrvRequestFlags flags)
 {
 BlockDriver *drv = bs->drv;
 QEMUIOVector qiov;
@@ -1760,7 +1760,7 @@ static int coroutine_fn 
bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 assert(max_write_zeroes >= bs->bl.request_alignment);
 
 while (bytes > 0 && !ret) {
-int num = bytes;
+int num = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
 
 /* Align request.  Block drivers can expect the "bulk" of the request
  * to be aligned, and that unaligned requests do not cross cluster
-- 
2.20.1

Re: [RESEND PATCH v21 3/6] ACPI: Add APEI GHES table generation support

2019-11-22 Thread Beata Michalska

Hi,

On Mon, 18 Nov 2019 at 12:50, gengdongjiu  wrote:
>
> Hi,Igor,
>Thanks for you review and time.
>
> >
> >> +/*
> >> + * Type:
> >> + * Generic Hardware Error Source version 2(GHESv2 - Type 10)
> >> + */
> >> +build_append_int_noprefix(table_data, 
> >> ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
> >> +/*
> >> + * Source Id
> >
> >> + * Once we support more than one hardware error sources, we need to
> >> + * increase the value of this field.
> > I'm not sure ^^^ is correct, according to spec it's just unique id per
> > distinct error structure, so we just assign arbitrary values to each
> > declared source and that never changes once assigned.
> The source id is used to distinguish the error sources: each source has a
> unique 'source id', and different sources have different source ids. For
> example, the 'source id' of error source 0 is 0, and the 'source id' of
> error source 1 is 1.
>

I might be wrong, but the source id is not a sequence number: it can
have any value as long as it is unique, so the comment about
'increasing the number' reads a bit wrong.

>
> >
> > For now I'd make source_id an enum with one member
> >   enum {
> > ACPI_HEST_SRC_ID_SEA = 0,
> > /* future ids go here */
> > ACPI_HEST_SRC_ID_RESERVED,
> >   }
> If we only have one error source, we can use an enum instead of allocating
> a magic 0. But if we have more error sources, say 10, using an enum may
> not be a good idea.
>
> For example, if there are 10 error sources, I can just use the loop below:
>
> for (i = 0; i < 10; i++)
>    build_ghes_v2(source_id++);
>

You can do that, but using an enum makes it more readable and maintainable.
You can also keep the source id as a sequence number but still represent it
with an enum, as has been suggested, and use the 'RESERVED' member for
loop control.
I think it might also be worth representing the HES type as an enum:
enum{
ACPI_HES_TYPE_GHESv2 = 10,

};

> >
> > and use that instead of allocating magic 0 at the beginning of the function.
> >  build_ghes_v2(ACPI_HEST_GHES_SEA);
> > Also add a comment to declaration that already assigned values are not to 
> > be changed
> >
> >> + */
> >> +build_append_int_noprefix(table_data, source_id, 2);
> >> +/* Related Source Id */
> >> +build_append_int_noprefix(table_data, 0x, 2);
> >> +/* Flags */
> >> +build_append_int_noprefix(table_data, 0, 1);
> >> +/* Enabled */
> >> +build_append_int_noprefix(table_data, 1, 1);
> >> +
> >> +/* Number of Records To Pre-allocate */
> >> +build_append_int_noprefix(table_data, 1, 4);
> >> +/* Max Sections Per Record */
> >> +build_append_int_noprefix(table_data, 1, 4);
> >> +/* Max Raw Data Length */
> >> +build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 
> >> 4);
> >> +
> >> +/* Error Status Address */
> >> +build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
> >> + 4 /* QWord access */, 0);
> >> +bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
> >> +ACPI_GHES_ERROR_STATUS_ADDRESS_OFFSET(hest_start, source_id),
> > it's fine only if GHESv2 is the only entries in HEST, but once
> > other types are added this macro will silently fall apart and
> > cause table corruption.
> >
> > Instead of offset from hest_start, I suggest to use offset relative
> > to GAS structure, here is an idea
> >
> > #define GAS_ADDR_OFFSET 4
> >
> > off = table->len
> > build_append_gas()
> > bios_linker_loader_add_pointer(...,
> > off + GAS_ADDR_OFFSET, ...
> I think your suggestion is good.
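
The offset idea above could look roughly like this. It is only a sketch:
struct table and append_gas() are simplified, hypothetical stand-ins for
the GArray and build_append_gas() used in the patch, and 0 is assumed to
denote the system memory address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GAS_ADDR_OFFSET 4   /* Address follows four one-byte GAS fields */
#define GAS_SIZE        12  /* 4 one-byte fields + 8-byte Address */

/* Hypothetical byte buffer standing in for the ACPI table blob. */
struct table {
    uint8_t data[256];
    size_t len;
};

/* Simplified stand-in for build_append_gas(): Address is left zeroed. */
void append_gas(struct table *t, uint8_t space_id, uint8_t bit_width,
                uint8_t bit_offset, uint8_t access_size)
{
    uint8_t *p = t->data + t->len;

    assert(t->len + GAS_SIZE <= sizeof(t->data));
    p[0] = space_id;
    p[1] = bit_width;
    p[2] = bit_offset;
    p[3] = access_size;
    memset(p + GAS_ADDR_OFFSET, 0, 8);
    t->len += GAS_SIZE;
}

/*
 * Append a GAS and return the table offset of its Address field, i.e. the
 * value that would be passed to bios_linker_loader_add_pointer().
 */
size_t append_gas_get_addr_offset(struct table *t)
{
    size_t off = t->len;  /* remember where this GAS starts */

    append_gas(t, 0 /* system memory */, 0x40, 0, 4 /* QWord access */);
    return off + GAS_ADDR_OFFSET;
}
```

Because the offset is taken from the current table length, it stays
correct no matter which entries were appended before this one.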
>
> >
> >> +ACPI_GHES_ADDRESS_SIZE, ACPI_GHES_ERRORS_FW_CFG_FILE,
> >> +source_id * ACPI_GHES_ADDRESS_SIZE);
> >> +
> >> +/*
> >> + * Notification Structure
> >> + * Now only enable ARMv8 SEA notification type
> >> + */
> >> +acpi_ghes_build_notify(table_data, ACPI_GHES_NOTIFY_SEA);
> >> +
> >> +/* Error Status Block Length */
> >> +build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 
> >> 4);
> >> +
> >> +/*
> >> + * Read Ack Register
> >> + * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
> >> + * version 2 (GHESv2 - Type 10)
> >> + */
> >> +build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
> >> + 4 /* QWord access */, 0);
> >> +bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
> >> +ACPI_GHES_READ_ACK_REGISTER_ADDRESS_OFFSET(hest_start, 0),
> > ditto
> >
> >> +ACPI_GHES_ADDRESS_SIZE, ACPI_GHES_ERRORS_FW_CFG_FILE,
> >> +(ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * 
> >> ACPI_GHES_ADDRESS_SIZE);
> >> +
> >> +/*
> >> + * Read Ack Preserve
> >> + * We only provide the first bit in Read Ack Register to OSPM to write
> >> + * while the other bits are preserved.
> >> + */
> >> +build_append_int_noprefix(table_data, ~0x1ULL, 8);
> >> +/*

Re: [RESEND PATCH v21 3/6] ACPI: Add APEI GHES table generation support

2019-11-22 Thread Beata Michalska

Hi Xiang,

On Mon, 11 Nov 2019 at 01:48, Xiang Zheng  wrote:
>
> From: Dongjiu Geng 
>
> This patch implements APEI GHES Table generation via fw_cfg blobs. Now
> it only supports ARMv8 SEA, a type of GHESv2 error source. Afterwards,
> we can extend the supported types if needed. For the CPER section,
> currently it is memory section because kernel mainly wants userspace to
> handle the memory errors.
>
> This patch follows the spec ACPI 6.2 to build the Hardware Error Source
> table. For more detailed information, please refer to document:
> docs/specs/acpi_hest_ghes.rst
>
> Suggested-by: Laszlo Ersek 
> Signed-off-by: Dongjiu Geng 
> Signed-off-by: Xiang Zheng 
> Reviewed-by: Michael S. Tsirkin 
> ---
>  default-configs/arm-softmmu.mak |   1 +
>  hw/acpi/Kconfig |   4 +
>  hw/acpi/Makefile.objs   |   1 +
>  hw/acpi/acpi_ghes.c | 267 
>  hw/acpi/aml-build.c |   2 +
>  hw/arm/virt-acpi-build.c|  12 ++
>  include/hw/acpi/acpi_ghes.h |  56 +++
>  include/hw/acpi/aml-build.h |   1 +
>  8 files changed, 344 insertions(+)
>  create mode 100644 hw/acpi/acpi_ghes.c
>  create mode 100644 include/hw/acpi/acpi_ghes.h
>
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 1f2e0e7fde..5722f3130e 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -40,3 +40,4 @@ CONFIG_FSL_IMX25=y
>  CONFIG_FSL_IMX7=y
>  CONFIG_FSL_IMX6UL=y
>  CONFIG_SEMIHOSTING=y
> +CONFIG_ACPI_APEI=y
> diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> index 12e3f1e86e..ed8c34d238 100644
> --- a/hw/acpi/Kconfig
> +++ b/hw/acpi/Kconfig
> @@ -23,6 +23,10 @@ config ACPI_NVDIMM
>  bool
>  depends on ACPI
>
> +config ACPI_APEI
> +bool
> +depends on ACPI
> +
>  config ACPI_PCI
>  bool
>  depends on ACPI && PCI
> diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
> index 655a9c1973..84474b0ca8 100644
> --- a/hw/acpi/Makefile.objs
> +++ b/hw/acpi/Makefile.objs
> @@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
>  common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
>  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
>  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
> +common-obj-$(CONFIG_ACPI_APEI) += acpi_ghes.o

Minor: The 'acpi' prefix could be dropped - it does not seem to be used
for other files (it is implied by the directory name).
This also applies to most of the naming within this patch.

>  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
>  common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
>  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
> diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c
> new file mode 100644
> index 00..42c00ff3d3
> --- /dev/null
> +++ b/hw/acpi/acpi_ghes.c
> @@ -0,0 +1,267 @@
> +/*
> + * Support for generating APEI tables and recording CPER for Guests
> + *
> + * Copyright (c) 2019 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * Author: Dongjiu Geng 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/aml-build.h"
> +#include "hw/acpi/acpi_ghes.h"
> +#include "hw/nvram/fw_cfg.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu/error-report.h"
> +
> +#define ACPI_GHES_ERRORS_FW_CFG_FILE"etc/hardware_errors"
> +#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
> +
> +/*
> + * The size of Address field in Generic Address Structure.
> + * ACPI 2.0/3.0: 5.2.3.1 Generic Address Structure.
> + */
> +#define ACPI_GHES_ADDRESS_SIZE  8
> +
As already mentioned, you can safely drop this and use sizeof(uint64_t).

> +/* The max size in bytes for one error block */
> +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   0x1000
> +
> +/*
> + * Now only support ARMv8 SEA notification type error source
> + */
> +#define ACPI_GHES_ERROR_SOURCE_COUNT1
> +
> +/*
> + * Generic Hardware Error Source version 2
> + */
> +#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10

Minor: this is actually a type, so it would be good if the name
reflected that somehow.

> +
> +/*
> + * | +--+ 0
> + * | |Header|
> + * | +--+ 40---+-
> + * | | .|  |
> + * | |

Re: [PATCH v35 10/13] target/avr: Add limited support for USART and 16 bit timer peripherals

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 3:41 PM, Aleksandar Markovic wrote:

On Tue, Oct 29, 2019 at 10:25 PM Michael Rolnik  wrote:


From: Sarah Harris 

These were designed to facilitate testing but should provide enough function to 
be useful in other contexts.
Only a subset of the functions of each peripheral is implemented, mainly due to 
the lack of a standard way to handle electrical connections (like GPIO pins).

Signed-off-by: Sarah Harris 
---
  hw/char/Kconfig|   3 +
  hw/char/Makefile.objs  |   1 +
  hw/char/avr_usart.c| 324 ++
  hw/misc/Kconfig|   3 +
  hw/misc/Makefile.objs  |   2 +
  hw/misc/avr_mask.c | 112 ++
  hw/timer/Kconfig   |   3 +
  hw/timer/Makefile.objs |   2 +
  hw/timer/avr_timer16.c | 605 +
  include/hw/char/avr_usart.h|  97 ++
  include/hw/misc/avr_mask.h |  47 +++
  include/hw/timer/avr_timer16.h |  97 ++
  12 files changed, 1296 insertions(+)
  create mode 100644 hw/char/avr_usart.c
  create mode 100644 hw/misc/avr_mask.c
  create mode 100644 hw/timer/avr_timer16.c
  create mode 100644 include/hw/char/avr_usart.h
  create mode 100644 include/hw/misc/avr_mask.h
  create mode 100644 include/hw/timer/avr_timer16.h

diff --git a/hw/char/Kconfig b/hw/char/Kconfig
index 40e7a8b8bb..331b20983f 100644
--- a/hw/char/Kconfig
+++ b/hw/char/Kconfig
@@ -46,3 +46,6 @@ config SCLPCONSOLE

  config TERMINAL3270
  bool
+
+config AVR_USART
+bool
diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index 02d8a66925..f05c1f5667 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -21,6 +21,7 @@ obj-$(CONFIG_PSERIES) += spapr_vty.o
  obj-$(CONFIG_DIGIC) += digic-uart.o
  obj-$(CONFIG_STM32F2XX_USART) += stm32f2xx_usart.o
  obj-$(CONFIG_RASPI) += bcm2835_aux.o
+common-obj-$(CONFIG_AVR_USART) += avr_usart.o

  common-obj-$(CONFIG_CMSDK_APB_UART) += cmsdk-apb-uart.o
  common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
diff --git a/hw/char/avr_usart.c b/hw/char/avr_usart.c
new file mode 100644
index 00..9ca3c2a1cd
--- /dev/null
+++ b/hw/char/avr_usart.c
@@ -0,0 +1,324 @@
+/*
+ * AVR USART
+ *
+ * Copyright (c) 2018 University of Kent
+ * Author: Sarah Harris
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/char/avr_usart.h"
+#include "qemu/log.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+
+static int avr_usart_can_receive(void *opaque)
+{
+AVRUsartState *usart = opaque;
+
+if (usart->data_valid || !(usart->csrb & USART_CSRB_RXEN)) {
+return 0;
+}
+return 1;


Here we tell the chardev frontend that we can receive at most 1 byte, ...


+}
+
+static void avr_usart_receive(void *opaque, const uint8_t *buffer, int size)
+{
+AVRUsartState *usart = opaque;
+assert(size == 1);


... so this condition always holds: the frontend will never provide us 
more than 1 byte.




Hello, Michael.

I see the line "assert(size == 1);" is used here, and in numerous other
places in the USART emulation (as a rule, at the very beginning of function
bodies). Could you explain to me the justification for that line? Is there
a place in the documentation that would explain the need for it? If this is
justified, why is there a need for the argument "int size" in the
corresponding functions? If some external rule/API forces you to have that
argument for all such functions, can you tell me what rule/API that is?


Some backends have FIFO queues, so can process more chars at once.



Yours,
Aleksandar


+assert(!usart->data_valid);
+usart->data = buffer[0];


Here the model consumes the 1st char of an array of at most 1 byte.

I suppose Sarah wanted to be sure we are not dropping characters.


+usart->data_valid = true;
+usart->csra |= USART_CSRA_RXC;
+if (usart->csrb & USART_CSRB_RXCIE) {
+qemu_set_irq(usart->rxc_irq, 1);
+}
+}

Re: [RESEND PATCH v21 5/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM

2019-11-22 Thread Beata Michalska

Hi,

On Mon, 11 Nov 2019 at 01:48, Xiang Zheng  wrote:
>
> From: Dongjiu Geng 
>
> Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
> translates the host VA delivered by host to guest PA, then fills this PA
> to guest APEI GHES memory, then notifies guest according to the SIGBUS
> type.
>
> When guest accesses the poisoned memory, it will generate a Synchronous
> External Abort(SEA). Then host kernel gets an APEI notification and calls
> memory_failure() to unmap the affected page in stage 2, finally
> returns to guest.
>
> Guest continues to access the PG_hwpoison page, it will trap to KVM as
> stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
> Qemu, Qemu records this error address into guest APEI GHES memory and
> notifies guest using Synchronous-External-Abort(SEA).
>
> In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
> in which we can setup the type of exception and the syndrome information.
> When switching to guest, the target vcpu will jump to the synchronous
> external abort vector table entry.
>
> The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
> ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
> not valid and hold an UNKNOWN value. These values will be set to KVM
> register structures through KVM_SET_ONE_REG IOCTL.
>
> Signed-off-by: Dongjiu Geng 
> Signed-off-by: Xiang Zheng 
> Reviewed-by: Michael S. Tsirkin 
> ---
>  hw/acpi/acpi_ghes.c | 297 
>  include/hw/acpi/acpi_ghes.h |   4 +
>  include/sysemu/kvm.h|   3 +-
>  target/arm/cpu.h|   4 +
>  target/arm/helper.c |   2 +-
>  target/arm/internals.h  |   5 +-
>  target/arm/kvm64.c  |  64 
>  target/arm/tlb_helper.c |   2 +-
>  target/i386/cpu.h   |   2 +
>  9 files changed, 377 insertions(+), 6 deletions(-)
>
> diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c
> index 42c00ff3d3..f5b54990c0 100644
> --- a/hw/acpi/acpi_ghes.c
> +++ b/hw/acpi/acpi_ghes.c
> @@ -39,6 +39,34 @@
>  /* The max size in bytes for one error block */
>  #define ACPI_GHES_MAX_RAW_DATA_LENGTH   0x1000
>
> +/*
> + * The total size of Generic Error Data Entry
> + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> + * Table 18-343 Generic Error Data Entry
> + */
> +#define ACPI_GHES_DATA_LENGTH   72
> +
> +/*
> + * The memory section CPER size,
> + * UEFI 2.6: N.2.5 Memory Error Section
> + */
> +#define ACPI_GHES_MEM_CPER_LENGTH   80
> +
> +/*
> + * Masks for block_status flags
> + */
> +#define ACPI_GEBS_UNCORRECTABLE 1

Why not list all supported statuses, similar to the error severity values below?

> +
> +/*
> + * Values for error_severity field
> + */
> +enum AcpiGenericErrorSeverity {
> +ACPI_CPER_SEV_RECOVERABLE,
> +ACPI_CPER_SEV_FATAL,
> +ACPI_CPER_SEV_CORRECTED,
> +ACPI_CPER_SEV_NONE,
> +};
> +
>  /*
>   * Now only support ARMv8 SEA notification type error source
>   */
> @@ -49,6 +77,16 @@
>   */
>  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
>
> +#define UUID_BE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)\
> +{{{ ((a) >> 24) & 0xff, ((a) >> 16) & 0xff, ((a) >> 8) & 0xff, (a) & 
> 0xff, \
> +((b) >> 8) & 0xff, (b) & 0xff,   \
> +((c) >> 8) & 0xff, (c) & 0xff,\
> +(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } } }
> +
> +#define UEFI_CPER_SEC_PLATFORM_MEM   \
> +UUID_BE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
> +0xED, 0x7C, 0x83, 0xB1)
> +
>  /*
>   * | +--+ 0
>   * | |Header|
> @@ -77,6 +115,174 @@ typedef struct AcpiGhesState {
>  uint64_t ghes_addr_le;
>  } AcpiGhesState;
>
> +/*
> + * Total size for Generic Error Status Block
> + * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
> + * Table 18-380 Generic Error Status Block
> + */
> +#define ACPI_GHES_GESB_SIZE 20

Minor: This is not entirely correct: GEDE is part of GESB, so the total length
would be ACPI_GHES_GESB_SIZE + n * sizeof(GEDE).

> +/* The offset of Data Length in Generic Error Status Block */
> +#define ACPI_GHES_GESB_DATA_LENGTH_OFFSET   12
> +

If those were represented as structures, you would get the offsets easily
without needing a number of defines. That could simplify the code and make it
more readable - see comments below
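
As a rough sketch of the structure-based approach, assuming the field
order of the Generic Error Status Block from the ACPI spec. The names
struct acpi_gesb and gesb_total_length() are hypothetical, and the sketch
ignores guest endianness, which real table-building code has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the fixed part of a Generic Error Status Block. */
struct acpi_gesb {
    uint32_t block_status;
    uint32_t raw_data_offset;
    uint32_t raw_data_length;
    uint32_t data_length;
    uint32_t error_severity;
};

/* The two defines from the patch fall out of the layout itself. */
_Static_assert(sizeof(struct acpi_gesb) == 20,
               "matches ACPI_GHES_GESB_SIZE");
_Static_assert(offsetof(struct acpi_gesb, data_length) == 12,
               "matches ACPI_GHES_GESB_DATA_LENGTH_OFFSET");

/* Total block length: fixed part plus n data entries of entry_size bytes. */
size_t gesb_total_length(size_t n_entries, size_t entry_size)
{
    return sizeof(struct acpi_gesb) + n_entries * entry_size;
}
```

With one 72-byte Generic Error Data Entry and no section payload, this
gives 20 + 72 = 92 bytes.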

> +/*
> + * Record the value of data length for each error status block to avoid 
> getting
> + * this value from guest.
> + */
> +static uint32_t acpi_ghes_data_length[ACPI_GHES_ERROR_SOURCE_COUNT];
> +
> +/*
> + * Generic Error Data Entry
> + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> + */
> +static void acpi_ghes_generic_error_data(GArray *table, QemuUUID 
> section_type,
> +uint32_t error_severity, uint16_t revision,
> +uint8_t validation_bits, uint8_t flags,
> +uint32_t error_data_length, QemuUUID fru_id,
> +

Re: [PATCH] ipmi: add SET_SENSOR_READING command

2019-11-22 Thread Cédric Le Goater

On 22/11/2019 15:28, Corey Minyard wrote:
> On Mon, Nov 18, 2019 at 10:24:29AM +0100, Cédric Le Goater wrote:
>> SET_SENSOR_READING is a complex IPMI command (see IPMI spec 35.17)
>> which enables the host software to set the reading value and the event
>> status of sensors supporting it.
>>
>> Below is a proposal for all the operations (reading, assert, deassert,
>> event data) with the following limitations :
>>
>>  - No event are generated for threshold-based sensors.
>>  - The case in which the BMC needs to generate its own events is not
>>supported.
> 
> Ok, I've included this in my tree.  I made one small change mentioned
> below.  Beyond that, I think you could make this function shorter, but I
> think that would actually make it harder to understand.  Breaking it
> into multiple functions doesn't make sense to me, either.
> 
> If you are including this in the ppc tree:
> 
> Acked-by: Corey Minyard 
> 
> with the change below and I can remove it from mine.

I don't think there is a strong need to have it in the PPC tree. It's 
a standalone function adding an extra IPMI command.


>> Signed-off-by: Cédric Le Goater 
>> Reviewed-by: Corey Minyard 
>> ---
>> +
>> +switch (do_gen_event) {
>> +case SENSOR_GEN_EVENT_DATA: {
>> +unsigned int bit = evd1 & 0xf;
>> +uint16_t mask = (1 << bit);
>> +
>> +if (sens->assert_states & mask & sens->assert_enable) {
>> +gen_event(ibs, cmd[2], 0, evd1, evd2, evd3);
>> +}
>> +
>> +if (sens->deassert_states & mask & sens->deassert_enable) {
>> +gen_event(ibs, cmd[2], 1, evd1, evd2, evd3);
>> +}
>> +}
>> +break;
> 
> I moved this break statement above the brace before it to keep the
> indention consistent.  It just screwed with my brain too much :).
>
> I looked and there is nothing in the coding style about this, and I
> found this done in three different ways:
> 
>   case x: {  /* in vl.c */
>   
>   break;
>   }
>   case y: /* in thunk.c */
>   {
>
>   }
>   break;
>   case z: /* In vl.c */
>   {
>   
>   break;
>   }
> 
> Oddly enough, I didn't find anything about this in the Linux coding
> style document, either (I was curious).  One could argue, I suppose,
> that according to the "Block structure" section in the qemu style and
> the similar section in the Linux style that the first is correct,
> but then case statements violate the "Every indented statement is
> braced" statement in the qemu style.  This has always bugged me in
> C, sorry for the diatribe on this.

Thanks,

C. 

> 
> -corey
> 
>> +case SENSOR_GEN_EVENT_BMC:

Re: [RESEND PATCH v21 5/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM

2019-11-22 Thread Beata Michalska

Hi,

On Fri, 15 Nov 2019 at 16:54, Igor Mammedov  wrote:
>
> On Mon, 11 Nov 2019 09:40:47 +0800
> Xiang Zheng  wrote:
>
> > From: Dongjiu Geng 
> >
> > Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
> > translates the host VA delivered by host to guest PA, then fills this PA
> > to guest APEI GHES memory, then notifies guest according to the SIGBUS
> > type.
> >
> > When guest accesses the poisoned memory, it will generate a Synchronous
> > External Abort(SEA). Then host kernel gets an APEI notification and calls
> > memory_failure() to unmap the affected page in stage 2, finally
> > returns to guest.
> >
> > Guest continues to access the PG_hwpoison page, it will trap to KVM as
> > stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
> > Qemu, Qemu records this error address into guest APEI GHES memory and
> > notifies guest using Synchronous-External-Abort(SEA).
> >
> > In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
> > in which we can setup the type of exception and the syndrome information.
> > When switching to guest, the target vcpu will jump to the synchronous
> > external abort vector table entry.
> >
> > The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
> > ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
> > not valid and hold an UNKNOWN value. These values will be set to KVM
> > register structures through KVM_SET_ONE_REG IOCTL.
> >
> > Signed-off-by: Dongjiu Geng 
> > Signed-off-by: Xiang Zheng 
> > Reviewed-by: Michael S. Tsirkin 
> > ---
> >  hw/acpi/acpi_ghes.c | 297 
> >  include/hw/acpi/acpi_ghes.h |   4 +
> >  include/sysemu/kvm.h|   3 +-
> >  target/arm/cpu.h|   4 +
> >  target/arm/helper.c |   2 +-
> >  target/arm/internals.h  |   5 +-
> >  target/arm/kvm64.c  |  64 
> >  target/arm/tlb_helper.c |   2 +-
> >  target/i386/cpu.h   |   2 +
> >  9 files changed, 377 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c
> > index 42c00ff3d3..f5b54990c0 100644
> > --- a/hw/acpi/acpi_ghes.c
> > +++ b/hw/acpi/acpi_ghes.c
> > @@ -39,6 +39,34 @@
> >  /* The max size in bytes for one error block */
> >  #define ACPI_GHES_MAX_RAW_DATA_LENGTH   0x1000
> >
> > +/*
> > + * The total size of Generic Error Data Entry
> > + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> > + * Table 18-343 Generic Error Data Entry
> > + */
> > +#define ACPI_GHES_DATA_LENGTH   72
> > +
> > +/*
> > + * The memory section CPER size,
> > + * UEFI 2.6: N.2.5 Memory Error Section
> > + */
> maybe use one line comment
>
> > +#define ACPI_GHES_MEM_CPER_LENGTH   80
> > +
> > +/*
> > + * Masks for block_status flags
> > + */
> ditto
>
> > +#define ACPI_GEBS_UNCORRECTABLE 1
> > +
> > +/*
> > + * Values for error_severity field
> > + */
> ditto
>
> > +enum AcpiGenericErrorSeverity {
> > +ACPI_CPER_SEV_RECOVERABLE,
> > +ACPI_CPER_SEV_FATAL,
> > +ACPI_CPER_SEV_CORRECTED,
> > +ACPI_CPER_SEV_NONE,
> I'd assign values explicitly here
>   foo = x,
>   ...
>
> > +};
> > +
> >  /*
> >   * Now only support ARMv8 SEA notification type error source
> >   */
> > @@ -49,6 +77,16 @@
> >   */
> >  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
> >
> > +#define UUID_BE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)\
> > +{{{ ((a) >> 24) & 0xff, ((a) >> 16) & 0xff, ((a) >> 8) & 0xff, (a) & 
> > 0xff, \
> > +((b) >> 8) & 0xff, (b) & 0xff,   \
> > +((c) >> 8) & 0xff, (c) & 0xff,\
> > +(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } } }
> > +
> > +#define UEFI_CPER_SEC_PLATFORM_MEM   \
> > +UUID_BE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
> > +0xED, 0x7C, 0x83, 0xB1)
> > +
> >  /*
> >   * | +--+ 0
> >   * | |Header|
> > @@ -77,6 +115,174 @@ typedef struct AcpiGhesState {
> >  uint64_t ghes_addr_le;
> >  } AcpiGhesState;
> >
> > +/*
> > + * Total size for Generic Error Status Block
> > + * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
> > + * Table 18-380 Generic Error Status Block
> > + */
> > +#define ACPI_GHES_GESB_SIZE 20
>
> > +/* The offset of Data Length in Generic Error Status Block */
> > +#define ACPI_GHES_GESB_DATA_LENGTH_OFFSET   12
>
> unused, drop it
>
> > +
> > +/*
> > + * Record the value of data length for each error status block to avoid 
> > getting
> > + * this value from guest.
> > + */
> > +static uint32_t acpi_ghes_data_length[ACPI_GHES_ERROR_SOURCE_COUNT];
> > +
> > +/*
> > + * Generic Error Data Entry
> > + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> > + */
> > +static void acpi_ghes_generic_error_data(GArray *table, QemuUUID 
> > section_type,
> > +uint32_t error_severity, uint16_t revision,
> > +uint8_t validation_bits, uint8_t flags,
> > +

Re: [PATCH] target/arm: Fix ISR_EL1 tracking when executing at EL2

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 3:16 PM, Peter Maydell wrote:

On Fri, 22 Nov 2019 at 13:59, Marc Zyngier  wrote:


The ARMv8 ARM states that when executing at EL2, EL3 or Secure EL1,
ISR_EL1 shows the pending status of the physical IRQ, FIQ, or
SError interrupts.

Unfortunately, QEMU's implementation only considers the HCR_EL2
bits, and ignores the current exception level. This means a hypervisor
trying to look at its own interrupt state actually sees the guest
state, which is unexpected and breaks KVM as of Linux 5.3.

Instead, check for the running EL and return the physical bits
if not running in a virtualized context.

Fixes: 636540e9c40b
Reported-by: Quentin Perret 
Signed-off-by: Marc Zyngier 


Congratulations on your first QEMU patch :-)


:))

Re: [QUESTION] Usage of '0b' as a prefix for numerical constants?

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 4:10 PM, Eric Blake wrote:

On 11/22/19 8:56 AM, Aleksandar Markovic wrote:

Hello, all.

I am currently reviewing some code, and I see it uses '0b' as a prefix
of numerical constants, similar to these examples:

switch (((inst >> 22) & 0b111000) | ((inst >> 12) & 0b000111)) {

or

ARRAY_FIELD_DP32(s->regs, CRB_INTF_ID, RID, 0b);


Binary constants introduced by 0b are a gcc'ism, copied by clang, and 
thus usable in qemu if we want to (similar to our use of the ?: 
operator, the {} initializer, the ranged case 0 ... 7, 
__attribute__((cleanup)), ...).  But it is not standard C.




I remember a while ago, something stopped me from using '0b' as a
prefix in my own code (was it checkpatch.pl, or perhaps some statement
on coding style, or a compiler, or something else - I don't really
remember), so I didn't use it, and used '0x' (hexadecimal constant).

What is really the view of the community on usage of '0b'?


For small constants, 0b111 is just about as readable as 0x7.  But for 
large constants, I much prefer 0x7f over 0b111.


I use both. The choice depends on the datasheet I'm following. If 
reviewers look at the datasheet, I don't want them to do extra 
conversion just to verify the implementation.


So in my case it depends on the documentation used (though usually 
restricted to ISA/hardware registers).
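
For comparison, the switch expression quoted above can be written both
ways. This is only an illustration: decode_bin() and decode_hex() are
made-up names, and the 0b form relies on the gcc/clang extension
discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Binary masks, as in the reviewed code (gcc/clang extension). */
unsigned decode_bin(uint32_t inst)
{
    return ((inst >> 22) & 0b111000) | ((inst >> 12) & 0b000111);
}

/* The same masks as hexadecimal constants. */
unsigned decode_hex(uint32_t inst)
{
    return ((inst >> 22) & 0x38) | ((inst >> 12) & 0x07);
}
```

Both functions compute the same value; the binary form mirrors a
datasheet that documents the field bit by bit, while the hex form stays
within the syntax every C compiler accepts.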



Please C language standard and compiler experts, and also regular
participants like me, speak up.


If you want to provide a patch for coding standards (either admitting 
that yes we use the extension and here are some guidelines on using it, 
or declaring no new uses of it and maybe patching existing uses to 
switch to hex constants), then go for it.  Maybe wait for more opinions 
to come in to see which color more of the developers prefer for their 
bikeshed.

Re: [PATCH v2 4/5] MAINTAINERS: Adjust maintainership for R4000 systems

2019-11-22 Thread Philippe Mathieu-Daudé


On 11/22/19 3:14 PM, Aleksandar Markovic wrote:

On Fri, Nov 22, 2019 at 2:58 PM Philippe Mathieu-Daudé
 wrote:


Hi Aleksandar,

On 11/13/19 2:47 PM, Aleksandar Markovic wrote:

From: Aleksandar Markovic 

Change the maintainership for R4000 systems to improve its quality.

Acked-by: Aurelien Jarno 
Signed-off-by: Aleksandar Markovic 
---
   MAINTAINERS | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6afec32..ba9ca98 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -971,8 +971,9 @@ F: hw/mips/mips_mipssim.c
   F: hw/net/mipsnet.c

   R4000
-M: Aurelien Jarno 
-R: Aleksandar Rikalo 
+M: Hervé Poussineau 


Commit 0c10962a033 from Hervé was part of a bigger refactor series, so I
don't think he is interested.


+R: Aurelien Jarno 
+R: Philippe Mathieu-Daudé 
   S: Maintained
   F: hw/mips/mips_r4k.c


Now back to this board: I am having a hard time understanding what it
models. IIUC it predates the Malta board, and was trying to model a
board able to run the first MIPS CPU when the port was added in 2005
(see commit 6af0bf9c7c3a).
The Malta board was added 1 year later (commit 5856de800df) and models
real hardware.

As Aurelien acked stepping down from maintaining it, this seems the perfect
time to start its deprecation process. I'll prepare a patch for 5.0
(unless someone is really using it and willing to maintain it).



Philippe, hi.

Herve told me a while ago that he does care about R4000 being
supported, as it is closely related to Jazz machines, so please
don't start any deprecation process.


I think what Hervé meant to say is he cares about the R4000 CPU 
(implementing the MIPSIII architecture). The Magnum and Pica boards 
indeed use a R4000 CPU. I also personally care about this CPU, and don't 
want it to disappear.


Here we are talking about some Frankenstein board. QEMU aims to 
model real hardware, with the exception of the 'Virt' boards that have 
specifications. Here I can't find any. I am not against Hervé 
maintaining this file if he has some interest in it, but I think there 
is some confusion and we are talking about 2 different topics.



Herve is the most familiar of all of us with R4000, and, for that
reason, my suggestion is to keep the patch as it is. Let me know
if you have any objections.

One alternative approach would be to merge "R4000" and
"Jazz" sections. But, let's leave it for future as an option,
if nobody objects.

Yours,
Aleksandar


Regards,

Phil.
