date:20200528

Re: [PATCH v2 03/11] tests/acceptance: add base class record/replay kernel tests

2020-05-28 Thread Pavel Dovgalyuk




On 28.05.2020 11:28, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


On 27.05.2020 18:20, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


This patch adds a base class for testing kernel boot recording and replaying.
Each test has a recording phase and a replaying phase.
Virtual machines just boot the kernel and do not interact with
the network.
The structure and image links for the tests are borrowed from boot_linux_console.py.
The test checks for the message pattern at the end of the kernel
boot in both record and replay modes. In replay mode QEMU is also
expected to finish execution automatically.

Signed-off-by: Pavel Dovgalyuk 

diff --git a/MAINTAINERS b/MAINTAINERS
index 47ef3139e6..e9a9ce4f66 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2497,6 +2497,7 @@ F: net/filter-replay.c
   F: include/sysemu/replay.h
   F: docs/replay.txt
   F: stubs/replay.c
+F: tests/acceptance/replay_kernel.py
 IOVA Tree
   M: Peter Xu 
diff --git a/tests/acceptance/replay_kernel.py 
b/tests/acceptance/replay_kernel.py
new file mode 100644
index 0000000000..b8b277ad2f
--- /dev/null
+++ b/tests/acceptance/replay_kernel.py
@@ -0,0 +1,57 @@
+# Record/replay test that boots a Linux kernel
+#
+# Copyright (c) 2020 ISP RAS
+#
+# Author:
+#  Pavel Dovgalyuk 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+import os
+import gzip

Do we actually use gzip in this test?

Removed that, thanks.


+
+from avocado_qemu import wait_for_console_pattern
+from avocado.utils import process
+from avocado.utils import archive
+from boot_linux_console import LinuxKernelUtils
+
+class ReplayKernel(LinuxKernelUtils):
+"""
+Boots a Linux kernel in record mode and checks that the console
+is operational and the kernel command line is properly passed
+from QEMU to the kernel.
+Then replays the same scenario and verifies, that QEMU correctly
+terminates.

Shouldn't we be doing more to verify the replay behaved the same as the
recorded session? What happens if things go wrong? Does QEMU barf out or
just deviate from the previous run?

We can hardly compare vCPU states during record and replay.

But in most cases that is not needed. When control flow goes in the
wrong direction, it affects the interrupts and exceptions.

Interrupts and exceptions are the synchronization points in the
replay log, so when the executions differ, QEMU replay just
hangs.

Maybe we should fix that and exit with a more definitive error? Hangs
are just plain ugly to debug because your first step has to be to start
poking around with a debugger.


Good point, I'll think about it.
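
A minimal sketch of the point about hangs, assuming the check is done from
the test harness rather than inside QEMU: run the replay under a deadline so
that a hang is reported as a distinct outcome instead of blocking forever.
The helper name run_with_deadline and the outcome labels are hypothetical and
are not part of the patch or of avocado_qemu.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns ("exited", returncode) when the process finishes in time,
    or ("hang", None) when it is still running after timeout_s seconds.
    subprocess.run() kills the child before raising TimeoutExpired.
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return ("hang", None)
    return ("exited", proc.returncode)


if __name__ == "__main__":
    # Stand-in for a replay run: a Python child that exits immediately.
    print(run_with_deadline([sys.executable, "-c", "pass"], 10))
```

This only turns the hang into a test failure; it does not say where the
executions diverged, which would still need support in the replay code itself.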

Re: [PATCH v2 04/11] tests/acceptance: add kernel record/replay test for x86_64

2020-05-28 Thread Pavel Dovgalyuk




On 28.05.2020 16:26, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


On 27.05.2020 18:41, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


This patch adds a test that records and replays the execution of an x86_64 machine.
The execution scenario is a simple kernel boot, which allows testing
basic hardware interaction in RR mode.

Signed-off-by: Pavel Dovgalyuk 
---
   0 files changed

diff --git a/tests/acceptance/replay_kernel.py 
b/tests/acceptance/replay_kernel.py
index b8b277ad2f..c7526f1aba 100644
--- a/tests/acceptance/replay_kernel.py
+++ b/tests/acceptance/replay_kernel.py
@@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
   True, shift, args)
   self.run_vm(kernel_path, kernel_command_line, console_pattern,
   False, shift, args)
+
+def test_x86_64_pc(self):
+"""
+:avocado: tags=arch:x86_64
+:avocado: tags=machine:pc
+"""
+kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+  '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
+  '/vmlinuz')
+kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
+kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+
+kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
+console_pattern = 'Kernel command line: %s' % kernel_command_line
+
+self.run_rr(kernel_path, kernel_command_line, console_pattern)

This test fails for me on the replay:

Have you applied the latest RR patches?

I have the following on top of the acceptance patches:

a36c23042fe * review/record-replay-acceptance-v2 icount: fix shift=auto for 
record/replay
4ab2164c10b * replay: synchronize on every virtual timer callback
66104ce6e4b * replay: notify the main loop when there are no instructions


Please also try adding "replay: implement fair mutex"

Re: [PULL v3 00/11] bitmaps patches for 2020-05-26

2020-05-28 Thread Vladimir Sementsov-Ogievskiy


Hi!

Something strange with your pull requests: I receive only a small part of them.
I thought the problem was on my receiving side, but I have now checked that the
mailing list archive has the same two emails only: 00/11 and 08/11
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg08061.html

28.05.2020 21:18, Eric Blake wrote:

The following changes since commit a20ab81d22300cca80325c284f21eefee99aa740:

   Merge remote-tracking branch 
'remotes/huth-gitlab/tags/pull-request-2020-05-28' into staging (2020-05-28 
16:18:06 +0100)

are available in the Git repository at:

   https://repo.or.cz/qemu/ericb.git tags/pull-bitmaps-2020-05-26-v3

for you to fetch changes up to cf2d1203dcfc2bf964453d83a2302231ce77f2dc:

   iotests: Add test 291 to for qemu-img bitmap coverage (2020-05-28 13:16:30 
-0500)

v3: port sed expression to BSD sed
v2: fix iotest 190 to not be as sensitive to different sparseness of
qcow2 file on various filesystems, such as FreeBSD (sending only the
changed patch)


bitmaps patches for 2020-05-26

- fix non-blockdev migration of bitmaps when mirror job is in use
- add bitmap sizing to 'qemu-img measure'
- add 'qemu-img convert --bitmaps'


Eric Blake (5):
   iotests: Fix test 178
   qcow2: Expose bitmaps' size during measure
   qemu-img: Factor out code for merging bitmaps
   qemu-img: Add convert --bitmaps option
   iotests: Add test 291 to for qemu-img bitmap coverage

Vladimir Sementsov-Ogievskiy (6):
   migration: refactor init_dirty_bitmap_migration
   block/dirty-bitmap: add bdrv_has_named_bitmaps helper
   migration: fix bitmaps pre-blockdev migration with mirror job
   iotests: 194: test also migration of dirty bitmap
   migration: add_bitmaps_to_list: check disk name once
   migration: forbid bitmap migration by generated node-name

  docs/tools/qemu-img.rst          |  13 +++-
  qapi/block-core.json             |  16 +++--
  block/qcow2.h                    |   2 +
  include/block/dirty-bitmap.h     |   1 +
  block/crypto.c                   |   2 +-
  block/dirty-bitmap.c             |  13 
  block/qcow2-bitmap.c             |  36 ++
  block/qcow2.c                    |  14 +++-
  block/raw-format.c               |   2 +-
  migration/block-dirty-bitmap.c   | 142 ---
  qemu-img.c                       | 107 -
  qemu-img-cmds.hx                 |   4 +-
  tests/qemu-iotests/178.out.qcow2 |  18 -
  tests/qemu-iotests/178.out.raw   |   2 +-
  tests/qemu-iotests/190           |  47 -
  tests/qemu-iotests/190.out       |  27 +++-
  tests/qemu-iotests/194           |  14 ++--
  tests/qemu-iotests/194.out       |   6 ++
  tests/qemu-iotests/291           | 112 ++
  tests/qemu-iotests/291.out       |  80 ++
  tests/qemu-iotests/group         |   1 +
  21 files changed, 582 insertions(+), 77 deletions(-)
  create mode 100755 tests/qemu-iotests/291
  create mode 100644 tests/qemu-iotests/291.out




--
Best regards,
Vladimir

Re: [PATCH] spapr: Fix typos in comments and macro indentation

2020-05-28 Thread Cédric Le Goater

All QEMU patches should also be sent to the qemu-devel mailing list,
and to David, as he is the PPC maintainer.

On 5/29/20 2:04 AM, Gustavo Romero wrote:
> This commit fixes typos in spapr_vio_reg_to_irq() comments and a macro
> indentation.
> 
> Signed-off-by: Gustavo Romero 

Acked-by: Cédric Le Goater 

Thanks,

C. 


> ---
>  hw/ppc/spapr_vio.c | 6 +++---
>  include/hw/ppc/xive_regs.h | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 0b085ea..741fdbf 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -420,7 +420,7 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
>  }
>  
>  /*
> - * The register property of a VIO device is defined in livirt using
> + * The register property of a VIO device is defined in libvirt using
>   * 0x1000 as a base register number plus a 0x1000 increment. For the
>   * VIO tty device, the base number is changed to 0x3000. QEMU uses
>   * a base register number of 0x7100 and then a simple increment.
> @@ -450,7 +450,7 @@ static inline uint32_t spapr_vio_reg_to_irq(uint32_t reg)
>  
>  } else if (reg >= 0x3000) {
>  /*
> - * VIO tty devices register values, when allocated by livirt,
> + * VIO tty devices register values, when allocated by libvirt,
>   * are mapped in range [0xf0 - 0xff], gives us a maximum of 16
>   * vtys.
>   */
> @@ -459,7 +459,7 @@ static inline uint32_t spapr_vio_reg_to_irq(uint32_t reg)
>  } else {
>  /*
>   * Other VIO devices register values, when allocated by
> - * livirt, should be mapped in range [0x00 - 0xef]. Conflicts
> + * libvirt, should be mapped in range [0x00 - 0xef]. Conflicts
>   * will be detected when IRQ is claimed.
>   */
>  irq = (reg >> 12) & 0xff;
> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
> index 09f2436..7879692 100644
> --- a/include/hw/ppc/xive_regs.h
> +++ b/include/hw/ppc/xive_regs.h
> @@ -71,7 +71,7 @@
>   * QW word 2 contains the valid bit at the top and other fields
>   * depending on the QW.
>   */
> -#define TM_WORD2  0x8
> +#define   TM_WORD2  0x8
>  #define   TM_QW0W2_VU   PPC_BIT32(0)
>  #define   TM_QW0W2_LOGIC_SERV   PPC_BITMASK32(1, 31) /* XX 2,31 ? */
>  #define   TM_QW1W2_VO   PPC_BIT32(0)
>

Re: [PATCH v4 1/4] hw/riscv: spike: Remove deprecated ISA specific machines

2020-05-28 Thread Thomas Huth

On 29/05/2020 00.16, Alistair Francis wrote:
> The ISA specific Spike machines have been deprecated in QEMU since 4.1,
> let's finally remove them.
> 
> Signed-off-by: Alistair Francis 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Bin Meng 
> ---
>  docs/system/deprecated.rst |  17 +--
>  include/hw/riscv/spike.h   |   6 +-
>  hw/riscv/spike.c   | 217 -
>  3 files changed, 12 insertions(+), 228 deletions(-)

Reviewed-by: Thomas Huth

Re: [PATCH v2 1/1] tests/qtest/fuzz: Add faster virtio tests

2020-05-28 Thread Alexander Bulekov

On 200528 1853, Philippe Mathieu-Daudé wrote:
> We don't need to serialize over QTest chardev when we can
> directly access the MMIO address space via the first
> registered CPU view.
> Rename the current tests as $TEST-qtest, and add faster
> tests that don't use the qtest chardev.
> 
> virtio-net-socket gets ~50% performance improvement.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Alexander Bulekov 

Thanks for fixing the spaces in the descriptions, too.

> ---
>  tests/qtest/fuzz/virtio_net_fuzz.c  | 42 ---
>  tests/qtest/fuzz/virtio_scsi_fuzz.c | 53 +
>  2 files changed, 84 insertions(+), 11 deletions(-)
> 
> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c 
> b/tests/qtest/fuzz/virtio_net_fuzz.c
> index d08a47e278..7a39cfbb75 100644
> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
> @@ -19,6 +19,8 @@
>  #include "fork_fuzz.h"
>  #include "qos_fuzz.h"
>  
> +#include "exec/address-spaces.h"
> +#include "hw/core/cpu.h"
>  
>  #define QVIRTIO_NET_TIMEOUT_US (30 * 1000 * 1000)
>  #define QVIRTIO_RX_VQ 0
> @@ -29,7 +31,9 @@ static int sockfds[2];
>  static bool sockfds_initialized;
>  
>  static void virtio_net_fuzz_multi(QTestState *s,
> -const unsigned char *Data, size_t Size, bool check_used)
> +  const unsigned char *Data, size_t Size,
> +  bool check_used, bool use_qtest_chardev)
> +
>  {
>  typedef struct vq_action {
>  uint8_t queue;
> @@ -69,8 +73,13 @@ static void virtio_net_fuzz_multi(QTestState *s,
>   * If checking used ring, ensure that the fuzzer doesn't trigger
>   * trivial asserion failure on zero-zied buffer
>   */
> -qtest_memwrite(s, req_addr, Data, vqa.length);
> -
> +if (use_qtest_chardev) {
> +qtest_memwrite(s, req_addr, Data, vqa.length);
> +} else {
> +address_space_write(first_cpu->as, req_addr,
> + MEMTXATTRS_UNSPECIFIED,
> + , vqa.length);
> +}
>  
>  free_head = qvirtqueue_add(s, q, req_addr, vqa.length,
>  vqa.write, vqa.next);
> @@ -118,7 +127,20 @@ static void virtio_net_fork_fuzz(QTestState *s,
>  const unsigned char *Data, size_t Size)
>  {
>  if (fork() == 0) {
> -virtio_net_fuzz_multi(s, Data, Size, false);
> +virtio_net_fuzz_multi(s, Data, Size, false, false);
> +flush_events(s);
> +_Exit(0);
> +} else {
> +wait(NULL);
> +}
> +}
> +
> +static void virtio_net_fork_fuzz_qtest(QTestState *s,
> +   const unsigned char *Data,
> +   size_t Size)
> +{
> +if (fork() == 0) {
> +virtio_net_fuzz_multi(s, Data, Size, false, true);
>  flush_events(s);
>  _Exit(0);
>  } else {
> @@ -130,7 +152,7 @@ static void virtio_net_fork_fuzz_check_used(QTestState *s,
>  const unsigned char *Data, size_t Size)
>  {
>  if (fork() == 0) {
> -virtio_net_fuzz_multi(s, Data, Size, true);
> +virtio_net_fuzz_multi(s, Data, Size, true, false);
>  flush_events(s);
>  _Exit(0);
>  } else {
> @@ -173,6 +195,16 @@ static void register_virtio_net_fuzz_targets(void)
>  &(QOSGraphTestOptions){.before = virtio_net_test_setup_socket}
>  );
>  
> +fuzz_add_qos_target(&(FuzzTarget){
> +.name = "virtio-net-socket-qtest",
> +.description = "Fuzz the virtio-net virtual queues. Fuzz 
> incoming "
> +"traffic using the socket backend (over a qtest chardev)",
> +.pre_fuzz = _net_pre_fuzz,
> +.fuzz = virtio_net_fork_fuzz_qtest,},
> +"virtio-net",
> +&(QOSGraphTestOptions){.before = virtio_net_test_setup_socket}
> +);
> +
>  fuzz_add_qos_target(&(FuzzTarget){
>  .name = "virtio-net-socket-check-used",
>  .description = "Fuzz the virtio-net virtual queues. Wait for the 
> "
> diff --git a/tests/qtest/fuzz/virtio_scsi_fuzz.c 
> b/tests/qtest/fuzz/virtio_scsi_fuzz.c
> index 3b95247f12..27b63b2e32 100644
> --- a/tests/qtest/fuzz/virtio_scsi_fuzz.c
> +++ b/tests/qtest/fuzz/virtio_scsi_fuzz.c
> @@ -23,6 +23,9 @@
>  #include "fork_fuzz.h"
>  #include "qos_fuzz.h"
>  
> +#include "exec/address-spaces.h"
> +#include "hw/core/cpu.h"
> +
>  #define PCI_SLOT  0x02
>  #define PCI_FN  0x00
>  #define QVIRTIO_SCSI_TIMEOUT_US (1 * 1000 * 1000)
> @@ -63,7 +66,8 @@ static QVirtioSCSIQueues *qvirtio_scsi_init(QVirtioDevice 
> *dev, uint64_t mask)
>  }
>  
>  static void virtio_scsi_fuzz(QTestState *s, QVirtioSCSIQueues* queues,
> -const unsigned char *Data, size_t Size)
> + const unsigned char *Data, size_t Size,
> +

Re: [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices

2020-05-28 Thread Yan Zhao

On Thu, May 28, 2020 at 04:59:06PM -0600, Alex Williamson wrote:
> On Wed, 27 May 2020 09:48:22 +0100
> "Dr. David Alan Gilbert"  wrote:
> > * Yan Zhao (yan.y.z...@intel.com) wrote:
> > > BTW, for viommu, the downtime data is as below. under the same network
> > > condition and guest memory size, and no running dirty data/memory produced
> > > by device.
> > > (1) viommu off
> > > single-round dirty query: downtime ~100ms   
> > 
> > Fine.
> > 
> > > (2) viommu on
> > > single-round dirty query: downtime 58s   
> > 
> > Youch.
> 
> Double Youch!  But we believe this is because we're getting the dirty
> bitmap one IOMMU leaf page at a time, right?  We've enabled the kernel
> to get a dirty bitmap across multiple mappings, but QEMU isn't yet
> taking advantage of it.  Do I have this correct?  Thanks,
>
Yes, I think so, but I haven't looked into it yet.

Thanks
Yan
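
A back-of-the-envelope sketch of why the query granularity matters, assuming
the explanation quoted above is right: if the dirty bitmap is fetched one
IOMMU leaf page per call, the number of calls grows with the number of pages,
while a call that spans many pages divides that count. The function name and
the sizes in the usage note are illustrative assumptions, not measurements
from this thread.

```python
def dirty_query_calls(total_pages, pages_per_call):
    """Number of dirty-bitmap queries needed to cover total_pages.

    pages_per_call is how many pages one query can cover; the result is
    the ceiling of total_pages / pages_per_call.
    """
    if total_pages < 0 or pages_per_call <= 0:
        raise ValueError("total_pages must be >= 0 and pages_per_call > 0")
    return -(-total_pages // pages_per_call)


if __name__ == "__main__":
    # Hypothetical guest: 4 GiB of memory mapped with 4 KiB pages.
    pages = (4 * 1024 ** 3) // (4 * 1024)
    print(dirty_query_calls(pages, 1))    # one leaf page per query
    print(dirty_query_calls(pages, 512))  # one query per 2 MiB range
```

The per-call cost is not modelled here, so this only shows how the call count
scales, not the resulting downtime.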

[Bug 1881231] [NEW] colo: Can not recover colo after svm failover twice

2020-05-28 Thread ye.zou

Public bug reported:

Hi Expert,
x-blockdev-change returned an error while testing COLO.

Host os:
CentOS Linux release 7.6.1810 (Core)

Reproduce steps:
1. Create a COLO VM following
https://github.com/qemu/qemu/blob/master/docs/COLO-FT.txt
2. Kill the secondary VM and remove the NBD child from the quorum to wait for
recovery.
  Type these commands on the primary VM console:
  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
  { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del replication0'}}
  { 'execute': 'x-colo-lost-heartbeat'}
3. Recover COLO.
4. Kill the secondary VM again after recovering COLO and type the same commands as in step 2:
  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
  { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del replication0'}}
  { 'execute': 'x-colo-lost-heartbeat'}
  But the first command returns an error:
  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
{"error": {"class": "GenericError", "desc": "Node 'colo-disk0' does not have child 'children.1'"}}

according to https://www.qemu.org/docs/master/qemu-qmp-ref.html
Command: x-blockdev-change
Dynamically reconfigure the block driver state graph. It can be used to add, 
remove, insert or replace a graph node. Currently only the Quorum driver 
implements this feature to add or remove its child. This is useful to fix a 
broken quorum child.

It seems x-blockdev-change does not work as expected.
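
For anyone scripting the reproduction, here is a minimal sketch that builds
the three QMP commands from the steps above as JSON and recognises the error
reply shown in the report. The helper names failover_commands and
is_missing_child_error are hypothetical; how the commands are delivered to the
monitor is left out and would depend on the setup.

```python
import json


def failover_commands(parent="colo-disk0", child="children.1",
                      drive="replication0"):
    """Return the three commands from the report as Python dicts."""
    return [
        {"execute": "x-blockdev-change",
         "arguments": {"parent": parent, "child": child}},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": "drive_del " + drive}},
        {"execute": "x-colo-lost-heartbeat"},
    ]


def is_missing_child_error(reply):
    """True if reply is a GenericError saying the child does not exist."""
    error = reply.get("error")
    if not error:
        return False
    return (error.get("class") == "GenericError"
            and "does not have child" in error.get("desc", ""))


if __name__ == "__main__":
    for command in failover_commands():
        # json.dumps emits strict JSON with double quotes.
        print(json.dumps(command))
```

Checking the reply of the first command this way makes it easy to tell the
failure reported here apart from other errors in a scripted run.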

Thanks.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1881231

Title:
  colo: Can not  recover colo after svm failover twice

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1881231/+subscriptions

Re: [PATCH 0/2] Add support for SEV Launch Secret Injection

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528205114.42078-1-to...@linux.vnet.ibm.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  GEN ui/input-keymap-xorgxquartz-to-qcode.c
In file included from /tmp/qemu-test/src/qapi/qapi-schema.json:85:
/tmp/qemu-test/src/qapi/misc-target.json:213:9: stray 'GPA'
make: *** [Makefile:666: qapi-gen-timestamp] Error 1
make: *** Waiting for unfinished jobs
  CC  /tmp/qemu-test/build/slirp/src/ip6_icmp.o
  CC  /tmp/qemu-test/build/slirp/src/slirp.o
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=cb62fe08a707401d8f3632cb951681ac', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-h6j9yyx9/src/docker-src.2020-05-28-23.37.24.24496:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=cb62fe08a707401d8f3632cb951681ac
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-h6j9yyx9/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m8.174s
user    0m8.497s


The full log is available at
http://patchew.org/logs/20200528205114.42078-1-to...@linux.vnet.ibm.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH 0/2] Add support for SEV Launch Secret Injection

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528205114.42078-1-to...@linux.vnet.ibm.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200528205114.42078-1-to...@linux.vnet.ibm.com
Subject: [PATCH 0/2] Add support for SEV Launch Secret Injection
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
fefbf6f sev: scan guest ROM for launch secret address
94d7e7b sev: add sev-inject-launch-secret

=== OUTPUT BEGIN ===
1/2 Checking commit 94d7e7bc7c3c (sev: add sev-inject-launch-secret)
ERROR: code indent should never use tabs
#26: FILE: include/sysemu/sev.h:22:
+^I^I uint64_t gpa);$

ERROR: trailing whitespace
#45: FILE: qapi/misc-target.json:213:
+GPA provided here will be ignored if guest ROM specifies $

ERROR: suspect code indent for conditional statements (4, 6)
#72: FILE: target/i386/monitor.c:744:
+if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
+  error_setg(errp, "SEV inject secret failed");

ERROR: space required after that ',' (ctx:VxV)
#72: FILE: target/i386/monitor.c:744:
+if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
^

ERROR: space required after that ',' (ctx:VxV)
#72: FILE: target/i386/monitor.c:744:
+if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
   ^

ERROR: braces {} are necessary for all arms of this statement
#72: FILE: target/i386/monitor.c:744:
+if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
[...]

ERROR: code indent should never use tabs
#84: FILE: target/i386/sev-stub.c:52:
+^I^I uint64_t gpa)$

ERROR: code indent should never use tabs
#86: FILE: target/i386/sev-stub.c:54:
+^Ireturn 1;$

ERROR: code indent should never use tabs
#136: FILE: target/i386/sev.c:776:
+^Ierror_report("Not in correct state. %x",sev_state->state);$

ERROR: space required after that ',' (ctx:VxV)
#136: FILE: target/i386/sev.c:776:
+   error_report("Not in correct state. %x",sev_state->state);
   ^

ERROR: code indent should never use tabs
#137: FILE: target/i386/sev.c:777:
+^Ireturn 1;$

ERROR: space required after that ',' (ctx:VxV)
#170: FILE: target/i386/sev.c:810:
+ret = sev_ioctl(sev_state->sev_fd,KVM_SEV_LAUNCH_SECRET, input, );
  ^

ERROR: do not use C99 // comments
#207: FILE: tests/qtest/qmp-cmd-test.c:96:
+// "query-sev-launch-measure",

ERROR: do not use C99 // comments
#211: FILE: tests/qtest/qmp-cmd-test.c:98:
+// "query-sev",

ERROR: do not use C99 // comments
#212: FILE: tests/qtest/qmp-cmd-test.c:99:
+// "query-sev-capabilities",

total: 15 errors, 0 warnings, 163 lines checked

Patch 1/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

2/2 Checking commit fefbf6f8855c (sev: scan guest ROM for launch secret address)
ERROR: space required after that ',' (ctx:VxO)
#43: FILE: target/i386/sev.c:741:
+qemu_uuid_parse(SEV_ROM_SECRET_GUID,_table_guid);
^

ERROR: space required before that '&' (ctx:OxV)
#43: FILE: target/i386/sev.c:741:
+qemu_uuid_parse(SEV_ROM_SECRET_GUID,_table_guid);
 ^

ERROR: space required before the open parenthesis '('
#47: FILE: target/i386/sev.c:745:
+while(offset > 0) {

ERROR: space required before the open brace '{'
#49: FILE: target/i386/sev.c:747:
+if(qemu_uuid_is_equal(_table_guid, (QemuUUID *) secret_table)){

ERROR: space required before the open parenthesis '('
#49: FILE: target/i386/sev.c:747:
+if(qemu_uuid_is_equal(_table_guid, (QemuUUID *) secret_table)){

ERROR: space required before the open parenthesis '('
#64: FILE: target/i386/sev.c:762:
+if(!sev_state->secret_gpa) {

ERROR: code indent should never use tabs
#66: FILE: target/i386/sev.c:764:
+^I}$

ERROR: space required after that ',' (ctx:VxV)
#76: FILE: target/i386/sev.c:803:
+error_report("Not in correct state. %x",sev_state->state);
^

ERROR: space required before the open parenthesis '('
#85: FILE: target/i386/sev.c:819:
+if(sev_state->secret_gpa)

ERROR: braces {} are necessary for all arms of this statement
#85: FILE: target/i386/sev.c:819:
+if(sev_state->secret_gpa)
[...]

total: 10 errors, 0 warnings, 104 lines checked

Patch 2/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===
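
Two of the error classes reported above (tabs in the indent and trailing
whitespace) are easy to catch before sending. The following is a simplified
sketch of such a pre-check; it is not how scripts/checkpatch.pl is
implemented, the function name style_problems is hypothetical, and it covers
only these two checks.

```python
def style_problems(line):
    """Return the style problems found in one source line (no newline)."""
    problems = []
    # Leading whitespace is everything before the first non-blank character.
    indent = line[: len(line) - len(line.lstrip(" \t"))]
    if "\t" in indent:
        problems.append("code indent should never use tabs")
    if line != line.rstrip(" \t"):
        problems.append("trailing whitespace")
    return problems


if __name__ == "__main__":
    for text in ["\t\t uint64_t gpa);", "    return 1;"]:
        print(repr(text), style_problems(text))
```

Running the real checkpatch.pl on the patch before posting remains the
authoritative check.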

Test

Re: [PATCH 0/2] Add support for SEV Launch Secret Injection

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528205114.42078-1-to...@linux.vnet.ibm.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  GEN scsi/trace.h
  GEN audio/trace.h
  CC  /tmp/qemu-test/build/slirp/src/tcp_output.o
make: *** [Makefile:666: qapi-gen-timestamp] Error 1
make: *** Waiting for unfinished jobs
  CC  /tmp/qemu-test/build/slirp/src/ndp_table.o
  CC  /tmp/qemu-test/build/slirp/src/bootp.o
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=93d79e62908146289998366473c102a3', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 
'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 
'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', 
'-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-bnxinu3b/src/docker-src.2020-05-28-23.32.39.19459:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=93d79e62908146289998366473c102a3
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-bnxinu3b/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    3m13.106s
user    0m8.085s


The full log is available at
http://patchew.org/logs/20200528205114.42078-1-to...@linux.vnet.ibm.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH 0/2] Add support for SEV Launch Secret Injection

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528205114.42078-1-to...@linux.vnet.ibm.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  GEN ui/input-keymap-qnum-to-qcode.c
In file included from /tmp/qemu-test/src/qapi/qapi-schema.json:85:
/tmp/qemu-test/src/qapi/misc-target.json:213:9: stray 'GPA'
make: *** [qapi-gen-timestamp] Error 1
make: *** Waiting for unfinished jobs
  CC  /tmp/qemu-test/build/slirp/src/slirp.o
  CC  /tmp/qemu-test/build/slirp/src/vmstate.o
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=6e1594b856a84baabe3c89fab85fce17', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-yd1xv0uz/src/docker-src.2020-05-28-23.30.04.14959:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=6e1594b856a84baabe3c89fab85fce17
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-yd1xv0uz/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    1m59.216s
user    0m7.852s


The full log is available at
http://patchew.org/logs/20200528205114.42078-1-to...@linux.vnet.ibm.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH] virtio-pci: fix queue_enable write

2020-05-28 Thread Jason Wang

The spec says: The driver uses this to selectively prevent the device from
executing requests from this virtqueue. 1 - enabled; 0 - disabled.

Though writing 0 to queue_enable is forbidden by the spec, we should not
assume that the written value is 1.

Fix this by ignoring any written value other than 1.

Cc: Michael S. Tsirkin 
Cc: Stefan Hajnoczi 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-pci.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index d028c17c24..b3558eeaee 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1273,16 +1273,18 @@ static void virtio_pci_common_write(void *opaque, 
hwaddr addr,
 virtio_queue_set_vector(vdev, vdev->queue_sel, val);
 break;
 case VIRTIO_PCI_COMMON_Q_ENABLE:
-virtio_queue_set_num(vdev, vdev->queue_sel,
- proxy->vqs[vdev->queue_sel].num);
-virtio_queue_set_rings(vdev, vdev->queue_sel,
+if (val == 1) {
+virtio_queue_set_num(vdev, vdev->queue_sel,
+ proxy->vqs[vdev->queue_sel].num);
+virtio_queue_set_rings(vdev, vdev->queue_sel,
((uint64_t)proxy->vqs[vdev->queue_sel].desc[1]) << 32 |
proxy->vqs[vdev->queue_sel].desc[0],
((uint64_t)proxy->vqs[vdev->queue_sel].avail[1]) << 32 |
proxy->vqs[vdev->queue_sel].avail[0],
((uint64_t)proxy->vqs[vdev->queue_sel].used[1]) << 32 |
proxy->vqs[vdev->queue_sel].used[0]);
-proxy->vqs[vdev->queue_sel].enabled = 1;
+proxy->vqs[vdev->queue_sel].enabled = 1;
+}
 break;
 case VIRTIO_PCI_COMMON_Q_DESCLO:
 proxy->vqs[vdev->queue_sel].desc[0] = val;
-- 
2.20.1
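
The behaviour change in the patch above can be illustrated with a toy model.
This is a sketch in Python, not QEMU code: the class QueueModel and its
fields are hypothetical stand-ins for the per-queue proxy state, and the
activation counter only stands in for the virtio_queue_set_num() and
virtio_queue_set_rings() calls.

```python
class QueueModel:
    """Toy model of one virtqueue's enable state."""

    def __init__(self):
        self.enabled = False
        self.activations = 0  # how many times the rings were set up

    def write_queue_enable(self, val):
        # Only a write of exactly 1 activates the queue; every other
        # value is ignored, mirroring the "if (val == 1)" guard.
        if val == 1:
            self.activations += 1
            self.enabled = True


if __name__ == "__main__":
    queue = QueueModel()
    queue.write_queue_enable(0)
    print(queue.enabled, queue.activations)
    queue.write_queue_enable(1)
    print(queue.enabled, queue.activations)
```

Before the patch, the model's equivalent would have activated the queue on
any write, including 0.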

Re: [PATCH 00/13] i386: hvf: Remove HVFX86EmulatorState

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528193758.51454-1-r.bolsha...@yadro.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200528193758.51454-1-r.bolsha...@yadro.com
Subject: [PATCH 00/13] i386: hvf: Remove HVFX86EmulatorState
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
babfd75 i386: hvf: Drop HVFX86EmulatorState
57557be i386: hvf: Move mmio_buf into CPUX86State
53b9354 i386: hvf: Move lazy_flags into CPUX86State
ba6c821 i386: hvf: Drop regs in HVFX86EmulatorState
70b2839 i386: hvf: Drop copy of RFLAGS defines
aef7278 i386: hvf: Drop rflags from HVFX86EmulatorState
3aa57aa i386: hvf: Drop fetch_rip from HVFX86EmulatorState
44a94ed i386: hvf: Use IP from CPUX86State
ef6fe79 i386: hvf: Use ins_len to advance IP
ec88b12 i386: hvf: Drop unused variable
de8d999 i386: hvf: Clean stray includes in sysemu
ad061bc i386: hvf: Drop useless declarations in sysemu
0da6fba i386: hvf: Move HVFState definition into hvf

=== OUTPUT BEGIN ===
1/13 Checking commit 0da6fbafda5f (i386: hvf: Move HVFState definition into hvf)
2/13 Checking commit ad061bc7f025 (i386: hvf: Drop useless declarations in 
sysemu)
3/13 Checking commit de8d9997e911 (i386: hvf: Clean stray includes in sysemu)
4/13 Checking commit ec88b12c4ae7 (i386: hvf: Drop unused variable)
5/13 Checking commit ef6fe796978e (i386: hvf: Use ins_len to advance IP)
6/13 Checking commit 44a94ed21d06 (i386: hvf: Use IP from CPUX86State)
ERROR: unnecessary whitespace before a quoted newline
#444: FILE: target/i386/hvf/x86_emu.c:1470:
+printf("Unimplemented handler (%llx) for %d (%x %x) \n", env->eip,

total: 1 errors, 0 warnings, 403 lines checked

Patch 6/13 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

7/13 Checking commit 3aa57aab9271 (i386: hvf: Drop fetch_rip from 
HVFX86EmulatorState)
8/13 Checking commit aef72785da68 (i386: hvf: Drop rflags from 
HVFX86EmulatorState)
9/13 Checking commit 70b2839d8d2e (i386: hvf: Drop copy of RFLAGS defines)
10/13 Checking commit ba6c821f8a7e (i386: hvf: Drop regs in HVFX86EmulatorState)
11/13 Checking commit 53b93542 (i386: hvf: Move lazy_flags into CPUX86State)
12/13 Checking commit 57557be2d13c (i386: hvf: Move mmio_buf into CPUX86State)
13/13 Checking commit babfd7578724 (i386: hvf: Drop HVFX86EmulatorState)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200528193758.51454-1-r.bolsha...@yadro.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH RFC 26/32] python//machine.py: use qmp.command

2020-05-28 Thread John Snow



[...]

>  
> -def qmp(self, cmd, conv_keys=True, **args):
> -"""
> -Invoke a QMP command and return the response dict
> -"""
> +@classmethod
> +def _qmp_args(cls, _conv_keys: bool = True, **args: Any) -> Dict[str, 
> Any]:
>  qmp_args = dict()
>  for key, value in args.items():
> -if conv_keys:
> +if _conv_keys:
>  qmp_args[key.replace('_', '-')] = value
>  else:
>  qmp_args[key] = value
> +return qmp_args
>  
> +def qmp(self, cmd: str,
> +conv_keys: bool = True,
> +**args: Any) -> QMPMessage:

This creates an interesting problem with iotests 297:


-Success: no issues found in 1 source file
+iotests.py:563: error: Argument 2 to "qmp" of "QEMUMachine" has
incompatible type "**Dict[str, str]"; expected "bool"
+Found 1 error in 1 file (checked 1 source file)


def hmp(self, command_line: str, use_log: bool = False) -> QMPResponse:
cmd = 'human-monitor-command'
kwargs = {'command-line': command_line}
if use_log:
return self.qmp_log(cmd, **kwargs)
else:
return self.qmp(cmd, **kwargs)

It seems like mypy is unable to understand that we are passing keyword
arguments, and instead believes we're passing something to the conv_keys
parameter.

(Is this a bug...?)

Even amending the function signature to indicate that conv_keys should
only ever appear as a keyword argument doesn't seem to help.

I'll have to think about a nice way to fix this; removing conv_keys out
of the argument namespace seems like the best approach.

qmp(cmd, foo=bar, hello=world)
qmp(cmd, **conv_keys(foo=bar, hello=world))

...but now this function looks really annoying to call.

Uh, I'll play around with this, but let me know if you have any cool ideas.

--js

[Bug 1877418] Re: qemu-nbd freezes access to VDI file

2020-05-28 Thread Bump

I thought there was qemu-img for that. Since qemu-nbd allows mounting
images as rw block devices, it's logical to think that you can use it for
that purpose. I will try to reproduce the issue again in case it was a
kernel problem instead of qemu-nbd.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1877418

Title:
  qemu-nbd freezes access to VDI file

Status in QEMU:
  New
Status in btrfs-progs package in Ubuntu:
  New

Bug description:
  Mounted Oracle Virtualbox .vdi drive (dynamically allocated), which has
  GPT+BTRFS:
  sudo modprobe nbd max_part=16
  sudo qemu-nbd -c /dev/nbd0 /storage/btrfs.vdi
  mount /dev/nbd0p1 /mydata/

  Then I am operating on the btrfs filesystem and suddenly it freezes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1877418/+subscriptions

[Bug 1872790] Re: empty qcow2

2020-05-28 Thread John Snow

It sounds like maybe these disks have been partitioned in a format that
only Windows understands. Can you tell me what the windows disk manager
claims the partition table format to be?

If you still think that maybe there's a QEMU bug, please give more
details:

- host kernel version

- qemu version

- qemu command line

- how were these qcow2 files created?

- What version of qcow2 file does `qemu-img info` say they are?

- What version of windows? (10?)

- Can you name one of the third party disk managers so we can try to
reproduce it?


** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1872790

Title:
  empty qcow2

Status in QEMU:
  Incomplete

Bug description:
  I attached multiple qcow2 disks to a Windows guest. In the Windows disk
  manager all disks are listed correctly, with their data and their real
  size, and I can even browse all their files in Explorer.

  In third-party disk managers (all of them), only the C:\ HDD behaves
  normally; all the other attached qcow2 disks are shown as fully
  unallocated, so I can't manipulate them.

  I want to move some partitions and create others, but in the Windows disk
  manager I can't extend or create partitions, and in third-party tools I
  don't see the partitions at all.

  Even guestfs doesn't recognize any partition table `libguestfs: error:
  inspect_os: /dev/sda: not a partitioned device`

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1872790/+subscriptions

Re: [PATCH v3 0/3] account for NVDIMM nodes during SRAT generation

2020-05-28 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200528162011.16258-1-vishal.l.ve...@intel.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200528162011.16258-1-vishal.l.ve...@intel.com
Subject: [PATCH v3 0/3] account for NVDIMM nodes during SRAT generation
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
7e9fa62 tests/acpi: update expected SRAT files
e8c3427 hw/acpi-build: account for NVDIMM numa nodes in SRAT
7598dc9 diffs-allowed: add the SRAT AML to diffs-allowed

=== OUTPUT BEGIN ===
1/3 Checking commit 7598dc9bc984 (diffs-allowed: add the SRAT AML to 
diffs-allowed)
2/3 Checking commit e8c342740610 (hw/acpi-build: account for NVDIMM numa nodes 
in SRAT)
3/3 Checking commit 7e9fa62e9d26 (tests/acpi: update expected SRAT files)
ERROR: Do not add expected files together with tests, follow instructions in 
tests/qtest/bios-tables-test.c: both tests/data/acpi/q35/SRAT.dimmpxm and 
tests/qtest/bios-tables-test-allowed-diff.h found

ERROR: Do not add expected files together with tests, follow instructions in 
tests/qtest/bios-tables-test.c: both tests/data/acpi/q35/SRAT.dimmpxm and 
tests/qtest/bios-tables-test-allowed-diff.h found

total: 2 errors, 0 warnings, 1 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200528162011.16258-1-vishal.l.ve...@intel.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices

2020-05-28 Thread Alex Williamson

On Wed, 27 May 2020 09:48:22 +0100
"Dr. David Alan Gilbert"  wrote:
> * Yan Zhao (yan.y.z...@intel.com) wrote:
> > BTW, for viommu, the downtime data is as below. under the same network
> > condition and guest memory size, and no running dirty data/memory produced
> > by device.
> > (1) viommu off
> > single-round dirty query: downtime ~100ms   
> 
> Fine.
> 
> > (2) viommu on
> > single-round dirty query: downtime 58s   
> 
> Youch.

Double Youch!  But we believe this is because we're getting the dirty
bitmap one IOMMU leaf page at a time, right?  We've enabled the kernel
to get a dirty bitmap across multiple mappings, but QEMU isn't yet
taking advantage of it.  Do I have this correct?  Thanks,

Alex

[PATCH v8 8/8] block: lift blocksize property limit to 2 MiB

2020-05-28 Thread Roman Kagan

Logical and physical block sizes in QEMU are limited to 32 KiB.

This appears unnecessarily tight, and we've seen bigger block sizes
handy at times.

Lift the limitation up to 2 MiB which appears to be good enough for
everybody, and matches the qcow2 cluster size limit.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
 hw/core/qdev-properties.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 63d48db70c..ead35d7ffd 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -784,9 +784,12 @@ const PropertyInfo qdev_prop_size32 = {
 /* lower limit is sector size */
 #define MIN_BLOCK_SIZE  512
 #define MIN_BLOCK_SIZE_STR  "512 B"
-/* upper limit is the max power of 2 that fits in uint16_t */
-#define MAX_BLOCK_SIZE  (32 * KiB)
-#define MAX_BLOCK_SIZE_STR  "32 KiB"
+/*
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
+ * matches qcow2 cluster size limit
+ */
+#define MAX_BLOCK_SIZE  (2 * MiB)
+#define MAX_BLOCK_SIZE_STR  "2 MiB"
 
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
-- 
2.26.2
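
As a rough illustration of the limits described above, the range rule can
be written as a standalone check. This is a sketch under assumptions: the
TOY_ names and toy_blocksize_is_valid are hypothetical, and the
power-of-two requirement is taken from the property description ("A power
of two between ...").

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants mirroring the limits named in the patch. */
#define TOY_MIN_BLOCK_SIZE 512ULL
#define TOY_MAX_BLOCK_SIZE (2ULL * 1024 * 1024)   /* 2 MiB */

/*
 * A value of 0 means "unset" and is accepted. Any other value must lie
 * within [512 B, 2 MiB] and be a power of two.
 */
static bool toy_blocksize_is_valid(uint64_t value)
{
    if (value == 0) {
        return true;
    }
    if (value < TOY_MIN_BLOCK_SIZE || value > TOY_MAX_BLOCK_SIZE) {
        return false;
    }
    /* A power of two has exactly one bit set. */
    return (value & (value - 1)) == 0;
}
```

With the old 32 KiB ceiling a value such as 65536 would have been rejected;
under the 2 MiB ceiling modelled here it passes, while 4 MiB does not.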

[PATCH v8 6/8] block: make BlockConf size props 32bit and accept size suffixes

2020-05-28 Thread Roman Kagan

Convert all size-related properties in BlockConf to 32bit.  This will
accommodate bigger block sizes (in a followup patch).  This also makes it
possible for all of them to accept size suffixes, either via
DEFINE_PROP_BLOCKSIZE or via DEFINE_PROP_SIZE32.

Also, since min_io_size is exposed to the guest by scsi and virtio-blk
devices as a uint16_t in units of logical blocks, introduce an
additional check in blkconf_blocksizes to prevent its silent truncation.

Signed-off-by: Roman Kagan 
---
v7 -> v8:
- replace stringify with %u in the error message [Eric]
- fix wording in the log [Eric]

 include/hw/block/block.h | 12 ++--
 include/hw/qdev-properties.h |  2 +-
 hw/block/block.c | 10 ++
 hw/core/qdev-properties.c|  4 ++--
 4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 784953a237..1e8b6253dd 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -18,9 +18,9 @@
 
 typedef struct BlockConf {
 BlockBackend *blk;
-uint16_t physical_block_size;
-uint16_t logical_block_size;
-uint16_t min_io_size;
+uint32_t physical_block_size;
+uint32_t logical_block_size;
+uint32_t min_io_size;
 uint32_t opt_io_size;
 int32_t bootindex;
 uint32_t discard_granularity;
@@ -51,9 +51,9 @@ static inline unsigned int get_physical_block_exp(BlockConf 
*conf)
   _conf.logical_block_size),\
 DEFINE_PROP_BLOCKSIZE("physical_block_size", _state,\
   _conf.physical_block_size),   \
-DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),\
-DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),\
-DEFINE_PROP_UINT32("discard_granularity", _state,   \
+DEFINE_PROP_SIZE32("min_io_size", _state, _conf.min_io_size, 0),\
+DEFINE_PROP_SIZE32("opt_io_size", _state, _conf.opt_io_size, 0),\
+DEFINE_PROP_SIZE32("discard_granularity", _state,   \
_conf.discard_granularity, -1),  \
 DEFINE_PROP_ON_OFF_AUTO("write-cache", _state, _conf.wce,   \
 ON_OFF_AUTO_AUTO),  \
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index c03eadfad6..5252bb6b1a 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -200,7 +200,7 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
 #define DEFINE_PROP_SIZE32(_n, _s, _f, _d)   \
 DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size32, uint32_t)
 #define DEFINE_PROP_BLOCKSIZE(_n, _s, _f) \
-DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint16_t)
+DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint32_t)
 #define DEFINE_PROP_PCI_HOST_DEVADDR(_n, _s, _f) \
 DEFINE_PROP(_n, _s, _f, qdev_prop_pci_host_devaddr, PCIHostDeviceAddress)
 #define DEFINE_PROP_OFF_AUTO_PCIBAR(_n, _s, _f, _d) \
diff --git a/hw/block/block.c b/hw/block/block.c
index b22207c921..1e34573da7 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -96,6 +96,16 @@ bool blkconf_blocksizes(BlockConf *conf, Error **errp)
 return false;
 }
 
+/*
+ * all devices which support min_io_size (scsi and virtio-blk) expose it to
+ * the guest as a uint16_t in units of logical blocks
+ */
+if (conf->min_io_size / conf->logical_block_size > UINT16_MAX) {
+error_setg(errp, "min_io_size must not exceed %u logical blocks",
+   UINT16_MAX);
+return false;
+}
+
 if (!QEMU_IS_ALIGNED(conf->opt_io_size, conf->logical_block_size)) {
 error_setg(errp,
"opt_io_size must be a multiple of logical_block_size");
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index c9af6a1341..bd4abdc1d1 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -782,7 +782,7 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
 uint64_t value;
 Error *local_err = NULL;
 
@@ -821,7 +821,7 @@ const PropertyInfo qdev_prop_blocksize = {
 .name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
-.get   = get_uint16,
+.get   = get_uint32,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
 };
-- 
2.26.2
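
The min_io_size check added above can be tried out in isolation. The
following is a minimal sketch, assuming a nonzero logical block size;
toy_min_io_size_fits is a hypothetical name and not part of QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * min_io_size is given in bytes but is exposed to the guest as a 16-bit
 * count of logical blocks. Dividing (rather than multiplying UINT16_MAX by
 * the block size) keeps the comparison free of 32-bit overflow.
 * Assumes logical_block_size != 0.
 */
static bool toy_min_io_size_fits(uint32_t min_io_size,
                                 uint32_t logical_block_size)
{
    return min_io_size / logical_block_size <= UINT16_MAX;
}
```

For a 512-byte logical block, 65535 blocks (33553920 bytes) is the largest
min_io_size that passes; one block more is rejected.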

[PATCH v8 7/8] qdev-properties: add getter for size32 and blocksize

2020-05-28 Thread Roman Kagan

Add getter for size32, and use it for blocksize, too.

In its human-readable branch, it reports approximate size in
human-readable units next to the exact byte value, like the getter for
64bit size does.

Adjust the expected test output accordingly.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
 hw/core/qdev-properties.c  |  15 +-
 tests/qemu-iotests/172.out | 530 ++---
 2 files changed, 278 insertions(+), 267 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index bd4abdc1d1..63d48db70c 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -730,6 +730,17 @@ const PropertyInfo qdev_prop_pci_devfn = {
 
 /* --- 32bit unsigned int 'size' type --- */
 
+static void get_size32(Object *obj, Visitor *v, const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value = *ptr;
+
+visit_type_size(v, name, &value, errp);
+}
+
 static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
Error **errp)
 {
@@ -763,7 +774,7 @@ static void set_size32(Object *obj, Visitor *v, const char 
*name, void *opaque,
 
 const PropertyInfo qdev_prop_size32 = {
 .name  = "size",
-.get = get_uint32,
+.get = get_size32,
 .set = set_size32,
 .set_default_value = set_default_value_uint,
 };
@@ -821,7 +832,7 @@ const PropertyInfo qdev_prop_blocksize = {
 .name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
-.get   = get_uint32,
+.get   = get_size32,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
 };
diff --git a/tests/qemu-iotests/172.out b/tests/qemu-iotests/172.out
index 59cc70aebb..e782c5957e 100644
--- a/tests/qemu-iotests/172.out
+++ b/tests/qemu-iotests/172.out
@@ -24,11 +24,11 @@ Testing:
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "288"
@@ -54,11 +54,11 @@ Testing: -fda TEST_DIR/t.qcow2
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "144"
@@ -81,22 +81,22 @@ Testing: -fdb TEST_DIR/t.qcow2
   dev: floppy, id ""
 unit = 1 (0x1)
 drive = "floppy1"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "144"
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "288"
@@ -119,22 +119,22 @@ Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2.2
   dev: floppy, id ""
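
The new output format seen in the expected test output above, an exact byte
count followed by an approximate human-readable size, can be reproduced
with a small standalone helper. This is only a sketch: toy_format_size is a
hypothetical function and is not the routine QEMU itself uses for this.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a byte count as "<exact> (<approx> <unit>)", scaling by 1024 per
 * step and printing the scaled value with three significant digits.
 */
static void toy_format_size(uint64_t bytes, char *buf, size_t buflen)
{
    static const char *const units[] = { "B", "KiB", "MiB", "GiB", "TiB" };
    double scaled = (double)bytes;
    size_t i = 0;

    while (scaled >= 1024.0 && i + 1 < sizeof(units) / sizeof(units[0])) {
        scaled /= 1024.0;
        i++;
    }
    snprintf(buf, buflen, "%" PRIu64 " (%.3g %s)", bytes, scaled, units[i]);
}
```

Because the approximation is rounded to three significant digits,
4294967295 is shown as 4 GiB even though it is one byte short of that.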

[PATCH v8 4/8] qdev-properties: add size32 property type

2020-05-28 Thread Roman Kagan

Introduce size32 property type which handles size suffixes (k, m, g)
just like size property, but is uint32_t rather than uint64_t.  It's
going to be useful for properties that are byte sizes but are inherently
32bit, like BlkConf.opt_io_size or .discard_granularity (they are
switched to this new property type in a followup commit).

The getter for size32 is left out for a separate patch as its benefit is
less obvious, and it affects test output; for now the regular uint32
getter is used.

Signed-off-by: Roman Kagan 
---
v7 -> v8:
- replace stringify with %u in the error message [Eric]
- fix wording in the log [Eric]

 include/hw/qdev-properties.h |  3 +++
 hw/core/qdev-properties.c| 40 
 2 files changed, 43 insertions(+)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index f161604fb6..c03eadfad6 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -29,6 +29,7 @@ extern const PropertyInfo qdev_prop_drive;
 extern const PropertyInfo qdev_prop_drive_iothread;
 extern const PropertyInfo qdev_prop_netdev;
 extern const PropertyInfo qdev_prop_pci_devfn;
+extern const PropertyInfo qdev_prop_size32;
 extern const PropertyInfo qdev_prop_blocksize;
 extern const PropertyInfo qdev_prop_pci_host_devaddr;
 extern const PropertyInfo qdev_prop_uuid;
@@ -196,6 +197,8 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
 BlockdevOnError)
 #define DEFINE_PROP_BIOS_CHS_TRANS(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_bios_chs_trans, int)
+#define DEFINE_PROP_SIZE32(_n, _s, _f, _d)   \
+DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size32, uint32_t)
 #define DEFINE_PROP_BLOCKSIZE(_n, _s, _f) \
 DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint16_t)
 #define DEFINE_PROP_PCI_HOST_DEVADDR(_n, _s, _f) \
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 249dc69bd8..40c13f6ebe 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -727,6 +727,46 @@ const PropertyInfo qdev_prop_pci_devfn = {
 .set_default_value = set_default_value_int,
 };
 
+/* --- 32bit unsigned int 'size' type --- */
+
+static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value;
+Error *local_err = NULL;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_size(v, name, &value, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+if (value > UINT32_MAX) {
+error_setg(errp,
+   "Property %s.%s doesn't take value %" PRIu64
+   " (maximum: %u)",
+   dev->id ? : "", name, value, UINT32_MAX);
+return;
+}
+
+*ptr = value;
+}
+
+const PropertyInfo qdev_prop_size32 = {
+.name  = "size",
+.get = get_uint32,
+.set = set_size32,
+.set_default_value = set_default_value_uint,
+};
+
 /* --- blocksize --- */
 
 /* lower limit is sector size */
-- 
2.26.2
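
To make the size32 idea concrete, here is a deliberately simplified
standalone parser. It is a sketch under assumptions: toy_parse_size32 is a
hypothetical helper, it accepts only a decimal integer with an optional
k/m/g suffix (taken as binary multiples), and it is much stricter than the
visitor-based parsing the patch relies on.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "<digits>[k|m|g]" into a 32-bit byte count.
 * Returns false on malformed input or when the scaled value would exceed
 * UINT32_MAX; *out is only written on success.
 */
static bool toy_parse_size32(const char *str, uint32_t *out)
{
    char *end;
    unsigned long long value;
    unsigned shift = 0;

    /* Reject empty strings, signs and leading whitespace up front. */
    if (!isdigit((unsigned char)str[0])) {
        return false;
    }

    errno = 0;
    value = strtoull(str, &end, 10);
    if (errno != 0) {
        return false;
    }

    switch (tolower((unsigned char)*end)) {
    case 'k': shift = 10; end++; break;
    case 'm': shift = 20; end++; break;
    case 'g': shift = 30; end++; break;
    default: break;
    }
    if (*end != '\0') {
        return false;               /* trailing garbage */
    }

    /* Range check before shifting, so the scaled value fits in 32 bits. */
    if (value > (UINT32_MAX >> shift)) {
        return false;
    }
    *out = (uint32_t)(value << shift);
    return true;
}
```

In this model "4k" yields 4096 and "3g" is accepted, whereas "4g" and
"4294967296" are rejected because they do not fit in a uint32_t.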

[PATCH v8 5/8] qdev-properties: make blocksize accept size suffixes

2020-05-28 Thread Roman Kagan

It appears convenient to be able to specify physical_block_size and
logical_block_size using common size suffixes.

Teach the blocksize property setter to interpret them.  Also express the
upper and lower limits in the respective units.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
 hw/core/qdev-properties.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 40c13f6ebe..c9af6a1341 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -14,6 +14,7 @@
 #include "qapi/visitor.h"
 #include "chardev/char.h"
 #include "qemu/uuid.h"
+#include "qemu/units.h"
 
 void qdev_prop_set_after_realize(DeviceState *dev, const char *name,
   Error **errp)
@@ -771,17 +772,18 @@ const PropertyInfo qdev_prop_size32 = {
 
 /* lower limit is sector size */
 #define MIN_BLOCK_SIZE  512
-#define MIN_BLOCK_SIZE_STR  stringify(MIN_BLOCK_SIZE)
+#define MIN_BLOCK_SIZE_STR  "512 B"
 /* upper limit is the max power of 2 that fits in uint16_t */
-#define MAX_BLOCK_SIZE  32768
-#define MAX_BLOCK_SIZE_STR  stringify(MAX_BLOCK_SIZE)
+#define MAX_BLOCK_SIZE  (32 * KiB)
+#define MAX_BLOCK_SIZE_STR  "32 KiB"
 
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-uint16_t value, *ptr = qdev_get_prop_ptr(dev, prop);
+uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value;
 Error *local_err = NULL;
 
 if (dev->realized) {
@@ -789,7 +791,7 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 return;
 }
 
-visit_type_uint16(v, name, &value, &local_err);
+visit_type_size(v, name, &value, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -797,7 +799,7 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 /* value of 0 means "unset" */
 if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
 error_setg(errp,
-   "Property %s.%s doesn't take value %" PRIu16
+   "Property %s.%s doesn't take value %" PRIu64
" (minimum: " MIN_BLOCK_SIZE_STR
", maximum: " MAX_BLOCK_SIZE_STR ")",
dev->id ? : "", name, value);
@@ -816,7 +818,7 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 }
 
 const PropertyInfo qdev_prop_blocksize = {
-.name  = "uint16",
+.name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
 .get   = get_uint16,
-- 
2.26.2

[PATCH v8 2/8] block: consolidate blocksize properties consistency checks

2020-05-28 Thread Roman Kagan

Several block device properties related to blocksize configuration must
satisfy certain relationships with respect to each other: the physical
block must be no smaller than the logical block; min_io_size, opt_io_size,
and discard_granularity must be multiples of the logical block size.

To ensure these requirements are met, add corresponding consistency
checks to blkconf_blocksizes, adjusting its signature to report a
possible error to the caller.  Also remove the now redundant consistency
checks from the specific devices.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
Reviewed-by: Paul Durrant 
---
 include/hw/block/block.h   |  2 +-
 hw/block/block.c   | 30 +-
 hw/block/fdc.c |  5 -
 hw/block/nvme.c|  5 -
 hw/block/swim.c|  5 -
 hw/block/virtio-blk.c  |  7 +--
 hw/block/xen-block.c   |  6 +-
 hw/ide/qdev.c  |  5 -
 hw/scsi/scsi-disk.c| 12 +---
 hw/usb/dev-storage.c   |  5 -
 tests/qemu-iotests/172.out |  2 +-
 11 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index d7246f3862..784953a237 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -87,7 +87,7 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void 
*buf, hwaddr size,
 bool blkconf_geometry(BlockConf *conf, int *trans,
   unsigned cyls_max, unsigned heads_max, unsigned secs_max,
   Error **errp);
-void blkconf_blocksizes(BlockConf *conf);
+bool blkconf_blocksizes(BlockConf *conf, Error **errp);
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
bool resizable, Error **errp);
 
diff --git a/hw/block/block.c b/hw/block/block.c
index bf56c7612b..b22207c921 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -61,7 +61,7 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void 
*buf, hwaddr size,
 return true;
 }
 
-void blkconf_blocksizes(BlockConf *conf)
+bool blkconf_blocksizes(BlockConf *conf, Error **errp)
 {
 BlockBackend *blk = conf->blk;
 BlockSizes blocksizes;
@@ -83,6 +83,34 @@ void blkconf_blocksizes(BlockConf *conf)
 conf->logical_block_size = BDRV_SECTOR_SIZE;
 }
 }
+
+if (conf->logical_block_size > conf->physical_block_size) {
+error_setg(errp,
+   "logical_block_size > physical_block_size not supported");
+return false;
+}
+
+if (!QEMU_IS_ALIGNED(conf->min_io_size, conf->logical_block_size)) {
+error_setg(errp,
+   "min_io_size must be a multiple of logical_block_size");
+return false;
+}
+
+if (!QEMU_IS_ALIGNED(conf->opt_io_size, conf->logical_block_size)) {
+error_setg(errp,
+   "opt_io_size must be a multiple of logical_block_size");
+return false;
+}
+
+if (conf->discard_granularity != -1 &&
+!QEMU_IS_ALIGNED(conf->discard_granularity,
+ conf->logical_block_size)) {
+error_setg(errp, "discard_granularity must be "
+   "a multiple of logical_block_size");
+return false;
+}
+
+return true;
 }
 
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index c5fb9d6ece..8eda572ef4 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -554,7 +554,10 @@ static void floppy_drive_realize(DeviceState *qdev, Error 
**errp)
 read_only = !blk_bs(dev->conf.blk) || blk_is_read_only(dev->conf.blk);
 }
 
-blkconf_blocksizes(&dev->conf);
+if (!blkconf_blocksizes(&dev->conf, errp)) {
+return;
+}
+
 if (dev->conf.logical_block_size != 512 ||
 dev->conf.physical_block_size != 512)
 {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 2f3100e56c..672650e162 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1390,7 +1390,10 @@ static void nvme_realize(PCIDevice *pci_dev, Error 
**errp)
 host_memory_backend_set_mapped(n->pmrdev, true);
 }
 
-blkconf_blocksizes(&n->conf);
+if (!blkconf_blocksizes(&n->conf, errp)) {
+return;
+}
+
 if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
false, errp)) {
 return;
diff --git a/hw/block/swim.c b/hw/block/swim.c
index 8f124782f4..74f56e8f46 100644
--- a/hw/block/swim.c
+++ b/hw/block/swim.c
@@ -189,7 +189,10 @@ static void swim_drive_realize(DeviceState *qdev, Error 
**errp)
 assert(ret == 0);
 }
 
-blkconf_blocksizes(&dev->conf);
+if (!blkconf_blocksizes(&dev->conf, errp)) {
+return;
+}
+
 if (dev->conf.logical_block_size != 512 ||
 dev->conf.physical_block_size != 512)
 {
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 413083e62f..4ffdb130be 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1162,12 +1162,7 @@ static
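
The consistency rules from the commit message above can be exercised with a
standalone model. This is a sketch, not the QEMU implementation:
ToyBlockConf, TOY_DISCARD_UNSET and toy_check_blocksizes are hypothetical,
the function returns a message instead of filling an Error object, and it
assumes logical_block_size is nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the size fields of BlockConf. */
typedef struct ToyBlockConf {
    uint32_t physical_block_size;
    uint32_t logical_block_size;
    uint32_t min_io_size;
    uint32_t opt_io_size;
    uint32_t discard_granularity;
} ToyBlockConf;

/* Models the -1 default that marks discard_granularity as unset. */
#define TOY_DISCARD_UNSET UINT32_MAX

/* Returns NULL when the configuration is consistent, else a message. */
static const char *toy_check_blocksizes(const ToyBlockConf *c)
{
    if (c->logical_block_size > c->physical_block_size) {
        return "logical_block_size > physical_block_size not supported";
    }
    if (c->min_io_size % c->logical_block_size != 0) {
        return "min_io_size must be a multiple of logical_block_size";
    }
    if (c->opt_io_size % c->logical_block_size != 0) {
        return "opt_io_size must be a multiple of logical_block_size";
    }
    if (c->discard_granularity != TOY_DISCARD_UNSET &&
        c->discard_granularity % c->logical_block_size != 0) {
        return "discard_granularity must be a multiple of logical_block_size";
    }
    return NULL;
}
```

The checks return on the first violation, which matches the early-return
style of the patch.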

[PATCH v8 1/8] virtio-blk: store opt_io_size with correct size

2020-05-28 Thread Roman Kagan

The width of opt_io_size in virtio_blk_config is 32bit.  However, it's
written with virtio_stw_p; this may result in value truncation, and on
big-endian systems with legacy virtio in completely bogus readings in
the guest.

Use the appropriate accessor to store it.

Signed-off-by: Roman Kagan 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kevin Wolf 
---
 hw/block/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f5f6fc925e..413083e62f 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -918,7 +918,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, 
uint8_t *config)
 virtio_stw_p(vdev, &blkcfg.geometry.cylinders, conf->cyls);
 virtio_stl_p(vdev, &blkcfg.blk_size, blk_size);
 virtio_stw_p(vdev, &blkcfg.min_io_size, conf->min_io_size / blk_size);
-virtio_stw_p(vdev, &blkcfg.opt_io_size, conf->opt_io_size / blk_size);
+virtio_stl_p(vdev, &blkcfg.opt_io_size, conf->opt_io_size / blk_size);
 blkcfg.geometry.heads = conf->heads;
 /*
  * We must ensure that the block device capacity is a multiple of
-- 
2.26.2
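
Why a 16-bit store into a 32-bit field misbehaves can be shown without any
virtio code. The sketch below assumes a big-endian guest layout and uses
hypothetical helpers (toy_store_be16, toy_store_be32, toy_load_be32) that
write bytes explicitly, so the result does not depend on the host.

```c
#include <stdint.h>

/* Store a 16-bit value in big-endian byte order at p[0..1]. */
static void toy_store_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Store a 32-bit value in big-endian byte order at p[0..3]. */
static void toy_store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v & 0xff);
}

/* Read back the full 32-bit big-endian field, as the guest would. */
static uint32_t toy_load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Storing the value 8 with the 16-bit helper into a zeroed 4-byte field and
reading it back as 32 bits yields 0x00080000 rather than 8, because the two
bytes land in the most significant half. Independently of byte order, the
16-bit store also drops any bits above bit 15.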

[PATCH v8 3/8] qdev-properties: blocksize: use same limits in code and description

2020-05-28 Thread Roman Kagan

Make it easier (more visible) to maintain the limits on the blocksize
properties in sync with the respective description, by using macros both
in the code and in the description.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
 hw/core/qdev-properties.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index cc924815da..249dc69bd8 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -729,6 +729,13 @@ const PropertyInfo qdev_prop_pci_devfn = {
 
 /* --- blocksize --- */
 
+/* lower limit is sector size */
+#define MIN_BLOCK_SIZE  512
+#define MIN_BLOCK_SIZE_STR  stringify(MIN_BLOCK_SIZE)
+/* upper limit is the max power of 2 that fits in uint16_t */
+#define MAX_BLOCK_SIZE  32768
+#define MAX_BLOCK_SIZE_STR  stringify(MAX_BLOCK_SIZE)
+
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -736,8 +743,6 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 Property *prop = opaque;
 uint16_t value, *ptr = qdev_get_prop_ptr(dev, prop);
 Error *local_err = NULL;
-const int64_t min = 512;
-const int64_t max = 32768;
 
 if (dev->realized) {
 qdev_prop_set_after_realize(dev, name, errp);
@@ -750,9 +755,12 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 return;
 }
 /* value of 0 means "unset" */
-if (value && (value < min || value > max)) {
-error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
-   dev->id ? : "", name, (int64_t)value, min, max);
+if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
+error_setg(errp,
+   "Property %s.%s doesn't take value %" PRIu16
+   " (minimum: " MIN_BLOCK_SIZE_STR
+   ", maximum: " MAX_BLOCK_SIZE_STR ")",
+   dev->id ? : "", name, value);
 return;
 }
 
@@ -769,7 +777,8 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 
 const PropertyInfo qdev_prop_blocksize = {
 .name  = "uint16",
-.description = "A power of two between 512 and 32768",
+.description = "A power of two between " MIN_BLOCK_SIZE_STR
+   " and " MAX_BLOCK_SIZE_STR,
 .get   = get_uint16,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
-- 
2.26.2
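
The patch above relies on compile-time concatenation of adjacent string literals: the numeric limit is turned into a string by a two-level stringify macro, so one macro feeds both the range check and the description. A minimal standalone sketch of that pattern follows. The names `EXAMPLE_STRINGIFY`, `example_blocksize_desc` and `example_blocksize_in_range()` are illustrative only and are not QEMU's; QEMU provides its own `stringify()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two-level expansion so the macro argument is expanded before '#'. */
#define EXAMPLE_STRINGIFY_(x) #x
#define EXAMPLE_STRINGIFY(x)  EXAMPLE_STRINGIFY_(x)

#define MIN_BLOCK_SIZE      512
#define MIN_BLOCK_SIZE_STR  EXAMPLE_STRINGIFY(MIN_BLOCK_SIZE)
#define MAX_BLOCK_SIZE      32768
#define MAX_BLOCK_SIZE_STR  EXAMPLE_STRINGIFY(MAX_BLOCK_SIZE)

/* Built at compile time from the same macros used by the check below. */
static const char example_blocksize_desc[] =
    "A power of two between " MIN_BLOCK_SIZE_STR " and " MAX_BLOCK_SIZE_STR;

/* 0 means "unset"; any other value must lie within the limits. */
static bool example_blocksize_in_range(uint16_t value)
{
    return value == 0 ||
           (value >= MIN_BLOCK_SIZE && value <= MAX_BLOCK_SIZE);
}
```

This works cleanly only while the limit macros are plain decimal literals. The sketch checks the range only, not the power-of-two requirement.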

[PATCH v8 0/8] block: enhance handling of size-related BlockConf properties

2020-05-28 Thread Roman Kagan

BlockConf includes several properties counted in bytes.

Enhance their handling in some aspects, specifically

- accept common size suffixes (k, m)
- perform consistency checks on the values
- lift the upper limit on physical_block_size and logical_block_size

Also fix the accessor for opt_io_size in virtio-blk to make it consistent with
the size of the field.

History:
v7 -> v8:
- replace stringify with %u in error messages [Eric]
- fix wording in logs [Eric]

v6 -> v7:
- avoid overflow in min_io_size check [Eric]
- try again to perform the art form in patch splitting [Eric]

v5 -> v6:
- fix forgotten xen-block and swim
- add prop_size32 instead of going with 64bit

v4 -> v5:
- re-split the patches [Philippe]
- fix/reword error messages [Philippe, Kevin]
- do early return on failed consistency check [Philippe]
- use QEMU_IS_ALIGNED instead of open coding [Philippe]
- make all BlockConf size props support suffixes
- expand the log for virtio-blk opt_io_size [Michael]

v3 -> v4:
- add patch to fix opt_io_size width in virtio-blk
- add patch to perform consistency checks [Kevin]
- check min_io_size against truncation [Kevin]

v2 -> v3:
- mention qcow2 cluster size limit in the log and comment [Eric]

v1 -> v2:
- cap the property at 2 MiB [Eric]
- accept size suffixes

Roman Kagan (8):
  virtio-blk: store opt_io_size with correct size
  block: consolidate blocksize properties consistency checks
  qdev-properties: blocksize: use same limits in code and description
  qdev-properties: add size32 property type
  qdev-properties: make blocksize accept size suffixes
  block: make BlockConf size props 32bit and accept size suffixes
  qdev-properties: add getter for size32 and blocksize
  block: lift blocksize property limit to 2 MiB

 include/hw/block/block.h |  14 +-
 include/hw/qdev-properties.h |   5 +-
 hw/block/block.c |  40 ++-
 hw/block/fdc.c   |   5 +-
 hw/block/nvme.c  |   5 +-
 hw/block/swim.c  |   5 +-
 hw/block/virtio-blk.c|   9 +-
 hw/block/xen-block.c |   6 +-
 hw/core/qdev-properties.c|  85 +-
 hw/ide/qdev.c|   5 +-
 hw/scsi/scsi-disk.c  |  12 +-
 hw/usb/dev-storage.c |   5 +-
 tests/qemu-iotests/172.out   | 532 +--
 13 files changed, 419 insertions(+), 309 deletions(-)

-- 
2.26.2
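
To illustrate the goals listed in the cover letter (size suffixes, a 32-bit value, consistency checks and a 2 MiB upper limit), here is a minimal sketch. It is not the code from this series: the function `example_parse_blocksize()`, the `EXAMPLE_` limits and the exact policy (binary k/m suffixes only, case-insensitive, power-of-two check) are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_MIN_BLOCK_SIZE 512u
#define EXAMPLE_MAX_BLOCK_SIZE (2u * 1024 * 1024)   /* 2 MiB */

/*
 * Parse strings such as "512", "4k" or "2M" into a 32-bit block size.
 * Returns true on success and stores the result in *out.
 */
static bool example_parse_blocksize(const char *str, uint32_t *out)
{
    char *end;
    unsigned long long val;

    if (str[0] < '0' || str[0] > '9') {
        return false;           /* reject empty input, signs and whitespace */
    }

    errno = 0;
    val = strtoull(str, &end, 10);
    if (errno != 0) {
        return false;
    }

    /* Bound the number before scaling so the shift cannot overflow. */
    if (val > EXAMPLE_MAX_BLOCK_SIZE) {
        return false;
    }

    switch (*end) {
    case 'k': case 'K':
        val <<= 10;
        end++;
        break;
    case 'm': case 'M':
        val <<= 20;
        end++;
        break;
    default:
        break;
    }
    if (*end != '\0') {
        return false;           /* trailing garbage after the suffix */
    }

    if (val < EXAMPLE_MIN_BLOCK_SIZE || val > EXAMPLE_MAX_BLOCK_SIZE) {
        return false;
    }
    if ((val & (val - 1)) != 0) {
        return false;           /* not a power of two */
    }

    *out = (uint32_t)val;
    return true;
}
```

With this policy "4k" yields 4096 and "2M" yields 2097152, while "3k", "4M" and "4kb" are rejected.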

Re: [PATCH Kernel v22 0/8] Add UAPIs to support migration for VFIO devices

2020-05-28 Thread Alex Williamson

On Thu, 28 May 2020 04:01:02 -0400
Yan Zhao  wrote:

> > > > This is my understanding of the protocol as well, when the device is
> > > > running, pending_bytes might drop to zero if no internal state has
> > > > changed and may be non-zero on the next iteration due to device
> > > > activity.  When the device is not running, pending_bytes reporting zero
> > > > indicates the device is done, there is no further state to transmit.
> > > > Does that meet your need/expectation?
> > > >  
> > > (1) on one side, as in vfio_save_pending(),
> > > vfio_save_pending()
> > > {
> > > ...
> > > ret = vfio_update_pending(vbasedev);
> > > ...
> > > *res_precopy_only += migration->pending_bytes;
> > > ...
> > > }
> > > the pending_bytes tells migration thread how much data is still held on
> > > the device side.
> > > the device data includes
> > > device internal data + running device dirty data + device state.
> > > 
> > > so the pending_bytes should include device state as well, right?
> > > if so, the pending_bytes should never reach 0 if there's any device
> > > state to be sent after device is stopped.  
> > 
> > I hadn't expected the pending-bytes to include a fixed offset for device
> > state (If you mean a few registers etc) - I'd expect pending to drop
> > possibly to zero;  the heuristic as to when to switch from iteration to
> > stop, is based on the total pending across all iterated devices; so it's
> > got to be allowed to drop otherwise you'll never transition to stop.
> >   
> ok. got it.

Yeah, as I understand it, a device is not required to participate in
reporting data available while (_SAVING | _RUNNING); there will always
be an iteration while the device is !_RUNNING.  Therefore if you have
fixed device state that you're always going to send, it should only be
sent once when called during !_RUNNING.  The iterative phase should be
used where you have a good chance to avoid re-sending data at the
stop-and-copy phase.  Thanks,

Alex
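
The rule described above can be sketched as a pair of small predicates. The flag names and bit values below are local stand-ins chosen for this example rather than definitions taken from the proposed UAPI header, and both functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the device state bits discussed above. */
#define EXAMPLE_STATE_RUNNING (1u << 0)
#define EXAMPLE_STATE_SAVING  (1u << 1)

/*
 * Fixed device state that is always transmitted should go out only once,
 * in the stop-and-copy phase: saving is requested and the device is stopped.
 */
static bool example_should_send_fixed_state(uint32_t device_state)
{
    return (device_state & EXAMPLE_STATE_SAVING) &&
           !(device_state & EXAMPLE_STATE_RUNNING);
}

/*
 * During pre-copy (_SAVING | _RUNNING) a device may legitimately report
 * zero pending bytes; only a zero report while stopped means "done".
 */
static bool example_device_done(uint32_t device_state, uint64_t pending_bytes)
{
    return example_should_send_fixed_state(device_state) &&
           pending_bytes == 0;
}
```

The point of splitting the two checks is that a zero pending_bytes value alone says nothing about completion while the device is still running.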

[PATCH v4 0/3] account for NVDIMM nodes during SRAT generation

2020-05-28 Thread Vishal Verma

Changes since v3:
- Add the SRAT augmentation for ARM's virt-acpi-build as well (Igor)
- Update patches 1 and 3 for the test binaries to include ARM tests.

Changes since v2:
- Change a repetitive OBJECT(dev) to a stored 'Object' (Igor)
- No need to return 'numamem' back to build_srat (Igor)

Changes since v1:
- Use error_abort for getters (Igor)
- Free the device list (Igor)
- Refactor the NVDIMM related portion into hw/acpi/nvdimm.c (Igor)
- Rebase onto latest master
- Add Jingqi's Reviewed-by

On the command line, one can specify a NUMA node for NVDIMM devices. If
we set up the topology to give NVDIMMs their own nodes, i.e. not
containing any CPUs or regular memory, qemu doesn't populate SRAT memory
affinity structures for these nodes. However the NFIT does reference
those proximity domains.

As a result, Linux, while parsing the SRAT, fails to initialize node
related structures for these nodes, and they never end up in the
nodes_possible map. When these are onlined at a later point (via
hotplug), this causes problems.

I've followed the instructions in bios-tables-test.c to update the
expected SRAT binary, and the tests (make check) pass. Patches 1 and 3
are the relevant ones for the binary update.

Patch 2 is the main patch which changes SRAT generation.


Vishal Verma (3):
  diffs-allowed: add the SRAT AML to diffs-allowed
  hw/acpi/nvdimm: add a helper to augment SRAT generation
  tests/acpi: update expected SRAT files

 hw/acpi/nvdimm.c |  23 +++
 hw/arm/virt-acpi-build.c |   4 
 hw/i386/acpi-build.c |   5 +
 include/hw/mem/nvdimm.h  |   1 +
 tests/data/acpi/pc/SRAT.dimmpxm  | Bin 392 -> 392 bytes
 tests/data/acpi/q35/SRAT.dimmpxm | Bin 392 -> 392 bytes
 tests/data/acpi/virt/SRAT.memhp  | Bin 186 -> 226 bytes
 7 files changed, 33 insertions(+)

-- 
2.26.2

[PATCH v4 1/3] diffs-allowed: add the SRAT AML to diffs-allowed

2020-05-28 Thread Vishal Verma

In anticipation of a change to the SRAT generation in qemu, add the AML
file to diffs-allowed.

Signed-off-by: Vishal Verma 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..e8f2766a63 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/SRAT.dimmpxm",
+"tests/data/acpi/q35/SRAT.dimmpxm",
+"tests/data/acpi/virt/SRAT.memhp",
-- 
2.26.2

Re: [PATCH v7 4/8] qdev-properties: add size32 property type

2020-05-28 Thread Roman Kagan

On Thu, May 28, 2020 at 04:45:19PM -0500, Eric Blake wrote:
> On 5/28/20 4:39 PM, Roman Kagan wrote:
> > Introduce size32 property type which handles size suffixes (k, m) just
> > like size property, but is uint32_t rather than uint64_t.
> 
> Does it handle 'g' as well? (even though the set of valid 32-bit sizes with
> a g suffix is rather small ;)
> 
> >  It's going to
> > be useful for properties that are byte sizes but are inherently 32bit,
> > like BlkConf.opt_io_size or .discard_granularity (they are switched to
> > this new property type in a followup commit).
> > 
> > The getter for size32 is left out for a separate patch as its benefit is
> > less obvious, and it affects test output; for now the regular uint32
> > getter is used.
> > 
> > Signed-off-by: Roman Kagan 
> > ---
> > 
> 
> > +static void set_size32(Object *obj, Visitor *v, const char *name, void 
> > *opaque,
> > +   Error **errp)
> > +{
> > +DeviceState *dev = DEVICE(obj);
> > +Property *prop = opaque;
> > +uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
> > +uint64_t value;
> > +Error *local_err = NULL;
> > +
> > +if (dev->realized) {
> > +qdev_prop_set_after_realize(dev, name, errp);
> > +return;
> > +}
> > +
> > +visit_type_size(v, name, &value, &local_err);
> 
> Yes, it does.
> 
> Whether or not the commit message is tweaked,
> Reviewed-by: Eric Blake 

I did this stupid stringify(UINT32_MAX) here too.  It's even uglier
here, with a 'U' appended to the number in the brackets, but somehow it
didn't catch my eye while testing.

So I'll fix this too in the respin, and drop the r-b.

Thanks,
Roman.
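
The alternative to stringifying the limit is to pass it as a format argument. A minimal sketch, assuming a hypothetical helper `example_format_size32_error()` and plain `snprintf()` standing in for QEMU's `error_setg()`:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format an out-of-range message with the limit passed as an argument,
 * so the text does not depend on how the C library spells UINT32_MAX.
 */
static int example_format_size32_error(char *buf, size_t len,
                                       const char *dev, const char *prop,
                                       uint64_t value)
{
    return snprintf(buf, len,
                    "Property %s.%s doesn't take value %" PRIu64
                    " (maximum: %" PRIu32 ")",
                    dev, prop, value, (uint32_t)UINT32_MAX);
}
```

The message wording here follows the blocksize error in patch 3/8; the device and property names passed in are up to the caller.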

Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-28 Thread Alex Williamson

On Wed, 27 May 2020 12:53:30 -0400
Peter Xu  wrote:

> On Wed, May 27, 2020 at 06:27:38PM +0200, Philippe Mathieu-Daudé wrote:
> > On 5/27/20 6:16 PM, Peter Xu wrote:  
> > > On Wed, May 27, 2020 at 05:53:16PM +0200, Philippe Mathieu-Daudé wrote:  
> > > +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
> > > +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, 
> > > iommu[i].type)) {
> > > +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name); 
> > >  
> >  Just wondering why you want to trace the type as you now have the name
> >  string.  
> > >>>
> > >>> You are right :)
> > >>>  
> > > +return iommu[i].type;
> > >  }
> > >  }
> > > +trace_vfio_get_iommu_type(-1, "Not available or not supported"); 
> > >  
> >  nit: from a debugging pov, this may be not needed as
> >  vfio_get_group/vfio_connect_container() fails and this leads to an 
> >  error
> >  output.  
> > >>
> > >> But you can reach this for example using No-IOMMU. If you don't mind, I
> > >> find having this information in the trace log clearer.  
> > > 
> > > I kinda agree with Eric - AFAICT QEMU vfio-pci doesn't work with
> > > no-iommu, then it seems meaningless to trace it...
> > > 
> > > I'm not sure whether this trace is extremely helpful because syscalls
> > > like this could be easily traced by things like strace or bpftrace as
> > > general tools (and this information should be a one-time thing rather
> > > than dynamically changing), no strong opinion though.  Also, if we want
> > > to dump something, maybe it's better to do in vfio_init_container()
> > > after vfio_get_iommu_type() succeeded, so we dump which container is
> > > enabled with what type of iommu.  
> > 
> > OK. I'm a recent VFIO user so maybe I am not using the good information.
> > 
> > This trace helps me while working on a new device feature; I didn't
> > think about gathering it in production because there I'd expect
> > things to work.
> > 
> > Now in my case what I want to know is whether I'm using a v1 or v2 type.
> > Maybe this information is already available in /proc or /sys and we
> > don't need this patch...  

You're using v2 unless you're on a very old kernel.

> I don't know of any such /proc or /sys entry, so maybe it's still useful. I
> guess Alex would have the best judgement. The strace/bpftrace things are not
> really reasons I found to nak this patch, but just something I thought of
> first that could be easier when any of us wants to peek at that information,
> probably something just FYI. :-)

I appreciate good trace code, but I don't appreciate code bloat for the
sake of tracing, which is what I'd consider the name fields here.  Do
it in the trace-event or require that the user cross-reference
the header to turn the integer type into a name themselves.  Thanks,

Alex
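
One way to keep name strings out of the lookup table is to translate the integer only at the point where it is printed. The sketch below assumes the numbers match the `VFIO_*_IOMMU` constants in `<linux/vfio.h>`; they are redefined locally under `EXAMPLE_` names so the block compiles without kernel headers, and should be checked against the header before use. The function `example_iommu_type_name()` is hypothetical.

```c
/* Local copies; assumed to mirror the VFIO_*_IOMMU values in <linux/vfio.h>. */
enum {
    EXAMPLE_TYPE1_IOMMU        = 1,
    EXAMPLE_SPAPR_TCE_IOMMU    = 2,
    EXAMPLE_TYPE1v2_IOMMU      = 3,
    EXAMPLE_SPAPR_TCE_v2_IOMMU = 7,
};

/* Translate an IOMMU type number into a printable name for a trace line. */
static const char *example_iommu_type_name(int type)
{
    switch (type) {
    case EXAMPLE_TYPE1_IOMMU:
        return "Type1";
    case EXAMPLE_TYPE1v2_IOMMU:
        return "Type1 (v2)";
    case EXAMPLE_SPAPR_TCE_IOMMU:
        return "sPAPR TCE";
    case EXAMPLE_SPAPR_TCE_v2_IOMMU:
        return "sPAPR TCE (v2)";
    default:
        return "unknown";
    }
}
```

The probing array then only needs to hold the type numbers, and the name lookup happens once, where the result is reported.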

[PATCH v4 3/3] tests/acpi: update expected SRAT files

2020-05-28 Thread Vishal Verma

Update expected SRAT files for the change to account for NVDIMM NUMA
nodes in the SRAT.

AML diffs:

tests/data/acpi/pc/SRAT.dimmpxm:
--- /tmp/asl-3P2IL0.dsl 2020-05-28 15:11:02.326439263 -0600
+++ /tmp/asl-1N4IL0.dsl 2020-05-28 15:11:02.325439280 -0600
@@ -3,7 +3,7 @@
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
- * Disassembly of tests/data/acpi/pc/SRAT.dimmpxm, Thu May 28 15:11:02 2020
+ * Disassembly of /tmp/aml-4D4IL0, Thu May 28 15:11:02 2020
  *
  * ACPI Data Table [SRAT]
  *
@@ -13,7 +13,7 @@
 [000h    4]Signature : "SRAT"[System Resource 
Affinity Table]
 [004h 0004   4] Table Length : 0188
 [008h 0008   1] Revision : 01
-[009h 0009   1] Checksum : 80
+[009h 0009   1] Checksum : 68
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPCSRAT"
 [018h 0024   4] Oem Revision : 0001
@@ -140,15 +140,15 @@
 [138h 0312   1]Subtable Type : 01 [Memory Affinity]
 [139h 0313   1]   Length : 28

-[13Ah 0314   4] Proximity Domain : 
+[13Ah 0314   4] Proximity Domain : 0002
 [13Eh 0318   2]Reserved1 : 
-[140h 0320   8] Base Address : 
-[148h 0328   8]   Address Length : 
+[140h 0320   8] Base Address : 00010800
+[148h 0328   8]   Address Length : 0800
 [150h 0336   4]Reserved2 : 
-[154h 0340   4]Flags (decoded below) : 
- Enabled : 0
+[154h 0340   4]Flags (decoded below) : 0005
+ Enabled : 1
Hot Pluggable : 0
-Non-Volatile : 0
+Non-Volatile : 1
 [158h 0344   8]Reserved3 : 

 [160h 0352   1]Subtable Type : 01 [Memory Affinity]

tests/data/acpi/q35/SRAT.dimmpxm:
--- /tmp/asl-HW2LL0.dsl 2020-05-28 15:11:05.446384514 -0600
+++ /tmp/asl-8MYLL0.dsl 2020-05-28 15:11:05.445384532 -0600
@@ -3,7 +3,7 @@
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
- * Disassembly of tests/data/acpi/q35/SRAT.dimmpxm, Thu May 28 15:11:05 2020
+ * Disassembly of /tmp/aml-2CYLL0, Thu May 28 15:11:05 2020
  *
  * ACPI Data Table [SRAT]
  *
@@ -13,7 +13,7 @@
 [000h    4]Signature : "SRAT"[System Resource 
Affinity Table]
 [004h 0004   4] Table Length : 0188
 [008h 0008   1] Revision : 01
-[009h 0009   1] Checksum : 80
+[009h 0009   1] Checksum : 68
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPCSRAT"
 [018h 0024   4] Oem Revision : 0001
@@ -140,15 +140,15 @@
 [138h 0312   1]Subtable Type : 01 [Memory Affinity]
 [139h 0313   1]   Length : 28

-[13Ah 0314   4] Proximity Domain : 
+[13Ah 0314   4] Proximity Domain : 0002
 [13Eh 0318   2]Reserved1 : 
-[140h 0320   8] Base Address : 
-[148h 0328   8]   Address Length : 
+[140h 0320   8] Base Address : 00010800
+[148h 0328   8]   Address Length : 0800
 [150h 0336   4]Reserved2 : 
-[154h 0340   4]Flags (decoded below) : 
- Enabled : 0
+[154h 0340   4]Flags (decoded below) : 0005
+ Enabled : 1
Hot Pluggable : 0
-Non-Volatile : 0
+Non-Volatile : 1
 [158h 0344   8]Reserved3 : 

 [160h 0352   1]Subtable Type : 01 [Memory Affinity]

tests/data/acpi/virt/SRAT.memhp:
--- /tmp/asl-E32WL0.dsl 2020-05-28 15:19:56.976095582 -0600
+++ /tmp/asl-Y69WL0.dsl 2020-05-28 15:19:56.974095617 -0600
@@ -3,7 +3,7 @@
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
- * Disassembly of tests/data/acpi/virt/SRAT.memhp, Thu May 28 15:19:56 2020
+ * Disassembly of /tmp/aml-2CCXL0, Thu May 28 15:19:56 2020
  *
  * ACPI Data Table [SRAT]
  *
@@ -11,9 +11,9 @@
  */

 [000h    4]Signature : "SRAT"[System Resource 
Affinity Table]
-[004h 0004   4] Table Length : 00BA
+[004h 0004   4] Table Length : 00E2
 [008h 0008   1] Revision : 03
-[009h 0009

[Bug 1877418] Re: qemu-nbd freezes access to VDI file

2020-05-28 Thread John Snow

I don't recommend you use VDI images in this way; we do not intend to
support performant RW access; support for VDI images is there to convert
to qcow2 or raw, generally.

That said, some questions that might be interesting to know the answer
to:

- Try converting your VDI image to raw or qcow2 and mounting that instead. Does
  the conversion work successfully? Can you export that image via qemu-nbd and
  mount it? Does it work?
- Do non-BTRFS filesystems cause any problems?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1877418

Title:
  qemu-nbd freezes access to VDI file

Status in QEMU:
  New
Status in btrfs-progs package in Ubuntu:
  New

Bug description:
  Mounted Oracle VirtualBox .vdi drive (dynamically allocated), which has
  GPT+BTRFS:
  sudo modprobe nbd max_part=16
  sudo qemu-nbd -c /dev/nbd0 /storage/btrfs.vdi
  mount /dev/nbd0p1 /mydata/

  Then I am operating on the btrfs filesystem and suddenly it freezes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1877418/+subscriptions

[PATCH v4 2/4] target/riscv: Remove the deprecated CPUs

2020-05-28 Thread Alistair Francis

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 docs/system/deprecated.rst  | 33 ++---
 target/riscv/cpu.h  |  7 ---
 target/riscv/cpu.c  | 28 
 tests/qtest/machine-none-test.c |  4 ++--
 4 files changed, 20 insertions(+), 52 deletions(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 50927bad74..a6664bfca9 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -314,21 +314,6 @@ should be used instead of the 1.09.1 version.
 System emulator CPUS
 
 
-RISC-V ISA CPUs (since 4.1)
-'''
-
-The RISC-V cpus with the ISA version in the CPU name have been depcreated. The
-four CPUs are: ``rv32gcsu-v1.9.1``, ``rv32gcsu-v1.10.0``, ``rv64gcsu-v1.9.1`` 
and
-``rv64gcsu-v1.10.0``. Instead the version can be specified via the CPU 
``priv_spec``
-option when using the ``rv32`` or ``rv64`` CPUs.
-
-RISC-V ISA CPUs (since 4.1)
-'''
-
-The RISC-V no MMU cpus have been depcreated. The two CPUs: ``rv32imacu-nommu`` 
and
-``rv64imacu-nommu`` should no longer be used. Instead the MMU status can be 
specified
-via the CPU ``mmu`` option when using the ``rv32`` or ``rv64`` CPUs.
-
 ``compat`` property of server class POWER CPUs (since 5.0)
 ''
 
@@ -486,6 +471,24 @@ The ``hub_id`` parameter of ``hostfwd_add`` / 
``hostfwd_remove`` (removed in 5.0
 The ``[hub_id name]`` parameter tuple of the 'hostfwd_add' and
 'hostfwd_remove' HMP commands has been replaced by ``netdev_id``.
 
+System emulator CPUS
+
+
+RISC-V ISA CPUs (removed in 5.1)
+
+
+The RISC-V cpus with the ISA version in the CPU name have been removed. The
+four CPUs are: ``rv32gcsu-v1.9.1``, ``rv32gcsu-v1.10.0``, ``rv64gcsu-v1.9.1`` and
+``rv64gcsu-v1.10.0``. Instead the version can be specified via the CPU ``priv_spec``
+option when using the ``rv32`` or ``rv64`` CPUs.
+
+RISC-V ISA CPUs (removed in 5.1)
+
+
+The RISC-V no MMU cpus have been removed. The two CPUs: ``rv32imacu-nommu`` and
+``rv64imacu-nommu`` can no longer be used. Instead the MMU status can be specified
+via the CPU ``mmu`` option when using the ``rv32`` or ``rv64`` CPUs.
+
 System emulator machines
 
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d0e7f5b9c5..76b98d7a33 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -40,13 +40,6 @@
 #define TYPE_RISCV_CPU_SIFIVE_E51   RISCV_CPU_TYPE_NAME("sifive-e51")
 #define TYPE_RISCV_CPU_SIFIVE_U34   RISCV_CPU_TYPE_NAME("sifive-u34")
 #define TYPE_RISCV_CPU_SIFIVE_U54   RISCV_CPU_TYPE_NAME("sifive-u54")
-/* Deprecated */
-#define TYPE_RISCV_CPU_RV32IMACU_NOMMU  RISCV_CPU_TYPE_NAME("rv32imacu-nommu")
-#define TYPE_RISCV_CPU_RV32GCSU_V1_09_1 RISCV_CPU_TYPE_NAME("rv32gcsu-v1.9.1")
-#define TYPE_RISCV_CPU_RV32GCSU_V1_10_0 RISCV_CPU_TYPE_NAME("rv32gcsu-v1.10.0")
-#define TYPE_RISCV_CPU_RV64IMACU_NOMMU  RISCV_CPU_TYPE_NAME("rv64imacu-nommu")
-#define TYPE_RISCV_CPU_RV64GCSU_V1_09_1 RISCV_CPU_TYPE_NAME("rv64gcsu-v1.9.1")
-#define TYPE_RISCV_CPU_RV64GCSU_V1_10_0 RISCV_CPU_TYPE_NAME("rv64gcsu-v1.10.0")
 
 #define RV32 ((target_ulong)1 << (TARGET_LONG_BITS - 2))
 #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 059d71f2c7..112f2e3a2f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -135,16 +135,6 @@ static void riscv_base32_cpu_init(Object *obj)
 set_misa(env, 0);
 }
 
-static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
-{
-CPURISCVState *env = &RISCV_CPU(obj)->env;
-set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
-set_priv_version(env, PRIV_VERSION_1_09_1);
-set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
-set_feature(env, RISCV_FEATURE_PMP);
-}
-
 static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -182,16 +172,6 @@ static void riscv_base64_cpu_init(Object *obj)
 set_misa(env, 0);
 }
 
-static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
-{
-CPURISCVState *env = &RISCV_CPU(obj)->env;
-set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
-set_priv_version(env, PRIV_VERSION_1_09_1);
-set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
-set_feature(env, RISCV_FEATURE_PMP);
-}
-
 static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -621,18 +601,10 @@ static const TypeInfo riscv_cpu_type_infos[] = {
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E31,   rv32imacu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E34,   rv32imafcu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U34,   rv32gcsu_priv1_10_0_cpu_init),
-/* Depreacted */
-

[PATCH v4 0/4] RISC-V: Remove deprecated ISA, CPUs and machines

2020-05-28 Thread Alistair Francis



v4:
 - Remove all of the < PRIV_VERSION_1_10_0 checks
 - Move the documentation to the "Recently removed features" section
 - Document the OpenSBI deprecation
v3:
 - Don't use SiFive CPUs for Spike machine
v2:
 - Remove the CPUs and ISA separately


Alistair Francis (4):
  hw/riscv: spike: Remove deprecated ISA specific machines
  target/riscv: Remove the deprecated CPUs
  target/riscv: Drop support for ISA spec version 1.09.1
  docs: deprecated: Update the -bios documentation

 docs/system/deprecated.rst|  98 
 include/hw/riscv/spike.h  |   6 +-
 target/riscv/cpu.h|   8 -
 hw/riscv/spike.c  | 217 --
 target/riscv/cpu.c|  30 ---
 target/riscv/cpu_helper.c |  82 +++
 target/riscv/csr.c| 138 ++-
 .../riscv/insn_trans/trans_privileged.inc.c   |  18 +-
 target/riscv/monitor.c|   5 -
 target/riscv/op_helper.c  |  17 +-
 tests/qtest/machine-none-test.c   |   4 +-
 11 files changed, 118 insertions(+), 505 deletions(-)

-- 
2.26.2

[PATCH v4 2/3] hw/acpi/nvdimm: add a helper to augment SRAT generation

2020-05-28 Thread Vishal Verma

NVDIMMs can belong to their own proximity domains, as described by the
NFIT. In such cases, the SRAT needs Memory Affinity structures for these
NVDIMMs; otherwise Linux doesn't populate node data structures properly
during NUMA initialization. See the following for an example failure
case.

https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.ve...@intel.com/

Introduce a new helper, nvdimm_build_srat(), and call it for both the
i386 and arm versions of 'build_srat()' to augment the SRAT with
memory affinity information for NVDIMMs.

The relevant command line options to exercise this are below. Nodes 0-1
contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
space.

  -numa node,nodeid=0,mem=2048M,
  -numa node,nodeid=1,mem=2048M,
  -numa node,nodeid=2,mem=0,
  -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
  -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
  -numa node,nodeid=3,mem=0,
  -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
  -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3

Cc: Jingqi Liu 
Cc: Michael S. Tsirkin 
Reviewed-by: Jingqi Liu 
Signed-off-by: Vishal Verma 
---
 hw/acpi/nvdimm.c | 23 +++
 hw/arm/virt-acpi-build.c |  4 
 hw/i386/acpi-build.c |  5 +
 include/hw/mem/nvdimm.h  |  1 +
 4 files changed, 33 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 9316d12b70..8f7cc16add 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -28,6 +28,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/uuid.h"
+#include "qapi/error.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -1334,6 +1335,28 @@ static void nvdimm_build_ssdt(GArray *table_offsets, 
GArray *table_data,
 free_aml_allocator();
 }
 
+void nvdimm_build_srat(GArray *table_data)
+{
+GSList *device_list, *list = nvdimm_get_device_list();
+
+for (device_list = list; device_list; device_list = device_list->next) {
+AcpiSratMemoryAffinity *numamem = NULL;
+DeviceState *dev = device_list->data;
+Object *obj = OBJECT(dev);
+uint64_t addr, size;
+int node;
+
+node = object_property_get_int(obj, PC_DIMM_NODE_PROP, &error_abort);
+addr = object_property_get_uint(obj, PC_DIMM_ADDR_PROP, &error_abort);
+size = object_property_get_uint(obj, PC_DIMM_SIZE_PROP, &error_abort);
+
+numamem = acpi_data_push(table_data, sizeof *numamem);
+build_srat_memory(numamem, addr, size, node,
+  MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
+}
+g_slist_free(list);
+}
+
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
BIOSLinker *linker, NVDIMMState *state,
uint32_t ram_slots)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1b0a584c7b..2cbccd5fe2 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -539,6 +539,10 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 }
 }
 
+if (ms->nvdimms_state->is_enabled) {
+nvdimm_build_srat(table_data);
+}
+
 if (ms->device_memory) {
 numamem = acpi_data_push(table_data, sizeof *numamem);
 build_srat_memory(numamem, ms->device_memory->base,
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2e15f6848e..d996525e2c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2428,6 +2428,11 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
   MEM_AFFINITY_ENABLED);
 }
 }
+
+if (machine->nvdimms_state->is_enabled) {
+nvdimm_build_srat(table_data);
+}
+
 slots = (table_data->len - numa_start) / sizeof *numamem;
 for (; slots < pcms->numa_nodes + 2; slots++) {
 numamem = acpi_data_push(table_data, sizeof *numamem);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index a3c08955e8..b67a1aedf6 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -155,6 +155,7 @@ typedef struct NVDIMMState NVDIMMState;
 void nvdimm_init_acpi_state(NVDIMMState *state, MemoryRegion *io,
 struct AcpiGenericAddress dsm_io,
 FWCfgState *fw_cfg, Object *owner);
+void nvdimm_build_srat(GArray *table_data);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
BIOSLinker *linker, NVDIMMState *state,
uint32_t ram_slots);
-- 
2.26.2
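
For reference, the entry the helper emits per NVDIMM is a 40-byte SRAT Memory Affinity structure, which matches the "Length : 28" (hex) and "Flags : 0005" fields in the disassembly from patch 3/3 and the 0xBA to 0xE2 growth of the virt table. The sketch below encodes one such entry into a byte buffer. It is a standalone illustration, not QEMU's `build_srat_memory()`; the `example_` and `EXAMPLE_` names are hypothetical, and the field offsets follow the offsets shown in that disassembly.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_SRAT_MEM_AFFINITY_LEN 40

/* Flag bits as decoded in the disassembly: Enabled, Hot Pluggable, Non-Volatile. */
#define EXAMPLE_MEM_ENABLED       (1u << 0)
#define EXAMPLE_MEM_HOT_PLUGGABLE (1u << 1)
#define EXAMPLE_MEM_NON_VOLATILE  (1u << 2)

/* Store a 32-bit value in little-endian byte order. */
static void example_put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store a 64-bit value in little-endian byte order. */
static void example_put_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Encode one Memory Affinity entry; reserved fields stay zero. */
static void example_encode_mem_affinity(uint8_t *out, uint32_t domain,
                                        uint64_t base, uint64_t len,
                                        uint32_t flags)
{
    memset(out, 0, EXAMPLE_SRAT_MEM_AFFINITY_LEN);
    out[0] = 1;                              /* subtable type: Memory Affinity */
    out[1] = EXAMPLE_SRAT_MEM_AFFINITY_LEN;  /* length: 0x28 */
    example_put_le32(out + 2, domain);       /* proximity domain */
    example_put_le64(out + 8, base);         /* base address */
    example_put_le64(out + 16, len);         /* address length */
    example_put_le32(out + 28, flags);       /* flags */
}
```

Passing `EXAMPLE_MEM_ENABLED | EXAMPLE_MEM_NON_VOLATILE` gives a flags value of 5, the same combination the helper requests for each NVDIMM.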

[PATCH v4 4/4] docs: deprecated: Update the -bios documentation

2020-05-28 Thread Alistair Francis

Update the -bios deprecation documentation to describe the new
behaviour.

Signed-off-by: Alistair Francis 
---
 docs/system/deprecated.rst | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 38865daafc..8c445d4062 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -138,25 +138,23 @@ the backing storage specified with ``-mem-path`` can 
actually provide
 the guest RAM configured with ``-m`` and QEMU will fail to start up if
 RAM allocation is unsuccessful.
 
-RISC-V ``-bios`` (since 4.1)
+RISC-V ``-bios`` (since 5.1)
 
 
 QEMU 4.1 introduced support for the -bios option in QEMU for RISC-V for the
-RISC-V virt machine and sifive_u machine.
-
-QEMU 4.1 has no changes to the default behaviour to avoid breakages. This
-default will change in a future QEMU release, so please prepare now. All users
-of the virt or sifive_u machine must change their command line usage.
-
-QEMU 4.1 has three options, please migrate to one of these three:
- 1. ``-bios none`` - This is the current default behavior if no -bios option
-  is included. QEMU will not automatically load any firmware. It is up
+RISC-V virt machine and sifive_u machine. QEMU 4.1 had no changes to the
+default behaviour to avoid breakages.
+
+QEMU 5.1 changes the default behaviour from ``-bios none`` to ``-bios default``.
+
+QEMU 5.1 has three options:
+ 1. ``-bios default`` - This is the current default behavior if no -bios option
+  is included. This option will load the default OpenSBI firmware automatically.
+  The firmware is included with the QEMU release and no user interaction is
+  required. All a user needs to do is specify the kernel they want to boot
+  with the -kernel option
+ 2. ``-bios none`` - QEMU will not automatically load any firmware. It is up
   to the user to load all the images they need.
- 2. ``-bios default`` - In a future QEMU release this will become the default
-  behaviour if no -bios option is specified. This option will load the
-  default OpenSBI firmware automatically. The firmware is included with
-  the QEMU release and no user interaction is required. All a user needs
-  to do is specify the kernel they want to boot with the -kernel option
  3. ``-bios `` - Tells QEMU to load the specified file as the firmwrae.
 
 ``-tb-size`` option (since 5.0)
-- 
2.26.2

[PATCH v5 09/11] riscv/opentitan: Connect the PLIC device

2020-05-28 Thread Alistair Francis

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/riscv/opentitan.h |  3 +++
 hw/riscv/opentitan.c | 19 +--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index a4b6499444..76f72905a8 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -20,6 +20,7 @@
 #define HW_OPENTITAN_H
 
 #include "hw/riscv/riscv_hart.h"
+#include "hw/intc/ibex_plic.h"
 
 #define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
 #define RISCV_IBEX_SOC(obj) \
@@ -31,6 +32,8 @@ typedef struct LowRISCIbexSoCState {
 
 /*< public >*/
 RISCVHartArrayState cpus;
+IbexPlicState plic;
+
 MemoryRegion flash_mem;
 MemoryRegion rom;
 } LowRISCIbexSoCState;
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index b4fb836466..46a3a93c5e 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -25,6 +25,7 @@
 #include "hw/misc/unimp.h"
 #include "hw/riscv/boot.h"
 #include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
 
 static const struct MemmapEntry {
 hwaddr base;
@@ -97,6 +98,9 @@ static void riscv_lowrisc_ibex_soc_init(Object *obj)
 object_initialize_child(obj, "cpus", &s->cpus,
 sizeof(s->cpus), TYPE_RISCV_HART_ARRAY,
 &error_abort, NULL);
+
+sysbus_init_child_obj(obj, "plic", &s->plic,
+  sizeof(s->plic), TYPE_IBEX_PLIC);
 }
 
 static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
@@ -105,6 +109,9 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState 
*dev_soc, Error **errp)
 MachineState *ms = MACHINE(qdev_get_machine());
 LowRISCIbexSoCState *s = RISCV_IBEX_SOC(dev_soc);
 MemoryRegion *sys_mem = get_system_memory();
+DeviceState *dev;
+SysBusDevice *busdev;
+Error *err = NULL;
 
 object_property_set_str(OBJECT(&s->cpus), ms->cpu_type, "cpu-type",
 &error_abort);
@@ -125,6 +132,16 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState 
*dev_soc, Error **errp)
 memory_region_add_subregion(sys_mem, memmap[IBEX_FLASH].base,
 >flash_mem);
 
+/* PLIC */
+dev = DEVICE(>plic);
+object_property_set_bool(OBJECT(>plic), true, "realized", );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+busdev = SYS_BUS_DEVICE(dev);
+sysbus_mmio_map(busdev, 0, memmap[IBEX_PLIC].base);
+
 create_unimplemented_device("riscv.lowrisc.ibex.uart",
 memmap[IBEX_UART].base, memmap[IBEX_UART].size);
 create_unimplemented_device("riscv.lowrisc.ibex.gpio",
@@ -145,8 +162,6 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 memmap[IBEX_AES].base, memmap[IBEX_AES].size);
 create_unimplemented_device("riscv.lowrisc.ibex.hmac",
 memmap[IBEX_HMAC].base, memmap[IBEX_HMAC].size);
-create_unimplemented_device("riscv.lowrisc.ibex.plic",
-memmap[IBEX_PLIC].base, memmap[IBEX_PLIC].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_PINMUX].base, memmap[IBEX_PINMUX].size);
 create_unimplemented_device("riscv.lowrisc.ibex.alert_handler",
-- 
2.26.2

[PATCH v4 1/4] hw/riscv: spike: Remove deprecated ISA specific machines

2020-05-28 Thread Alistair Francis

The ISA specific Spike machines have been deprecated in QEMU since 4.1,
let's finally remove them.

Signed-off-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bin Meng 
---
 docs/system/deprecated.rst |  17 +--
 include/hw/riscv/spike.h   |   6 +-
 hw/riscv/spike.c   | 217 -
 3 files changed, 12 insertions(+), 228 deletions(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index f0061f94aa..50927bad74 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -379,13 +379,6 @@ This machine has been renamed ``fuloong2e``.
 These machine types are very old and likely can not be used for live migration
 from old QEMU versions anymore. A newer machine type should be used instead.
 
-``spike_v1.9.1`` and ``spike_v1.10`` (since 4.1)
-
-
-The version specific Spike machines have been deprecated in favour of the
-generic ``spike`` machine. If you need to specify an older version of the RISC-V
-spec you can use the ``-cpu rv64gcsu,priv_spec=v1.9.1`` command line argument.
-
 Device options
 --
 
@@ -493,6 +486,16 @@ The ``hub_id`` parameter of ``hostfwd_add`` / ``hostfwd_remove`` (removed in 5.0
 The ``[hub_id name]`` parameter tuple of the 'hostfwd_add' and
 'hostfwd_remove' HMP commands has been replaced by ``netdev_id``.
 
+System emulator machines
+------------------------
+
+``spike_v1.9.1`` and ``spike_v1.10`` (removed in 5.1)
+'''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+The version specific Spike machines have been removed in favour of the
+generic ``spike`` machine. If you need to specify an older version of the RISC-V
+spec you can use the ``-cpu rv64gcsu,priv_spec=v1.10.0`` command line argument.
+
 Related binaries
 
 
diff --git a/include/hw/riscv/spike.h b/include/hw/riscv/spike.h
index dc770421bc..1cd72b85d6 100644
--- a/include/hw/riscv/spike.h
+++ b/include/hw/riscv/spike.h
@@ -39,11 +39,9 @@ enum {
 };
 
 #if defined(TARGET_RISCV32)
-#define SPIKE_V1_09_1_CPU TYPE_RISCV_CPU_RV32GCSU_V1_09_1
-#define SPIKE_V1_10_0_CPU TYPE_RISCV_CPU_RV32GCSU_V1_10_0
+#define SPIKE_V1_10_0_CPU TYPE_RISCV_CPU_BASE32
 #elif defined(TARGET_RISCV64)
-#define SPIKE_V1_09_1_CPU TYPE_RISCV_CPU_RV64GCSU_V1_09_1
-#define SPIKE_V1_10_0_CPU TYPE_RISCV_CPU_RV64GCSU_V1_10_0
+#define SPIKE_V1_10_0_CPU TYPE_RISCV_CPU_BASE64
 #endif
 
 #endif
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index d0c4843712..7bbbdb5036 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -257,221 +257,6 @@ static void spike_board_init(MachineState *machine)
 false);
 }
 
-static void spike_v1_10_0_board_init(MachineState *machine)
-{
-const struct MemmapEntry *memmap = spike_memmap;
-
-SpikeState *s = g_new0(SpikeState, 1);
-MemoryRegion *system_memory = get_system_memory();
-MemoryRegion *main_mem = g_new(MemoryRegion, 1);
-MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
-int i;
-unsigned int smp_cpus = machine->smp.cpus;
-
-if (!qtest_enabled()) {
-info_report("The Spike v1.10.0 machine has been deprecated. "
-"Please use the generic spike machine and specify the ISA "
-"versions using -cpu.");
-}
-
-/* Initialize SOC */
-object_initialize_child(OBJECT(machine), "soc", &s->soc, sizeof(s->soc),
-TYPE_RISCV_HART_ARRAY, &error_abort, NULL);
-object_property_set_str(OBJECT(&s->soc), SPIKE_V1_10_0_CPU, "cpu-type",
-&error_abort);
-object_property_set_int(OBJECT(&s->soc), smp_cpus, "num-harts",
-&error_abort);
-object_property_set_bool(OBJECT(&s->soc), true, "realized",
-&error_abort);
-
-/* register system main memory (actual RAM) */
-memory_region_init_ram(main_mem, NULL, "riscv.spike.ram",
-   machine->ram_size, &error_fatal);
-memory_region_add_subregion(system_memory, memmap[SPIKE_DRAM].base,
-main_mem);
-
-/* create device tree */
-create_fdt(s, memmap, machine->ram_size, machine->kernel_cmdline);
-
-/* boot rom */
-memory_region_init_rom(mask_rom, NULL, "riscv.spike.mrom",
-   memmap[SPIKE_MROM].size, &error_fatal);
-memory_region_add_subregion(system_memory, memmap[SPIKE_MROM].base,
-mask_rom);
-
-if (machine->kernel_filename) {
-riscv_load_kernel(machine->kernel_filename, htif_symbol_callback);
-}
-
-/* reset vector */
-uint32_t reset_vec[8] = {
-0x00000297,  /* 1:  auipc  t0, %pcrel_hi(dtb) */
-0x02028593,  /* addi   a1, t0, %pcrel_lo(1b) */
-0xf1402573,  /* csrr   a0, mhartid  */
-#if defined(TARGET_RISCV32)
-0x0182a283,  /* lw t0, 24(t0) */
-#elif defined(TARGET_RISCV64)
-0x0182b283,  /*

[PATCH v5 08/11] hw/intc: Initial commit of lowRISC Ibex PLIC

2020-05-28 Thread Alistair Francis

The Ibex core contains a PLIC that, although similar to the one in the
RISC-V spec, is not RISC-V spec compliant.

This patch implements an Ibex PLIC in a somewhat generic way.

As the current RISC-V PLIC needs tidying up, my hope is that as the Ibex
PLIC moves towards spec compliance this PLIC implementation can be
updated until it can replace the current PLIC.

Signed-off-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
---
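Reviewer note, not part of the commit: a minimal standalone sketch of the
pending-bit bookkeeping described above, with 32 interrupt sources per
32-bit word. The sketch_* names are hypothetical and are not taken from
the patch. Unlike ibex_plic_irqs_set_pending() in this patch, which only
ORs the level in, the sketch also clears the bit on a low level; that is
an assumption about the intended behaviour, not a statement about the
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Set or clear the pending bit for one IRQ; 32 IRQs per word. */
static void sketch_set_pending(uint32_t *pending, unsigned irq, bool level)
{
    uint32_t bit = (uint32_t)1 << (irq % 32);

    if (level) {
        pending[irq / 32] |= bit;
    } else {
        pending[irq / 32] &= ~bit;   /* assumption: a low level clears */
    }
}

/* Return the lowest IRQ that is both pending and enabled, or -1. */
static int sketch_first_pending(const uint32_t *pending,
                                const uint32_t *enable, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        uint32_t active = pending[i] & enable[i];

        for (unsigned b = 0; b < 32; b++) {
            if (active & ((uint32_t)1 << b)) {
                return (int)(i * 32 + b);
            }
        }
    }
    return -1;
}

/* Small self-check doubling as a usage example. */
void sketch_plic_selfcheck(void)
{
    uint32_t pending[2] = { 0, 0 };
    uint32_t enable[2] = { 0xffffffffu, 0xffffffffu };

    sketch_set_pending(pending, 33, true);
    assert(sketch_first_pending(pending, enable, 2) == 33);
}
```

IRQ 33 lands in word 1, bit 1, which is the same irq / 32 and irq % 32
split the patch uses.
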
 include/hw/intc/ibex_plic.h |  63 +
 hw/intc/ibex_plic.c | 261 
 MAINTAINERS |   2 +
 hw/intc/Makefile.objs   |   1 +
 4 files changed, 327 insertions(+)
 create mode 100644 include/hw/intc/ibex_plic.h
 create mode 100644 hw/intc/ibex_plic.c

diff --git a/include/hw/intc/ibex_plic.h b/include/hw/intc/ibex_plic.h
new file mode 100644
index 0000000000..ddc7909903
--- /dev/null
+++ b/include/hw/intc/ibex_plic.h
@@ -0,0 +1,63 @@
+/*
+ * QEMU RISC-V lowRISC Ibex PLIC
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_IBEX_PLIC_H
+#define HW_IBEX_PLIC_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_IBEX_PLIC "ibex-plic"
+#define IBEX_PLIC(obj) \
+OBJECT_CHECK(IbexPlicState, (obj), TYPE_IBEX_PLIC)
+
+typedef struct IbexPlicState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+MemoryRegion mmio;
+
+uint32_t *pending;
+uint32_t *source;
+uint32_t *priority;
+uint32_t *enable;
+uint32_t threshold;
+uint32_t claim;
+
+/* config */
+uint32_t num_cpus;
+uint32_t num_sources;
+
+uint32_t pending_base;
+uint32_t pending_num;
+
+uint32_t source_base;
+uint32_t source_num;
+
+uint32_t priority_base;
+uint32_t priority_num;
+
+uint32_t enable_base;
+uint32_t enable_num;
+
+uint32_t threshold_base;
+
+uint32_t claim_base;
+} IbexPlicState;
+
+#endif /* HW_IBEX_PLIC_H */
diff --git a/hw/intc/ibex_plic.c b/hw/intc/ibex_plic.c
new file mode 100644
index 0000000000..41079518c6
--- /dev/null
+++ b/hw/intc/ibex_plic.c
@@ -0,0 +1,261 @@
+/*
+ * QEMU RISC-V lowRISC Ibex PLIC
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Documentation available: https://docs.opentitan.org/hw/ip/rv_plic/doc/
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/qdev-properties.h"
+#include "hw/core/cpu.h"
+#include "hw/boards.h"
+#include "hw/pci/msi.h"
+#include "target/riscv/cpu_bits.h"
+#include "target/riscv/cpu.h"
+#include "hw/intc/ibex_plic.h"
+
+static bool addr_between(uint32_t addr, uint32_t base, uint32_t num)
+{
+uint32_t end = base + (num * 0x04);
+
+if (addr >= base && addr < end) {
+return true;
+}
+
+return false;
+}
+
+static void ibex_plic_irqs_set_pending(IbexPlicState *s, int irq, bool level)
+{
+int pending_num = irq / 32;
+
+s->pending[pending_num] |= level << (irq % 32);
+}
+
+static bool ibex_plic_irqs_pending(IbexPlicState *s, uint32_t context)
+{
+int i;
+
+for (i = 0; i < s->pending_num; i++) {
+uint32_t irq_num = ctz64(s->pending[i]) + (i * 32);
+
+if (!(s->pending[i] & s->enable[i])) {
+/* No pending and enabled IRQ */
+continue;
+}
+
+if (s->priority[irq_num] > s->threshold) {
+if (!s->claim) {
+s->claim = irq_num;
+}
+return true;
+}
+}
+
+return false;
+}
+
+static void ibex_plic_update(IbexPlicState *s)
+{
+CPUState *cpu;
+int level, i;
+
+for (i = 0; i < s->num_cpus; i++) {
+cpu = qemu_get_cpu(i);
+
+if (!cpu) {
+continue;
+}
+
+level = ibex_plic_irqs_pending(s, 0);
+
+riscv_cpu_update_mip(RISCV_CPU(cpu), MIP_MEIP, BOOL_TO_MASK(level));
+}
+}
+
+static

[PATCH v4 3/4] target/riscv: Drop support for ISA spec version 1.09.1

2020-05-28 Thread Alistair Francis

The RISC-V ISA spec version 1.09.1 has been deprecated in QEMU since
4.1. It's not commonly used so let's remove support for it.

Signed-off-by: Alistair Francis 
---
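Reviewer note, not part of the commit: a small worked example of the
levels / ptidxbits / ptesize values that appear in the
get_physical_address() hunk below. It only checks the arithmetic behind
those numbers, assuming the usual 4 KiB page (a page shift of 12). The
SketchVmMode type and the sketch_* helpers are hypothetical names.

```c
#include <assert.h>

#define SKETCH_PGSHIFT 12   /* assumption: 4 KiB pages */

typedef struct SketchVmMode {
    const char *name;
    int levels;      /* number of page-table levels */
    int ptidxbits;   /* virtual-address bits consumed per level */
    int ptesize;     /* size of one page-table entry in bytes */
} SketchVmMode;

/* Values as they appear in the switch on the VM mode. */
static const SketchVmMode sketch_modes[] = {
    { "sv32", 2, 10, 4 },
    { "sv39", 3,  9, 8 },
    { "sv48", 4,  9, 8 },
    { "sv57", 5,  9, 8 },
};

/* Virtual address width: page offset plus one index per level. */
static int sketch_va_bits(const SketchVmMode *m)
{
    return SKETCH_PGSHIFT + m->levels * m->ptidxbits;
}

/* Bytes covered by one page-table page: entries times entry size. */
static int sketch_table_bytes(const SketchVmMode *m)
{
    return (1 << m->ptidxbits) * m->ptesize;
}

/* Small self-check doubling as a usage example. */
void sketch_vm_selfcheck(void)
{
    assert(sketch_va_bits(&sketch_modes[1]) == 39);
    assert(sketch_table_bytes(&sketch_modes[1]) == 4096);
}
```

In each mode one table fills exactly one page, which is why a 4-byte
entry pairs with 10 index bits and an 8-byte entry with 9.
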
 docs/system/deprecated.rst|  20 +--
 target/riscv/cpu.h|   1 -
 target/riscv/cpu.c|   2 -
 target/riscv/cpu_helper.c |  82 ---
 target/riscv/csr.c| 138 --
 .../riscv/insn_trans/trans_privileged.inc.c   |  18 +--
 target/riscv/monitor.c|   5 -
 target/riscv/op_helper.c  |  17 +--
 8 files changed, 73 insertions(+), 210 deletions(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index a6664bfca9..38865daafc 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -301,16 +301,6 @@ The ``acl_show``, ``acl_reset``, ``acl_policy``, ``acl_add``, and
 ``acl_remove`` commands are deprecated with no replacement. Authorization
 for VNC should be performed using the pluggable QAuthZ objects.
 
-Guest Emulator ISAs
--------------------
-
-RISC-V ISA privledge specification version 1.09.1 (since 4.1)
-'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
-
-The RISC-V ISA privledge specification version 1.09.1 has been deprecated.
-QEMU supports both the newer version 1.10.0 and the ratified version 1.11.0, these
-should be used instead of the 1.09.1 version.
-
 System emulator CPUS
 
 
@@ -471,6 +461,16 @@ The ``hub_id`` parameter of ``hostfwd_add`` / ``hostfwd_remove`` (removed in 5.0
 The ``[hub_id name]`` parameter tuple of the 'hostfwd_add' and
 'hostfwd_remove' HMP commands has been replaced by ``netdev_id``.
 
+Guest Emulator ISAs
+-------------------
+
+RISC-V ISA privilege specification version 1.09.1 (removed in 5.1)
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+The RISC-V ISA privilege specification version 1.09.1 has been removed.
+QEMU supports both the newer version 1.10.0 and the ratified version 1.11.0, these
+should be used instead of the 1.09.1 version.
+
 System emulator CPUS
 
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 76b98d7a33..c022539012 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -73,7 +73,6 @@ enum {
 RISCV_FEATURE_MISA
 };
 
-#define PRIV_VERSION_1_09_1 0x00010901
 #define PRIV_VERSION_1_10_0 0x00011000
 #define PRIV_VERSION_1_11_0 0x00011100
 
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 112f2e3a2f..eeb91f8513 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -368,8 +368,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 priv_version = PRIV_VERSION_1_11_0;
 } else if (!g_strcmp0(cpu->cfg.priv_spec, "v1.10.0")) {
 priv_version = PRIV_VERSION_1_10_0;
-} else if (!g_strcmp0(cpu->cfg.priv_spec, "v1.9.1")) {
-priv_version = PRIV_VERSION_1_09_1;
 } else {
 error_setg(errp,
"Unsupported privilege spec version '%s'",
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index bc80aa87cf..62fe1ecc8f 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -364,57 +364,36 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
 mxr = get_field(env->vsstatus, MSTATUS_MXR);
 }
 
-if (env->priv_ver >= PRIV_VERSION_1_10_0) {
-if (first_stage == true) {
-if (use_background) {
-base = (hwaddr)get_field(env->vsatp, SATP_PPN) << PGSHIFT;
-vm = get_field(env->vsatp, SATP_MODE);
-} else {
-base = (hwaddr)get_field(env->satp, SATP_PPN) << PGSHIFT;
-vm = get_field(env->satp, SATP_MODE);
-}
-widened = 0;
+if (first_stage == true) {
+if (use_background) {
+base = (hwaddr)get_field(env->vsatp, SATP_PPN) << PGSHIFT;
+vm = get_field(env->vsatp, SATP_MODE);
 } else {
-base = (hwaddr)get_field(env->hgatp, HGATP_PPN) << PGSHIFT;
-vm = get_field(env->hgatp, HGATP_MODE);
-widened = 2;
-}
-sum = get_field(env->mstatus, MSTATUS_SUM);
-switch (vm) {
-case VM_1_10_SV32:
-  levels = 2; ptidxbits = 10; ptesize = 4; break;
-case VM_1_10_SV39:
-  levels = 3; ptidxbits = 9; ptesize = 8; break;
-case VM_1_10_SV48:
-  levels = 4; ptidxbits = 9; ptesize = 8; break;
-case VM_1_10_SV57:
-  levels = 5; ptidxbits = 9; ptesize = 8; break;
-case VM_1_10_MBARE:
-*physical = addr;
-*prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
-return TRANSLATE_SUCCESS;
-default:
-  g_assert_not_reached();
+base = (hwaddr)get_field(env->satp, SATP_PPN) << PGSHIFT;
+vm = get_field(env->satp,

[PATCH v5 10/11] riscv/opentitan: Connect the UART device

2020-05-28 Thread Alistair Francis

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: Philippe Mathieu-Daudé 
---
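Reviewer note, not part of the commit: a standalone sketch of the wiring
this patch sets up, mapping each UART sysbus IRQ output index to the
PLIC source number it is connected to. The numeric values are copied
from the enum added to opentitan.h; the SKETCH_* names and the lookup
helper are hypothetical.

```c
#include <assert.h>

/* PLIC source numbers, copied from the enum this patch adds. */
enum {
    SKETCH_UART_TX_WATERMARK_IRQ = 0x21,
    SKETCH_UART_RX_WATERMARK_IRQ = 0x22,
    SKETCH_UART_TX_EMPTY_IRQ     = 0x23,
    SKETCH_UART_RX_OVERFLOW_IRQ  = 0x24,
};

/*
 * Hypothetical lookup: UART sysbus IRQ output index -> PLIC source.
 * Returns -1 for outputs that the patch does not connect.
 */
static int sketch_uart_plic_source(unsigned output)
{
    static const int map[] = {
        SKETCH_UART_TX_WATERMARK_IRQ,   /* output 0 */
        SKETCH_UART_RX_WATERMARK_IRQ,   /* output 1 */
        SKETCH_UART_TX_EMPTY_IRQ,       /* output 2 */
        SKETCH_UART_RX_OVERFLOW_IRQ,    /* output 3 */
    };

    if (output >= sizeof(map) / sizeof(map[0])) {
        return -1;
    }
    return map[output];
}

/* Small self-check doubling as a usage example. */
void sketch_uart_wiring_selfcheck(void)
{
    assert(sketch_uart_plic_source(0) == 0x21);
}
```

The remaining sources in the enum (0x25 to 0x28) have no
sysbus_connect_irq() call in this patch, so the sketch reports them as
unconnected.
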
 include/hw/riscv/opentitan.h | 13 +
 hw/riscv/opentitan.c | 24 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 76f72905a8..8f29b9cbbf 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -21,6 +21,7 @@
 
 #include "hw/riscv/riscv_hart.h"
 #include "hw/intc/ibex_plic.h"
+#include "hw/char/ibex_uart.h"
 
 #define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
 #define RISCV_IBEX_SOC(obj) \
@@ -33,6 +34,7 @@ typedef struct LowRISCIbexSoCState {
 /*< public >*/
 RISCVHartArrayState cpus;
 IbexPlicState plic;
+IbexUartState uart;
 
 MemoryRegion flash_mem;
 MemoryRegion rom;
@@ -68,4 +70,15 @@ enum {
 IBEX_PADCTRL,
 };
 
+enum {
+IBEX_UART_RX_PARITY_ERR_IRQ = 0x28,
+IBEX_UART_RX_TIMEOUT_IRQ = 0x27,
+IBEX_UART_RX_BREAK_ERR_IRQ = 0x26,
+IBEX_UART_RX_FRAME_ERR_IRQ = 0x25,
+IBEX_UART_RX_OVERFLOW_IRQ = 0x24,
+IBEX_UART_TX_EMPTY_IRQ = 0x23,
+IBEX_UART_RX_WATERMARK_IRQ = 0x22,
+IBEX_UART_TX_WATERMARK_IRQ = 0x21,
+};
+
 #endif
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index 46a3a93c5e..a8844a870b 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -101,6 +101,9 @@ static void riscv_lowrisc_ibex_soc_init(Object *obj)
 
 sysbus_init_child_obj(obj, "plic", &s->plic,
   sizeof(s->plic), TYPE_IBEX_PLIC);
+
+sysbus_init_child_obj(obj, "uart", &s->uart,
+  sizeof(s->uart), TYPE_IBEX_UART);
 }
 
 static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
@@ -142,8 +145,25 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 busdev = SYS_BUS_DEVICE(dev);
 sysbus_mmio_map(busdev, 0, memmap[IBEX_PLIC].base);
 
-create_unimplemented_device("riscv.lowrisc.ibex.uart",
-memmap[IBEX_UART].base, memmap[IBEX_UART].size);
+/* UART */
+dev = DEVICE(&(s->uart));
+qdev_prop_set_chr(dev, "chardev", serial_hd(0));
+object_property_set_bool(OBJECT(&s->uart), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+busdev = SYS_BUS_DEVICE(dev);
+sysbus_mmio_map(busdev, 0, memmap[IBEX_UART].base);
+sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_TX_WATERMARK_IRQ));
+sysbus_connect_irq(busdev, 1, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_RX_WATERMARK_IRQ));
+sysbus_connect_irq(busdev, 2, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_TX_EMPTY_IRQ));
+sysbus_connect_irq(busdev, 3, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_RX_OVERFLOW_IRQ));
+
 create_unimplemented_device("riscv.lowrisc.ibex.gpio",
 memmap[IBEX_GPIO].base, memmap[IBEX_GPIO].size);
 create_unimplemented_device("riscv.lowrisc.ibex.spi",
-- 
2.26.2

[PATCH v5 06/11] riscv: Initial commit of OpenTitan machine

2020-05-28 Thread Alistair Francis

This adds a bare-bones OpenTitan machine to QEMU.

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 default-configs/riscv32-softmmu.mak |   1 +
 default-configs/riscv64-softmmu.mak |  11 +-
 include/hw/riscv/opentitan.h|  68 ++
 hw/riscv/opentitan.c| 184 
 MAINTAINERS |   9 ++
 hw/riscv/Kconfig|   5 +
 hw/riscv/Makefile.objs  |   1 +
 7 files changed, 278 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/riscv/opentitan.h
 create mode 100644 hw/riscv/opentitan.c

diff --git a/default-configs/riscv32-softmmu.mak b/default-configs/riscv32-softmmu.mak
index 1ae077ed87..94a236c9c2 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -10,3 +10,4 @@ CONFIG_SPIKE=y
 CONFIG_SIFIVE_E=y
 CONFIG_SIFIVE_U=y
 CONFIG_RISCV_VIRT=y
+CONFIG_OPENTITAN=y
diff --git a/default-configs/riscv64-softmmu.mak b/default-configs/riscv64-softmmu.mak
index 235c6f473f..aaf6d735bb 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -1,3 +1,12 @@
 # Default configuration for riscv64-softmmu
 
-include riscv32-softmmu.mak
+# Uncomment the following lines to disable these optional devices:
+#
+#CONFIG_PCI_DEVICES=n
+
+# Boards:
+#
+CONFIG_SPIKE=y
+CONFIG_SIFIVE_E=y
+CONFIG_SIFIVE_U=y
+CONFIG_RISCV_VIRT=y
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
new file mode 100644
index 0000000000..a4b6499444
--- /dev/null
+++ b/include/hw/riscv/opentitan.h
@@ -0,0 +1,68 @@
+/*
+ * QEMU RISC-V Board Compatible with OpenTitan FPGA platform
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_OPENTITAN_H
+#define HW_OPENTITAN_H
+
+#include "hw/riscv/riscv_hart.h"
+
+#define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
+#define RISCV_IBEX_SOC(obj) \
+OBJECT_CHECK(LowRISCIbexSoCState, (obj), TYPE_RISCV_IBEX_SOC)
+
+typedef struct LowRISCIbexSoCState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+RISCVHartArrayState cpus;
+MemoryRegion flash_mem;
+MemoryRegion rom;
+} LowRISCIbexSoCState;
+
+typedef struct OpenTitanState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+LowRISCIbexSoCState soc;
+} OpenTitanState;
+
+enum {
+IBEX_ROM,
+IBEX_RAM,
+IBEX_FLASH,
+IBEX_UART,
+IBEX_GPIO,
+IBEX_SPI,
+IBEX_FLASH_CTRL,
+IBEX_RV_TIMER,
+IBEX_AES,
+IBEX_HMAC,
+IBEX_PLIC,
+IBEX_PWRMGR,
+IBEX_RSTMGR,
+IBEX_CLKMGR,
+IBEX_PINMUX,
+IBEX_ALERT_HANDLER,
+IBEX_NMI_GEN,
+IBEX_USBDEV,
+IBEX_PADCTRL,
+};
+
+#endif
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
new file mode 100644
index 0000000000..b4fb836466
--- /dev/null
+++ b/hw/riscv/opentitan.c
@@ -0,0 +1,184 @@
+/*
+ * QEMU RISC-V Board Compatible with OpenTitan FPGA platform
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Provides a board compatible with the OpenTitan FPGA platform:
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/riscv/opentitan.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/misc/unimp.h"
+#include "hw/riscv/boot.h"
+#include "exec/address-spaces.h"
+
+static const struct MemmapEntry {
+hwaddr base;
+hwaddr size;
+} ibex_memmap[] = {
+[IBEX_ROM] =            {  0x00008000,   0xc000 },
+[IBEX_RAM] =            {  0x10000000,  0x10000 },
+[IBEX_FLASH] =          {  0x20000000,  0x80000 },
+[IBEX_UART] =           {  0x40000000,  0x10000 },
+[IBEX_GPIO] =           {  0x40010000,  0x10000 },
+[IBEX_SPI] =            {  0x40020000,  0x10000 },
+[IBEX_FLASH_CTRL] =     {  0x40030000,  0x10000 },
+[IBEX_PINMUX] = {

[PATCH v5 05/11] target/riscv: Add the lowRISC Ibex CPU

2020-05-28 Thread Alistair Francis

Ibex is a small and efficient, 32-bit, in-order RISC-V core with
a 2-stage pipeline that implements the RV32IMC instruction set
architecture.

For more details on lowRISC see here:
https://github.com/lowRISC/ibex

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: LIU Zhiwei 
---
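Reviewer note, not part of the commit: a standalone sketch of what the
set_misa() argument for the Ibex CPU encodes. In the misa CSR each
extension letter owns one bit at position (letter - 'A') and the top two
bits of a 32-bit misa hold MXL, where 1 means XLEN is 32. The SKETCH_*
macros and sketch_ibex_misa() are hypothetical names, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per extension letter, as in the RISC-V privileged spec. */
#define SKETCH_EXT(letter)  ((uint32_t)1 << ((letter) - 'A'))
/* MXL lives in bits 31:30 of a 32-bit misa; the value 1 means RV32. */
#define SKETCH_MXL_RV32     ((uint32_t)1 << 30)

/* RV32IMC plus user mode, the combination the Ibex init selects. */
static uint32_t sketch_ibex_misa(void)
{
    return SKETCH_MXL_RV32 | SKETCH_EXT('I') | SKETCH_EXT('M') |
           SKETCH_EXT('C') | SKETCH_EXT('U');
}

/* Small self-check doubling as a usage example. */
void sketch_misa_selfcheck(void)
{
    /* The atomic extension is absent, unlike on the SiFive E31. */
    assert((sketch_ibex_misa() & SKETCH_EXT('A')) == 0);
}
```

The only difference from rv32imacu_nommu_cpu_init() in the misa value is
the missing 'A' bit.
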
 target/riscv/cpu.h |  1 +
 target/riscv/cpu.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d0e7f5b9c5..8733d7467f 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -35,6 +35,7 @@
 #define TYPE_RISCV_CPU_ANY  RISCV_CPU_TYPE_NAME("any")
 #define TYPE_RISCV_CPU_BASE32   RISCV_CPU_TYPE_NAME("rv32")
 #define TYPE_RISCV_CPU_BASE64   RISCV_CPU_TYPE_NAME("rv64")
+#define TYPE_RISCV_CPU_IBEX RISCV_CPU_TYPE_NAME("lowrisc-ibex")
 #define TYPE_RISCV_CPU_SIFIVE_E31   RISCV_CPU_TYPE_NAME("sifive-e31")
 #define TYPE_RISCV_CPU_SIFIVE_E34   RISCV_CPU_TYPE_NAME("sifive-e34")
 #define TYPE_RISCV_CPU_SIFIVE_E51   RISCV_CPU_TYPE_NAME("sifive-e51")
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 406e8f37d7..6e0d4d1dda 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -152,6 +152,15 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_resetvec(env, DEFAULT_RSTVEC);
 }
 
+static void rv32imcu_nommu_cpu_init(Object *obj)
+{
+CPURISCVState *env = &RISCV_CPU(obj)->env;
+set_misa(env, RV32 | RVI | RVM | RVC | RVU);
+set_priv_version(env, PRIV_VERSION_1_10_0);
+set_resetvec(env, 0x8090);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
+}
+
 static void rv32imacu_nommu_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -611,6 +620,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
 DEFINE_CPU(TYPE_RISCV_CPU_ANY,  riscv_any_cpu_init),
 #if defined(TARGET_RISCV32)
 DEFINE_CPU(TYPE_RISCV_CPU_BASE32,   riscv_base32_cpu_init),
+DEFINE_CPU(TYPE_RISCV_CPU_IBEX, rv32imcu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E31,   rv32imacu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E34,   rv32imafcu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U34,   rv32gcsu_priv1_10_0_cpu_init),
-- 
2.26.2

[PATCH v5 11/11] target/riscv: Use a smaller guess size for no-MMU PMP

2020-05-28 Thread Alistair Francis

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
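Reviewer note, not part of the commit: a minimal standalone sketch of the
size guess this patch changes. It assumes 4 KiB pages and a 32-bit
target_ulong; guess_pmp_size() and has_mmu are hypothetical names, the
latter standing in for riscv_feature(env, RISCV_FEATURE_MMU).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 32-bit target. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_MASK (~(((uint32_t)1 << SKETCH_PAGE_BITS) - 1))

/*
 * Hypothetical helper mirroring the logic in pmp_hart_has_privs():
 * pick the number of bytes to check when the access size may be unknown.
 */
static uint32_t guess_pmp_size(uint32_t addr, uint32_t size, bool has_mmu)
{
    if (size != 0) {
        return size;                 /* size is known, use it as is */
    }
    if (has_mmu) {
        /* Bytes from addr up to the end of the page containing it. */
        return -(addr | SKETCH_PAGE_MASK);
    }
    return sizeof(uint32_t);         /* no MMU: guess one target word */
}

/* Small self-check doubling as a usage example. */
void sketch_pmp_selfcheck(void)
{
    /* 0x1004 is 4 bytes into its page, so 4096 - 4 bytes remain. */
    assert(guess_pmp_size(0x1004, 0, true) == 4092);
}
```

With the page-sized guess, a no-MMU core whose PMP regions are smaller
than a page would have accesses checked against far more bytes than they
touch, which is the case the smaller guess avoids.
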
 target/riscv/pmp.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
index 0e6b640fbd..9418660f1b 100644
--- a/target/riscv/pmp.c
+++ b/target/riscv/pmp.c
@@ -233,12 +233,16 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
 return true;
 }
 
-/*
- * if size is unknown (0), assume that all bytes
- * from addr to the end of the page will be accessed.
- */
 if (size == 0) {
-pmp_size = -(addr | TARGET_PAGE_MASK);
+if (riscv_feature(env, RISCV_FEATURE_MMU)) {
+/*
+ * If size is unknown (0), assume that all bytes
+ * from addr to the end of the page will be accessed.
+ */
+pmp_size = -(addr | TARGET_PAGE_MASK);
+} else {
+pmp_size = sizeof(target_ulong);
+}
 } else {
 pmp_size = size;
 }
-- 
2.26.2

[PATCH v5 07/11] hw/char: Initial commit of Ibex UART

2020-05-28 Thread Alistair Francis

This is the initial commit of the Ibex UART device. Serial TX is
working, while RX has been implemented but is untested.

This is based on the documentation from:
https://docs.opentitan.org/hw/ip/uart/doc/

Signed-off-by: Alistair Francis 
---
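Reviewer note, not part of the commit: a standalone sketch of how the
mask and shift pairs in ibex_uart.h are meant to be used to pull a field
out of the FIFO_CTRL register. The four macros are copied from the header
added below; the sketch_* accessors are hypothetical and do not appear in
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/hw/char/ibex_uart.h in this patch. */
#define FIFO_CTRL_RXILVL        (7 << 2)
#define FIFO_CTRL_RXILVL_SHIFT  (2)
#define FIFO_CTRL_TXILVL        (3 << 5)
#define FIFO_CTRL_TXILVL_SHIFT  (5)

/* Extract the 3-bit RX interrupt level field (bits 4:2). */
static uint32_t sketch_rxilvl(uint32_t fifo_ctrl)
{
    return (fifo_ctrl & FIFO_CTRL_RXILVL) >> FIFO_CTRL_RXILVL_SHIFT;
}

/* Extract the 2-bit TX interrupt level field (bits 6:5). */
static uint32_t sketch_txilvl(uint32_t fifo_ctrl)
{
    return (fifo_ctrl & FIFO_CTRL_TXILVL) >> FIFO_CTRL_TXILVL_SHIFT;
}

/* Small self-check doubling as a usage example. */
void sketch_fifo_ctrl_selfcheck(void)
{
    uint32_t reg = (5u << 2) | (2u << 5);

    assert(sketch_rxilvl(reg) == 5);
    assert(sketch_txilvl(reg) == 2);
}
```

Masking before shifting keeps the reset bits (RXRST, TXRST) in the low
two positions from leaking into either field.
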
 include/hw/char/ibex_uart.h | 110 
 hw/char/ibex_uart.c | 492 
 MAINTAINERS |   2 +
 hw/char/Makefile.objs   |   1 +
 hw/riscv/Kconfig|   4 +
 5 files changed, 609 insertions(+)
 create mode 100644 include/hw/char/ibex_uart.h
 create mode 100644 hw/char/ibex_uart.c

diff --git a/include/hw/char/ibex_uart.h b/include/hw/char/ibex_uart.h
new file mode 100644
index 0000000000..2bec772615
--- /dev/null
+++ b/include/hw/char/ibex_uart.h
@@ -0,0 +1,110 @@
+/*
+ * QEMU lowRISC Ibex UART device
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef HW_IBEX_UART_H
+#define HW_IBEX_UART_H
+
+#include "hw/sysbus.h"
+#include "chardev/char-fe.h"
+#include "qemu/timer.h"
+
+#define IBEX_UART_INTR_STATE   0x00
+#define INTR_STATE_TX_WATERMARK (1 << 0)
+#define INTR_STATE_RX_WATERMARK (1 << 1)
+#define INTR_STATE_TX_EMPTY (1 << 2)
+#define INTR_STATE_RX_OVERFLOW  (1 << 3)
+#define IBEX_UART_INTR_ENABLE  0x04
+#define IBEX_UART_INTR_TEST0x08
+
+#define IBEX_UART_CTRL 0x0c
+#define UART_CTRL_TX_ENABLE (1 << 0)
+#define UART_CTRL_RX_ENABLE (1 << 1)
+#define UART_CTRL_NF(1 << 2)
+#define UART_CTRL_SLPBK (1 << 4)
+#define UART_CTRL_LLPBK (1 << 5)
+#define UART_CTRL_PARITY_EN (1 << 6)
+#define UART_CTRL_PARITY_ODD(1 << 7)
+#define UART_CTRL_RXBLVL(3 << 8)
+#define UART_CTRL_NCO   (0xFFFF << 16)
+
+#define IBEX_UART_STATUS   0x10
+#define UART_STATUS_TXFULL  (1 << 0)
+#define UART_STATUS_RXFULL  (1 << 1)
+#define UART_STATUS_TXEMPTY (1 << 2)
+#define UART_STATUS_RXIDLE  (1 << 4)
+#define UART_STATUS_RXEMPTY (1 << 5)
+
+#define IBEX_UART_RDATA0x14
+#define IBEX_UART_WDATA0x18
+
+#define IBEX_UART_FIFO_CTRL0x1c
+#define FIFO_CTRL_RXRST  (1 << 0)
+#define FIFO_CTRL_TXRST  (1 << 1)
+#define FIFO_CTRL_RXILVL (7 << 2)
+#define FIFO_CTRL_RXILVL_SHIFT   (2)
+#define FIFO_CTRL_TXILVL (3 << 5)
+#define FIFO_CTRL_TXILVL_SHIFT   (5)
+
+#define IBEX_UART_FIFO_STATUS  0x20
+#define IBEX_UART_OVRD 0x24
+#define IBEX_UART_VAL  0x28
+#define IBEX_UART_TIMEOUT_CTRL 0x2c
+
+#define IBEX_UART_TX_FIFO_SIZE 16
+
+#define TYPE_IBEX_UART "ibex-uart"
+#define IBEX_UART(obj) \
+OBJECT_CHECK(IbexUartState, (obj), TYPE_IBEX_UART)
+
+typedef struct {
+/* <private> */
+SysBusDevice parent_obj;
+
+/* <public> */
+MemoryRegion mmio;
+
+uint8_t tx_fifo[IBEX_UART_TX_FIFO_SIZE];
+uint32_t tx_level;
+
+QEMUTimer *fifo_trigger_handle;
+uint64_t char_tx_time;
+
+uint32_t uart_intr_state;
+uint32_t uart_intr_enable;
+uint32_t uart_ctrl;
+uint32_t uart_status;
+uint32_t uart_rdata;
+uint32_t uart_fifo_ctrl;
+uint32_t uart_fifo_status;
+uint32_t uart_ovrd;
+uint32_t uart_val;
+uint32_t uart_timeout_ctrl;
+
+CharBackend chr;
+qemu_irq tx_watermark;
+qemu_irq rx_watermark;
+qemu_irq tx_empty;
+qemu_irq rx_overflow;
+} IbexUartState;
+#endif /* HW_IBEX_UART_H */
diff --git a/hw/char/ibex_uart.c b/hw/char/ibex_uart.c
new file mode 100644
index 0000000000..c416325d73
--- /dev/null
+++ b/hw/char/ibex_uart.c
@@ -0,0 +1,492 @@
+/*
+ * QEMU lowRISC Ibex UART device
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * For details check the documentation here:
+ *https://docs.opentitan.org/hw/ip/uart/doc/
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"),

[PATCH v5 04/11] target/riscv: Don't set PMP feature in the cpu init

2020-05-28 Thread Alistair Francis

The PMP is enabled by default via the "pmp" property so there is no need
for us to set it in the init function. As all CPUs have PMP support just
remove the set_feature() call in the CPU init functions.

Signed-off-by: Alistair Francis 
---
 target/riscv/cpu.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8deba3d16d..406e8f37d7 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -142,7 +142,6 @@ static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
@@ -151,7 +150,6 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv32imacu_nommu_cpu_init(Object *obj)
@@ -160,7 +158,6 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVC | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
@@ -170,7 +167,6 @@ static void rv32imafcu_nommu_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVC | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
@@ -190,7 +186,6 @@ static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
@@ -199,7 +194,6 @@ static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 }
 
 static void rv64imacu_nommu_cpu_init(Object *obj)
@@ -208,7 +202,6 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVC | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_PMP);
 qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
-- 
2.26.2

[PATCH v5 03/11] target/riscv: Disable the MMU correctly

2020-05-28 Thread Alistair Francis

Previously if we didn't enable the MMU it would be enabled in the
realize() function anyway. Let's ensure that if we don't want the MMU we
disable it. We also don't need to enable the MMU as it will be enabled
in realize() by default.

Signed-off-by: Alistair Francis 
---
 target/riscv/cpu.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5eb3c02735..8deba3d16d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -142,7 +142,6 @@ static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -152,7 +151,6 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -163,6 +161,7 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 static void rv32imafcu_nommu_cpu_init(Object *obj)
@@ -172,6 +171,7 @@ static void rv32imafcu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 #elif defined(TARGET_RISCV64)
@@ -190,7 +190,6 @@ static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -200,7 +199,6 @@ static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -211,6 +209,7 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 #endif
-- 
2.26.2
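
The behaviour described in the commit message above can be illustrated with a toy Python model. This is not QEMU code: the class, attribute and function names are invented for illustration. It assumes, as the commit message states, that the "mmu" property defaults to on and that realize() turns the property into a feature bit, so a no-MMU CPU has to clear the property in its init function.

```python
class ToyCPU:
    """Toy stand-in for a QOM CPU object; all names here are hypothetical."""

    def __init__(self, init_fn):
        self.cfg = {"mmu": True}   # assumed property default: MMU enabled
        self.features = set()
        init_fn(self)              # the per-model init function runs first

    def realize(self):
        # realize() derives the feature bit from the property value.
        if self.cfg["mmu"]:
            self.features.add("MMU")


def gcsu_init(cpu):
    # Nothing to do: the default property value already enables the MMU.
    pass


def nommu_init(cpu):
    # Explicitly clear the property so realize() does not enable the MMU.
    cpu.cfg["mmu"] = False


with_mmu = ToyCPU(gcsu_init)
with_mmu.realize()

without_mmu = ToyCPU(nommu_init)
without_mmu.realize()
```

In this model, leaving out the assignment in nommu_init() would give both CPUs the MMU feature after realize(), which is the problem the patch addresses.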

[PATCH v5 01/11] riscv/boot: Add a missing header include

2020-05-28 Thread Alistair Francis

As the functions declared in this header use the symbol_fn_t
typedef itself declared in "hw/loader.h", we need to include
it here to make the header file self-contained.

Signed-off-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bin Meng 
---
 include/hw/riscv/boot.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
index 474a940ad5..9daa98da08 100644
--- a/include/hw/riscv/boot.h
+++ b/include/hw/riscv/boot.h
@@ -21,6 +21,7 @@
 #define RISCV_BOOT_H
 
 #include "exec/cpu-defs.h"
+#include "hw/loader.h"
 
 void riscv_find_and_load_firmware(MachineState *machine,
   const char *default_machine_firmware,
-- 
2.26.2

[PATCH v5 00/11] RISC-V Add the OpenTitan Machine

2020-05-28 Thread Alistair Francis

OpenTitan is an open source silicon Root of Trust (RoT) project. This
series adds initial support for the OpenTitan machine to QEMU.

This series adds the Ibex CPU to the QEMU RISC-V target. It then adds the
OpenTitan machine, the Ibex UART and the Ibex PLIC.

The UART has been tested sending and receiving data.

With this series QEMU can boot the OpenTitan ROM, Tock OS and a Tock
userspace app.

The Ibex PLIC is similar to the RISC-V PLIC (and is based on the QEMU
implementation) with some differences. The hope is that the Ibex PLIC
will converge to follow the RISC-V spec. As that happens I want to
update the QEMU Ibex PLIC and hopefully eventually replace the current
PLIC as the implementation is a little overly complex.

For more details on OpenTitan, see here: https://docs.opentitan.org/

v5:
 - Add some of the missing unimplemented devices
 - Don't set PMP feature in init() function
v4:
 - Don't set the reset vector in realise
 - Fix a bug where the MMU is always enabled
 - Fixup the PMP/MMU size logic
v3:
 - Small fixes pointed out in review
v2:
 - Rebase on master
 - Get uart receive working



Alistair Francis (11):
  riscv/boot: Add a missing header include
  target/riscv: Don't overwrite the reset vector
  target/riscv: Disable the MMU correctly
  target/riscv: Don't set PMP feature in the cpu init
  target/riscv: Add the lowRISC Ibex CPU
  riscv: Initial commit of OpenTitan machine
  hw/char: Initial commit of Ibex UART
  hw/intc: Initial commit of lowRISC Ibex PLIC
  riscv/opentitan: Connect the PLIC device
  riscv/opentitan: Connect the UART device
  target/riscv: Use a smaller guess size for no-MMU PMP

 default-configs/riscv32-softmmu.mak |   1 +
 default-configs/riscv64-softmmu.mak |  11 +-
 include/hw/char/ibex_uart.h | 110 +++
 include/hw/intc/ibex_plic.h |  63 
 include/hw/riscv/boot.h |   1 +
 include/hw/riscv/opentitan.h|  84 +
 target/riscv/cpu.h  |   1 +
 hw/char/ibex_uart.c | 492 
 hw/intc/ibex_plic.c | 261 +++
 hw/riscv/opentitan.c| 219 +
 target/riscv/cpu.c  |  27 +-
 target/riscv/pmp.c  |  14 +-
 MAINTAINERS |  13 +
 hw/char/Makefile.objs   |   1 +
 hw/intc/Makefile.objs   |   1 +
 hw/riscv/Kconfig|   9 +
 hw/riscv/Makefile.objs  |   1 +
 17 files changed, 1291 insertions(+), 18 deletions(-)
 create mode 100644 include/hw/char/ibex_uart.h
 create mode 100644 include/hw/intc/ibex_plic.h
 create mode 100644 include/hw/riscv/opentitan.h
 create mode 100644 hw/char/ibex_uart.c
 create mode 100644 hw/intc/ibex_plic.c
 create mode 100644 hw/riscv/opentitan.c

-- 
2.26.2

[PATCH 3/4] python/qemu: delint and add pylintrc

2020-05-28 Thread John Snow

Bring these files up to speed with pylint 2.5.0.
Add a pylintrc file to formalize which pylint subset
we are targeting.

The similarity ignore is there to suppress similarity
reports across imports, which for typing constants,
are going to trigger this report erroneously.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
---
 python/qemu/machine.py |  6 ++---
 python/qemu/pylintrc   | 58 ++
 python/qemu/qtest.py   | 42 +++---
 3 files changed, 88 insertions(+), 18 deletions(-)
 create mode 100644 python/qemu/pylintrc

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index e3ea5235713..c79fc8fb89a 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -58,7 +58,7 @@ def __init__(self, reply):
 self.reply = reply
 
 
-class QEMUMachine(object):
+class QEMUMachine:
 """
 A QEMU VM
 
@@ -242,7 +242,7 @@ def _base_args(self):
  'chardev=mon,mode=control'])
 if self._machine is not None:
 args.extend(['-machine', self._machine])
-for i in range(self._console_index):
+for _ in range(self._console_index):
 args.extend(['-serial', 'null'])
 if self._console_set:
 self._console_address = os.path.join(self._sock_dir,
@@ -383,7 +383,7 @@ def shutdown(self, has_quit: bool = False) -> None:
 command = ' '.join(self._qemu_full_args)
 else:
 command = ''
-LOG.warning(msg, -exitcode, command)
+LOG.warning(msg, -int(exitcode), command)
 
 self._launched = False
 
diff --git a/python/qemu/pylintrc b/python/qemu/pylintrc
new file mode 100644
index 000..5d6ae7367d8
--- /dev/null
+++ b/python/qemu/pylintrc
@@ -0,0 +1,58 @@
+[MASTER]
+
+[MESSAGES CONTROL]
+
+# Disable the message, report, category or checker with the given id(s). You
+# can either give multiple identifiers separated by comma (,) or put this
+# option multiple times (only on the command line, not in the configuration
+# file where it should appear only once). You can also use "--disable=all" to
+# disable everything first and then reenable specific checks. For example, if
+# you want to run only the similarities checker, you can use "--disable=all
+# --enable=similarities". If you want to run only the classes checker, but have
+# no Warning level messages displayed, use "--disable=all --enable=classes
+# --disable=W".
+disable=too-many-arguments,
+too-many-instance-attributes,
+too-many-public-methods,
+
+[REPORTS]
+
+[REFACTORING]
+
+[MISCELLANEOUS]
+
+[LOGGING]
+
+[BASIC]
+
+# Good variable names which should always be accepted, separated by a comma.
+good-names=i,
+   j,
+   k,
+   ex,
+   Run,
+   _,
+   fd,
+
+[VARIABLES]
+
+[STRING]
+
+[SPELLING]
+
+[FORMAT]
+
+[SIMILARITIES]
+
+# Ignore imports when computing similarities.
+ignore-imports=yes
+
+[TYPECHECK]
+
+[CLASSES]
+
+[IMPORTS]
+
+[DESIGN]
+
+[EXCEPTIONS]
diff --git a/python/qemu/qtest.py b/python/qemu/qtest.py
index d24ad04256b..53d814c0641 100644
--- a/python/qemu/qtest.py
+++ b/python/qemu/qtest.py
@@ -1,5 +1,11 @@
-# QEMU qtest library
-#
+"""
+QEMU qtest library
+
+qtest offers the QEMUQtestProtocol and QEMUQTestMachine classes, which
+offer a connection to QEMU's qtest protocol socket, and a qtest-enabled
+subclass of QEMUMachine, respectively.
+"""
+
 # Copyright (C) 2015 Red Hat Inc.
 #
 # Authors:
@@ -17,19 +23,21 @@
 from .machine import QEMUMachine
 
 
-class QEMUQtestProtocol(object):
+class QEMUQtestProtocol:
+"""
+QEMUQtestProtocol implements a connection to a qtest socket.
+
+:param address: QEMU address, can be either a unix socket path (string)
+or a tuple in the form ( address, port ) for a TCP
+connection
+:param server: server mode, listens on the socket (bool)
+:raise socket.error: on socket connection errors
+
+.. note::
+   No connection is established by __init__(), this is done
+   by the connect() or accept() methods.
+"""
 def __init__(self, address, server=False):
-"""
-Create a QEMUQtestProtocol object.
-
-@param address: QEMU address, can be either a unix socket path (string)
-or a tuple in the form ( address, port ) for a TCP
-connection
-@param server: server mode, listens on the socket (bool)
-@raise socket.error on socket connection errors
-@note No connection is established, this is done by the connect() or
-  accept() methods
-"""
 self._address = address
 self._sock = self._get_sock()
 self._sockfile = None
@@ -73,15 +81,19 @@ def cmd(self, qtest_cmd):
 return resp
 
 def close(self):
+"""Close this socket."""
 self._sock.close()
 self._sockfile.close()

[PATCH 4/4] python/qemu: delint; add flake8 config

2020-05-28 Thread John Snow

Mostly, ignore the "no bare except" rule, because flake8 is not
contextual and cannot determine if we re-raise. Pylint can, though, so
always prefer pylint for that.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
---
 python/qemu/.flake8|  2 ++
 python/qemu/accel.py   |  9 ++---
 python/qemu/machine.py | 13 +
 python/qemu/qmp.py |  4 ++--
 4 files changed, 19 insertions(+), 9 deletions(-)
 create mode 100644 python/qemu/.flake8

diff --git a/python/qemu/.flake8 b/python/qemu/.flake8
new file mode 100644
index 000..45d8146f3f5
--- /dev/null
+++ b/python/qemu/.flake8
@@ -0,0 +1,2 @@
+[flake8]
+extend-ignore = E722  # Pylint handles this, but smarter.
\ No newline at end of file
diff --git a/python/qemu/accel.py b/python/qemu/accel.py
index 36ae85791ee..7fabe629208 100644
--- a/python/qemu/accel.py
+++ b/python/qemu/accel.py
@@ -23,11 +23,12 @@
 # Mapping host architecture to any additional architectures it can
 # support which often includes its 32 bit cousin.
 ADDITIONAL_ARCHES = {
-"x86_64" : "i386",
-"aarch64" : "armhf",
-"ppc64le" : "ppc64",
+"x86_64": "i386",
+"aarch64": "armhf",
+"ppc64le": "ppc64",
 }
 
+
 def list_accel(qemu_bin):
 """
 List accelerators enabled in the QEMU binary.
@@ -47,6 +48,7 @@ def list_accel(qemu_bin):
 # Skip the first line which is the header.
 return [acc.strip() for acc in out.splitlines()[1:]]
 
+
 def kvm_available(target_arch=None, qemu_bin=None):
 """
 Check if KVM is available using the following heuristic:
@@ -69,6 +71,7 @@ def kvm_available(target_arch=None, qemu_bin=None):
 return False
 return True
 
+
 def tcg_available(qemu_bin):
 """
 Check if TCG is available.
diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index c79fc8fb89a..4b260fa2cb2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -29,6 +29,7 @@
 
 LOG = logging.getLogger(__name__)
 
+
 class QEMUMachineError(Exception):
 """
 Exception called when an error in QEMUMachine happens.
@@ -62,7 +63,8 @@ class QEMUMachine:
 """
 A QEMU VM
 
-Use this object as a context manager to ensure the QEMU process terminates::
+Use this object as a context manager to ensure
+the QEMU process terminates::
 
 with VM(binary) as vm:
 ...
@@ -188,8 +190,10 @@ def send_fd_scm(self, fd=None, file_path=None):
 fd_param.append(str(fd))
 
 devnull = open(os.path.devnull, 'rb')
-proc = subprocess.Popen(fd_param, stdin=devnull, stdout=subprocess.PIPE,
-stderr=subprocess.STDOUT, close_fds=False)
+proc = subprocess.Popen(
+fd_param, stdin=devnull, stdout=subprocess.PIPE,
+stderr=subprocess.STDOUT, close_fds=False
+)
 output = proc.communicate()[0]
 if output:
 LOG.debug(output)
@@ -491,7 +495,8 @@ def event_wait(self, name, timeout=60.0, match=None):
 
 def events_wait(self, events, timeout=60.0):
 """
-events_wait waits for and returns a named event from QMP with a timeout.
+events_wait waits for and returns a named event
+from QMP with a timeout.
 
 events: a sequence of (name, match_criteria) tuples.
 The match criteria are optional and may be None.
diff --git a/python/qemu/qmp.py b/python/qemu/qmp.py
index d6c9b2f4b12..6ae7693965a 100644
--- a/python/qemu/qmp.py
+++ b/python/qemu/qmp.py
@@ -168,8 +168,8 @@ def accept(self, timeout=15.0):
 
 @param timeout: timeout in seconds (nonnegative float number, or
 None). The value passed will set the behavior of the
-underneath QMP socket as described in [1]. Default value
-is set to 15.0.
+underneath QMP socket as described in [1].
+Default value is set to 15.0.
 @return QMP greeting dict
 @raise OSError on socket connection errors
 @raise QMPConnectError if the greeting is not received
-- 
2.21.3
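
A minimal sketch of the kind of bare except the commit message above refers to: flake8 reports E722 on the `except:` line regardless of what the handler does, even though the handler re-raises. The class and function names are invented for illustration and are not taken from python/qemu.

```python
class Resource:
    """Hypothetical resource that must be closed when an operation fails."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def use_resource(resource, action):
    try:
        return action(resource)
    except:  # flake8 reports E722 here without inspecting the body
        resource.close()  # clean up on any failure...
        raise             # ...then re-raise, so nothing is swallowed


res = Resource()
caught = False
try:
    use_resource(res, lambda r: 1 / 0)
except ZeroDivisionError:
    # The original exception still reaches the caller.
    caught = True
```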

[PATCH v5 02/11] target/riscv: Don't overwrite the reset vector

2020-05-28 Thread Alistair Francis

The reset vector is set in the init function; don't set it again in
realize.

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 target/riscv/cpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 059d71f2c7..5eb3c02735 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -133,6 +133,7 @@ static void riscv_base32_cpu_init(Object *obj)
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 /* We set this in the realise function */
 set_misa(env, 0);
+set_resetvec(env, DEFAULT_RSTVEC);
 }
 
 static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
@@ -180,6 +181,7 @@ static void riscv_base64_cpu_init(Object *obj)
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 /* We set this in the realise function */
 set_misa(env, 0);
+set_resetvec(env, DEFAULT_RSTVEC);
 }
 
 static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
@@ -399,7 +401,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 }
 
 set_priv_version(env, priv_version);
-set_resetvec(env, DEFAULT_RSTVEC);
 
 if (cpu->cfg.mmu) {
 set_feature(env, RISCV_FEATURE_MMU);
-- 
2.26.2

[PATCH 0/4] python: pylint and flake8 support

2020-05-28 Thread John Snow

This is a quick series to delint the files under python/qemu, with one
extra fix outside of that domain.

This was split out from my longer series attempting to package
python/qemu. This part is a nice standalone chunk.

John Snow (4):
  scripts/qmp: Fix shebang and imports
  python/machine.py: remove bare except
  python/qemu: delint and add pylintrc
  python/qemu: delint; add flake8 config

 python/qemu/.flake8|  2 ++
 python/qemu/accel.py   |  9 ---
 python/qemu/machine.py | 52 +++--
 python/qemu/pylintrc   | 58 ++
 python/qemu/qmp.py |  4 +--
 python/qemu/qtest.py   | 42 +++---
 scripts/qmp/qmp|  4 ++-
 scripts/qmp/qom-fuse   |  4 ++-
 scripts/qmp/qom-get|  6 +++--
 scripts/qmp/qom-list   |  6 +++--
 scripts/qmp/qom-set|  6 +++--
 scripts/qmp/qom-tree   |  6 +++--
 12 files changed, 150 insertions(+), 49 deletions(-)
 create mode 100644 python/qemu/.flake8
 create mode 100644 python/qemu/pylintrc

-- 
2.21.3

[PATCH 1/4] scripts/qmp: Fix shebang and imports

2020-05-28 Thread John Snow

There's more wrong with these scripts; they are in various stages of
disrepair. That's beyond the scope of this current patchset.

This just mechanically corrects the imports and the shebangs, as part of
ensuring that the python/qemu/lib refactoring didn't break anything
needlessly.

Signed-off-by: John Snow 
---
 scripts/qmp/qmp  | 4 +++-
 scripts/qmp/qom-fuse | 4 +++-
 scripts/qmp/qom-get  | 6 --
 scripts/qmp/qom-list | 6 --
 scripts/qmp/qom-set  | 6 --
 scripts/qmp/qom-tree | 6 --
 6 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/scripts/qmp/qmp b/scripts/qmp/qmp
index 0625fc2abac..8e52e4a54de 100755
--- a/scripts/qmp/qmp
+++ b/scripts/qmp/qmp
@@ -11,7 +11,9 @@
 # See the COPYING file in the top-level directory.
 
 import sys, os
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 def print_response(rsp, prefix=[]):
 if type(rsp) == list:
diff --git a/scripts/qmp/qom-fuse b/scripts/qmp/qom-fuse
index 6bada2c33d3..5fa6b3bf64d 100755
--- a/scripts/qmp/qom-fuse
+++ b/scripts/qmp/qom-fuse
@@ -15,7 +15,9 @@ import fuse, stat
 from fuse import Fuse
 import os, posix
 from errno import *
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 fuse.fuse_python_api = (0, 2)
 
diff --git a/scripts/qmp/qom-get b/scripts/qmp/qom-get
index 007b4cd442e..666df718320 100755
--- a/scripts/qmp/qom-get
+++ b/scripts/qmp/qom-get
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python3
 ##
 # QEMU Object Model test tools
 #
@@ -13,7 +13,9 @@
 
 import sys
 import os
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 cmd, args = sys.argv[0], sys.argv[1:]
 socket_path = None
diff --git a/scripts/qmp/qom-list b/scripts/qmp/qom-list
index 03bda3446b7..5074fd939f4 100755
--- a/scripts/qmp/qom-list
+++ b/scripts/qmp/qom-list
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python3
 ##
 # QEMU Object Model test tools
 #
@@ -13,7 +13,9 @@
 
 import sys
 import os
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 cmd, args = sys.argv[0], sys.argv[1:]
 socket_path = None
diff --git a/scripts/qmp/qom-set b/scripts/qmp/qom-set
index c37fe78b000..240a78187f9 100755
--- a/scripts/qmp/qom-set
+++ b/scripts/qmp/qom-set
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python3
 ##
 # QEMU Object Model test tools
 #
@@ -13,7 +13,9 @@
 
 import sys
 import os
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 cmd, args = sys.argv[0], sys.argv[1:]
 socket_path = None
diff --git a/scripts/qmp/qom-tree b/scripts/qmp/qom-tree
index 1c8acf61e79..25b0781323c 100755
--- a/scripts/qmp/qom-tree
+++ b/scripts/qmp/qom-tree
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python3
 ##
 # QEMU Object Model test tools
 #
@@ -15,7 +15,9 @@
 
 import sys
 import os
-from qmp import QEMUMonitorProtocol
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from qemu.qmp import QEMUMonitorProtocol
 
 cmd, args = sys.argv[0], sys.argv[1:]
 socket_path = None
-- 
2.21.3
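
The import fix in the patch above relies on building a path relative to the script itself. A minimal sketch of that path computation follows; the script location is a made-up example path, nothing is actually imported, and the os.path.normpath() call is added here only to make the result easier to read (the patch does not use it).

```python
import os


def python_lib_dir(script_path):
    """Return the 'python' directory two levels above the script's directory."""
    return os.path.normpath(
        os.path.join(os.path.dirname(script_path), '..', '..', 'python'))


# Hypothetical checkout location, used only for illustration.
example = python_lib_dir(os.path.join('/src', 'qemu', 'scripts', 'qmp', 'qom-get'))
```

Appending a directory computed this way to sys.path is what lets `from qemu.qmp import QEMUMonitorProtocol` resolve no matter which directory the script is started from.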

[PATCH 2/4] python/machine.py: remove bare except

2020-05-28 Thread John Snow

Catch only the timeout error; if there are other problems, allow the
stack trace to be visible.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
---
 python/qemu/machine.py | 33 +
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index b9a98e2c862..e3ea5235713 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -342,7 +342,26 @@ def wait(self):
 self._load_io_log()
 self._post_shutdown()
 
-def shutdown(self, has_quit=False):
+def _issue_shutdown(self, has_quit: bool = False) -> None:
+"""
+Shutdown the VM.
+"""
+if not self.is_running():
+return
+
+if self._qmp is not None:
+if not has_quit:
+self._qmp.cmd('quit')
+self._qmp.close()
+
+try:
+self._popen.wait(timeout=3)
+except subprocess.TimeoutExpired:
+self._popen.kill()
+
+self._popen.wait()
+
+def shutdown(self, has_quit: bool = False) -> None:
 """
 Terminate the VM and clean up
 """
@@ -353,17 +372,7 @@ def shutdown(self, has_quit=False):
 self._console_socket.close()
 self._console_socket = None
 
-if self.is_running():
-if self._qmp:
-try:
-if not has_quit:
-self._qmp.cmd('quit')
-self._qmp.close()
-self._popen.wait(timeout=3)
-except:
-self._popen.kill()
-self._popen.wait()
-
+self._issue_shutdown(has_quit)
 self._load_io_log()
 self._post_shutdown()
 
-- 
2.21.3
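
A self-contained sketch of the pattern this patch moves to: wait for the child with a timeout and catch only subprocess.TimeoutExpired, so any other error propagates with its stack trace. The helper name and the sleeping child process are stand-ins chosen for illustration; they are not part of python/qemu.

```python
import subprocess
import sys


def stop_process(proc, timeout):
    """Wait for proc to exit; kill it only if the wait times out."""
    killed = False
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Only the timeout is handled; other exceptions are not hidden.
        proc.kill()
        killed = True
    proc.wait()  # reap the process in either case
    return killed


# Stand-in for a VM that does not exit on its own within the timeout.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
was_killed = stop_process(child, timeout=0.2)
```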

Re: [PATCH v7 7/8] qdev-properties: add getter for size32 and blocksize

2020-05-28 Thread Eric Blake


On 5/28/20 4:39 PM, Roman Kagan wrote:

Add getter for size32, and use it for blocksize, too.

In its human-readable branch, it reports approximate size in
human-readable units next to the exact byte value, like the getter for
64bit size does.

Adjust the expected test output accordingly.

Signed-off-by: Roman Kagan 
---
v6 -> v7:
- split out into separate patch [Eric]


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
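
To make the "exact byte value plus approximate human-readable size" output concrete, here is a small Python sketch. The output shape follows the expected test output in the patch (for example "512 (512 B)"); the scaling loop is a simplification written for this note and is not QEMU's helper, and the function names are hypothetical.

```python
def approx_size(value):
    """Return an approximate size string using binary (1024-based) units."""
    suffixes = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"]
    scaled = float(value)
    index = 0
    while scaled >= 1024 and index < len(suffixes) - 1:
        scaled /= 1024
        index += 1
    # Three significant digits, so 4294967295 is shown as roughly 4 GiB.
    return f"{scaled:.3g} {suffixes[index]}B"


def describe_size32(value):
    """Exact byte value followed by the approximate size in parentheses."""
    return f"{value} ({approx_size(value)})"


block = describe_size32(512)
unset = describe_size32(0)
all_ones = describe_size32(4294967295)
```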

Re: [PATCH v7 6/8] block: make BlockConf size props 32bit and accept size suffixes

2020-05-28 Thread Eric Blake


On 5/28/20 4:39 PM, Roman Kagan wrote:

Convert all size-related properties in BlockConf to 32bit.  This will
allow to accomodate bigger block sizes (in a followup patch).


s/allow to accomodate/accommodate/


This also allows to make them all accept size suffixes, either via
DEFINE_PROP_BLOCKSIZE or via DEFINE_PROP_SIZE32.

Also, since min_io_size is exposed to the guest by scsi and virtio-blk
devices as an uint16_t in units of logical blocks, introduce an
additional check in blkconf_blocksizes to prevent its silent truncation.

Signed-off-by: Roman Kagan 
---



+if (conf->min_io_size / conf->logical_block_size > UINT16_MAX) {
+error_setg(errp,
+   "min_io_size must not exceed " stringify(UINT16_MAX)
+   " logical blocks");


On my libc, this results in "must not exceed (65535) logical blocks".

Worse, I could envision a platform where it prints something funky like:

"exceed (2 * (32768) + 1) logical", based on however complex the 
definition of UINT16_MAX is.  You're better off printing this one with 
%d than with stringify().


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
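
A Python sketch of the check under discussion, formatting the limit as a number (the analogue of using %d) rather than pasting the macro text into the message. The function name and the use of ValueError are assumptions made for this illustration; only the comparison and the message wording come from the quoted hunk.

```python
UINT16_MAX = 0xFFFF


def check_min_io_size(min_io_size, logical_block_size):
    """Reject min_io_size values that would not fit in 16 bits of blocks."""
    if min_io_size // logical_block_size > UINT16_MAX:
        # Format the numeric value so the message is the same everywhere.
        raise ValueError(
            "min_io_size must not exceed %d logical blocks" % UINT16_MAX)


check_min_io_size(512 * 65535, 512)   # exactly at the limit: accepted

try:
    check_min_io_size(512 * 65536, 512)
    message = None
except ValueError as err:
    message = str(err)
```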

Re: [PATCH v7 5/8] qdev-properties: make blocksize accept size suffixes

2020-05-28 Thread Eric Blake


On 5/28/20 4:39 PM, Roman Kagan wrote:

It appears convenient to be able to specify physical_block_size and
logical_block_size using common size suffixes.

Teach the blocksize property setter to interpret them.  Also express the
upper and lower limits in the respective units.

Signed-off-by: Roman Kagan 
---


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v7 4/8] qdev-properties: add size32 property type

2020-05-28 Thread Eric Blake


On 5/28/20 4:39 PM, Roman Kagan wrote:

Introduce size32 property type which handles size suffixes (k, m) just
like size property, but is uint32_t rather than uint64_t.


Does it handle 'g' as well? (even though the set of valid 32-bit sizes 
with a g suffix is rather small ;)



 It's going to
be useful for properties that are byte sizes but are inherently 32bit,
like BlkConf.opt_io_size or .discard_granularity (they are switched to
this new property type in a followup commit).

The getter for size32 is left out for a separate patch as its benefit is
less obvious, and it affects test output; for now the regular uint32
getter is used.

Signed-off-by: Roman Kagan 
---




+static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value;
+Error *local_err = NULL;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_size(v, name, &value, &local_err);


Yes, it does.

Whether or not the commit message is tweaked,
Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
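
A simplified Python sketch of a 32-bit size property setter: parse an optional k/m/g suffix, then reject anything that does not fit in uint32_t. This is not QEMU's parser, which is reached through visit_type_size() and may accept more input forms; the function name and error type are assumptions for illustration.

```python
UINT32_MAX = 0xFFFFFFFF
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size32(text):
    """Parse '<number>[k|m|g]' and require the result to fit in 32 bits."""
    text = text.strip().lower()
    multiplier = 1
    if text and text[-1] in SUFFIXES:
        multiplier = SUFFIXES[text[-1]]
        text = text[:-1]
    value = int(text) * multiplier
    if value < 0 or value > UINT32_MAX:
        raise ValueError("value %d does not fit in 32 bits" % value)
    return value


four_k = parse_size32("4k")
three_g = parse_size32("3g")   # 1g, 2g and 3g are the only whole 'g' values that fit

try:
    parse_size32("4g")
    overflowed = False
except ValueError:
    overflowed = True
```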

Re: [PATCH 1/2] sev: add sev-inject-launch-secret

2020-05-28 Thread Eric Blake


On 5/28/20 3:51 PM, Tobin Feldman-Fitzthum wrote:

From: Tobin Feldman-Fitzthum 

AMD SEV allows a guest owner to inject a secret blob
into the memory of a virtual machine. The secret is
encrypted with the SEV Transport Encryption Key and
integrity is guaranteed with the Transport Integrity
Key. Although QEMU facilitates the injection of the
launch secret, it cannot access the secret.

Signed-off-by: Tobin Feldman-Fitzthum 
---



+++ b/qapi/misc-target.json
@@ -200,6 +200,26 @@
  { 'command': 'query-sev-capabilities', 'returns': 'SevCapability',
'if': 'defined(TARGET_I386)' }
  
+##

+# @sev-inject-launch-secret:
+#
+# This command injects a secret blob into memory of SEV guest.
+#
+# @packet-header: the launch secret packet header encoded in base64
+#
+# @secret: the launch secret data to be injected encoded in base64
+#
+# @gpa: the guest physical address where secret will be injected.
+GPA provided here will be ignored if guest ROM specifies
+the a launch secret GPA.


Missing # on the wrapped lines.


+#
+# Since: 5.0.0


You've missed 5.0, and more sites tend to use x.y instead of x.y.z 
(although we aren't consistent); this should be 'Since: 5.1'



+#
+##
+{ 'command': 'sev-inject-launch-secret',
+  'data': { 'packet_hdr': 'str', 'secret': 'str', 'gpa': 'uint64' },


This does not match your documentation above, which named it 
'packet-header'.  Should 'gpa' be optional, to account for the case 
where ROM specifies it?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
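
For readers unfamiliar with how such a command would be issued, here is a hedged Python sketch that builds the QMP request as JSON. The argument names are taken from the 'data' line of the patch as posted ('packet_hdr', 'secret', 'gpa'); as noted in the review above, they do not match the documentation and may change. The payload bytes and the address are placeholders, not real SEV data.

```python
import base64
import json


def build_inject_command(packet_header, secret, gpa):
    """Build a QMP request with both blobs base64-encoded as strings."""
    return json.dumps({
        "execute": "sev-inject-launch-secret",
        "arguments": {
            # Name as in the posted patch; the docs call it 'packet-header'.
            "packet_hdr": base64.b64encode(packet_header).decode("ascii"),
            "secret": base64.b64encode(secret).decode("ascii"),
            "gpa": gpa,
        },
    })


# Placeholder values, for illustration only.
request = build_inject_command(b"\x00\x01\x02\x03", b"example-secret", 0x1000)
decoded = json.loads(request)
```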

[PATCH v7 7/8] qdev-properties: add getter for size32 and blocksize

2020-05-28 Thread Roman Kagan

Add getter for size32, and use it for blocksize, too.

In its human-readable branch, it reports approximate size in
human-readable units next to the exact byte value, like the getter for
64bit size does.

Adjust the expected test output accordingly.

Signed-off-by: Roman Kagan 
---
v6 -> v7:
- split out into separate patch [Eric]

 hw/core/qdev-properties.c  |  15 +-
 tests/qemu-iotests/172.out | 530 ++---
 2 files changed, 278 insertions(+), 267 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 3cbe3f56a8..8f35d494a4 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -730,6 +730,17 @@ const PropertyInfo qdev_prop_pci_devfn = {
 
 /* --- 32bit unsigned int 'size' type --- */
 
+static void get_size32(Object *obj, Visitor *v, const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value = *ptr;
+
+visit_type_size(v, name, &value, errp);
+}
+
 static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
Error **errp)
 {
@@ -763,7 +774,7 @@ static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
 
 const PropertyInfo qdev_prop_size32 = {
 .name  = "size",
-.get = get_uint32,
+.get = get_size32,
 .set = set_size32,
 .set_default_value = set_default_value_uint,
 };
@@ -821,7 +832,7 @@ const PropertyInfo qdev_prop_blocksize = {
 .name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
-.get   = get_uint32,
+.get   = get_size32,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
 };
diff --git a/tests/qemu-iotests/172.out b/tests/qemu-iotests/172.out
index 59cc70aebb..e782c5957e 100644
--- a/tests/qemu-iotests/172.out
+++ b/tests/qemu-iotests/172.out
@@ -24,11 +24,11 @@ Testing:
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "288"
@@ -54,11 +54,11 @@ Testing: -fda TEST_DIR/t.qcow2
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "144"
@@ -81,22 +81,22 @@ Testing: -fdb TEST_DIR/t.qcow2
   dev: floppy, id ""
 unit = 1 (0x1)
 drive = "floppy1"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "144"
   dev: floppy, id ""
 unit = 0 (0x0)
 drive = "floppy0"
-logical_block_size = 512 (0x200)
-physical_block_size = 512 (0x200)
-min_io_size = 0 (0x0)
-opt_io_size = 0 (0x0)
-discard_granularity = 4294967295 (0xffffffff)
+logical_block_size = 512 (512 B)
+physical_block_size = 512 (512 B)
+min_io_size = 0 (0 B)
+opt_io_size = 0 (0 B)
+discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
 drive-type = "288"
@@ -119,22 +119,22 @@ Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2.2
   dev: floppy,

Re: [PATCH 0/2] linux-user: Load a vdso for x86_64

2020-05-28 Thread Richard Henderson

On 5/28/20 3:32 AM, Laurent Vivier wrote:
> Le 28/05/2020 à 12:08, Peter Maydell a écrit :
>> On Tue, 19 May 2020 at 20:45, Richard Henderson
>>  wrote:
>>>  Makefile  |   4 +-
>>>  linux-user/elfload.c  | 203 +-
>>>  pc-bios/Makefile  |   5 +
>>>  pc-bios/vdso-linux-x64.S  | 115 +
>>>  pc-bios/vdso-linux-x64.ld |  81 +++
>>>  pc-bios/vdso-linux-x64.so | Bin 0 -> 7500 bytes
>>
>> I'm not really a fan of binaries in source control :-(
> 
> Can't we see that as a firmware or a ROM?
> It's only 7,4 KB and needs a cross-compilation env to be rebuilt.
> 
> Do you have another solution?
> 
> If you don't like this I can remove the series. Let me know.

I think some more of the questions in the cover letter need answering.  Does
this patch set not break your own --static chroot tests, for example?


r~

[PATCH v7 6/8] block: make BlockConf size props 32bit and accept size suffixes

2020-05-28 Thread Roman Kagan

Convert all size-related properties in BlockConf to 32bit.  This makes it
possible to accommodate bigger block sizes (in a followup patch).
It also lets all of them accept size suffixes, either via
DEFINE_PROP_BLOCKSIZE or via DEFINE_PROP_SIZE32.

Also, since min_io_size is exposed to the guest by scsi and virtio-blk
devices as a uint16_t in units of logical blocks, introduce an
additional check in blkconf_blocksizes to prevent its silent truncation.
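
The truncation check described above can be sketched in isolation.  This is
a minimal sketch, not the QEMU code: min_io_size_fits_u16 is a hypothetical
helper name, and rejecting a zero logical_block_size is an assumption made
only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true when min_io_size,
 * counted in logical blocks, still fits in a 16-bit guest-visible field.
 */
bool min_io_size_fits_u16(uint32_t min_io_size, uint32_t logical_block_size)
{
    if (logical_block_size == 0) {
        /* assumption: a real caller has already defaulted the block size */
        return false;
    }
    /*
     * Divide instead of multiplying: UINT16_MAX * logical_block_size can
     * wrap around in 32-bit arithmetic once block sizes grow large.
     */
    return min_io_size / logical_block_size <= UINT16_MAX;
}
```

Dividing keeps the comparison within 32 bits, which is presumably what the
"avoid overflow in min_io_size check" changelog entry refers to.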

Signed-off-by: Roman Kagan 
---
v6 -> v7:
- split out into separate patch [Eric]
- avoid overflow in min_io_size check [Eric]

 include/hw/block/block.h | 12 ++--
 include/hw/qdev-properties.h |  2 +-
 hw/block/block.c | 11 +++
 hw/core/qdev-properties.c|  4 ++--
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 784953a237..1e8b6253dd 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -18,9 +18,9 @@
 
 typedef struct BlockConf {
 BlockBackend *blk;
-uint16_t physical_block_size;
-uint16_t logical_block_size;
-uint16_t min_io_size;
+uint32_t physical_block_size;
+uint32_t logical_block_size;
+uint32_t min_io_size;
 uint32_t opt_io_size;
 int32_t bootindex;
 uint32_t discard_granularity;
@@ -51,9 +51,9 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
   _conf.logical_block_size),\
 DEFINE_PROP_BLOCKSIZE("physical_block_size", _state,\
   _conf.physical_block_size),   \
-DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),\
-DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),\
-DEFINE_PROP_UINT32("discard_granularity", _state,   \
+DEFINE_PROP_SIZE32("min_io_size", _state, _conf.min_io_size, 0),\
+DEFINE_PROP_SIZE32("opt_io_size", _state, _conf.opt_io_size, 0),\
+DEFINE_PROP_SIZE32("discard_granularity", _state,   \
_conf.discard_granularity, -1),  \
 DEFINE_PROP_ON_OFF_AUTO("write-cache", _state, _conf.wce,   \
 ON_OFF_AUTO_AUTO),  \
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index c03eadfad6..5252bb6b1a 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -200,7 +200,7 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
 #define DEFINE_PROP_SIZE32(_n, _s, _f, _d)   \
 DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size32, uint32_t)
 #define DEFINE_PROP_BLOCKSIZE(_n, _s, _f) \
-DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint16_t)
+DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint32_t)
 #define DEFINE_PROP_PCI_HOST_DEVADDR(_n, _s, _f) \
 DEFINE_PROP(_n, _s, _f, qdev_prop_pci_host_devaddr, PCIHostDeviceAddress)
 #define DEFINE_PROP_OFF_AUTO_PCIBAR(_n, _s, _f, _d) \
diff --git a/hw/block/block.c b/hw/block/block.c
index b22207c921..7410b24dee 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -96,6 +96,17 @@ bool blkconf_blocksizes(BlockConf *conf, Error **errp)
 return false;
 }
 
+/*
+ * all devices which support min_io_size (scsi and virtio-blk) expose it to
+ * the guest as a uint16_t in units of logical blocks
+ */
+if (conf->min_io_size / conf->logical_block_size > UINT16_MAX) {
+error_setg(errp,
+   "min_io_size must not exceed " stringify(UINT16_MAX)
+   " logical blocks");
+return false;
+}
+
 if (!QEMU_IS_ALIGNED(conf->opt_io_size, conf->logical_block_size)) {
 error_setg(errp,
"opt_io_size must be a multiple of logical_block_size");
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index a79062b428..3cbe3f56a8 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -782,7 +782,7 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
 uint64_t value;
 Error *local_err = NULL;
 
@@ -821,7 +821,7 @@ const PropertyInfo qdev_prop_blocksize = {
 .name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
-.get   = get_uint16,
+.get   = get_uint32,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
 };
-- 
2.26.2

[PATCH v7 5/8] qdev-properties: make blocksize accept size suffixes

2020-05-28 Thread Roman Kagan

It appears convenient to be able to specify physical_block_size and
logical_block_size using common size suffixes.

Teach the blocksize property setter to interpret them.  Also express the
upper and lower limits in the respective units.
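
As an illustration of what accepting size suffixes means for a user-supplied
value, here is a minimal sketch.  It is deliberately simpler than the parsing
behind visit_type_size: parse_size_suffix is a hypothetical name, and it
assumes plain decimal integers with an optional k/K or m/M suffix only.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical, simplified parser: a decimal integer followed by an
 * optional k/K (KiB) or m/M (MiB) suffix.  Returns false on malformed
 * input or on overflow; *out is only written on success.
 */
bool parse_size_suffix(const char *str, uint64_t *out)
{
    char *end;
    unsigned long long n;
    uint64_t mul = 1;

    if (!isdigit((unsigned char)str[0])) {
        return false;               /* rejects "", "-1" and leading spaces */
    }
    errno = 0;
    n = strtoull(str, &end, 10);
    if (errno == ERANGE) {
        return false;
    }
    switch (*end) {
    case 'k': case 'K': mul = 1024; end++; break;
    case 'm': case 'M': mul = 1024 * 1024; end++; break;
    default: break;
    }
    if (*end != '\0' || n > UINT64_MAX / mul) {
        return false;               /* trailing junk or multiply overflow */
    }
    *out = n * mul;
    return true;
}
```

With such a parser "4k" and "4096" name the same block size, which is the
convenience the patch is after.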

Signed-off-by: Roman Kagan 
---
v6 -> v7:
- split out into separate patch [Eric]

 hw/core/qdev-properties.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index d943755832..a79062b428 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -14,6 +14,7 @@
 #include "qapi/visitor.h"
 #include "chardev/char.h"
 #include "qemu/uuid.h"
+#include "qemu/units.h"
 
 void qdev_prop_set_after_realize(DeviceState *dev, const char *name,
   Error **errp)
@@ -771,17 +772,18 @@ const PropertyInfo qdev_prop_size32 = {
 
 /* lower limit is sector size */
 #define MIN_BLOCK_SIZE  512
-#define MIN_BLOCK_SIZE_STR  stringify(MIN_BLOCK_SIZE)
+#define MIN_BLOCK_SIZE_STR  "512 B"
 /* upper limit is the max power of 2 that fits in uint16_t */
-#define MAX_BLOCK_SIZE  32768
-#define MAX_BLOCK_SIZE_STR  stringify(MAX_BLOCK_SIZE)
+#define MAX_BLOCK_SIZE  (32 * KiB)
+#define MAX_BLOCK_SIZE_STR  "32 KiB"
 
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-uint16_t value, *ptr = qdev_get_prop_ptr(dev, prop);
+uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value;
 Error *local_err = NULL;
 
 if (dev->realized) {
@@ -789,7 +791,7 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 return;
 }
 
-visit_type_uint16(v, name, &value, &local_err);
+visit_type_size(v, name, &value, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -797,7 +799,7 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 /* value of 0 means "unset" */
 if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
 error_setg(errp,
-   "Property %s.%s doesn't take value %" PRIu16
+   "Property %s.%s doesn't take value %" PRIu64
" (minimum: " MIN_BLOCK_SIZE_STR
", maximum: " MAX_BLOCK_SIZE_STR ")",
dev->id ? : "", name, value);
@@ -816,7 +818,7 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 }
 
 const PropertyInfo qdev_prop_blocksize = {
-.name  = "uint16",
+.name  = "size",
 .description = "A power of two between " MIN_BLOCK_SIZE_STR
" and " MAX_BLOCK_SIZE_STR,
 .get   = get_uint16,
-- 
2.26.2

[PATCH v7 3/8] qdev-properties: blocksize: use same limits in code and description

2020-05-28 Thread Roman Kagan

Make it easier (more visible) to maintain the limits on the blocksize
properties in sync with the respective description, by using macros both
in the code and in the description.

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
v4 -> v5:
- split out into separate patch [Philippe]

 hw/core/qdev-properties.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index cc924815da..249dc69bd8 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -729,6 +729,13 @@ const PropertyInfo qdev_prop_pci_devfn = {
 
 /* --- blocksize --- */
 
+/* lower limit is sector size */
+#define MIN_BLOCK_SIZE  512
+#define MIN_BLOCK_SIZE_STR  stringify(MIN_BLOCK_SIZE)
+/* upper limit is the max power of 2 that fits in uint16_t */
+#define MAX_BLOCK_SIZE  32768
+#define MAX_BLOCK_SIZE_STR  stringify(MAX_BLOCK_SIZE)
+
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -736,8 +743,6 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 Property *prop = opaque;
 uint16_t value, *ptr = qdev_get_prop_ptr(dev, prop);
 Error *local_err = NULL;
-const int64_t min = 512;
-const int64_t max = 32768;
 
 if (dev->realized) {
 qdev_prop_set_after_realize(dev, name, errp);
@@ -750,9 +755,12 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 return;
 }
 /* value of 0 means "unset" */
-if (value && (value < min || value > max)) {
-error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
-   dev->id ? : "", name, (int64_t)value, min, max);
+if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
+error_setg(errp,
+   "Property %s.%s doesn't take value %" PRIu16
+   " (minimum: " MIN_BLOCK_SIZE_STR
+   ", maximum: " MAX_BLOCK_SIZE_STR ")",
+   dev->id ? : "", name, value);
 return;
 }
 
@@ -769,7 +777,8 @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
 
 const PropertyInfo qdev_prop_blocksize = {
 .name  = "uint16",
-.description = "A power of two between 512 and 32768",
+.description = "A power of two between " MIN_BLOCK_SIZE_STR
+   " and " MAX_BLOCK_SIZE_STR,
 .get   = get_uint16,
 .set   = set_blocksize,
 .set_default_value = set_default_value_uint,
-- 
2.26.2

[PATCH v7 8/8] block: lift blocksize property limit to 2 MiB

2020-05-28 Thread Roman Kagan

Logical and physical block sizes in QEMU are limited to 32 KiB.

This appears unnecessarily tight, and we've found bigger block sizes
useful at times.

Lift the limitation up to 2 MiB which appears to be good enough for
everybody, and matches the qcow2 cluster size limit.
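
The resulting acceptance rule can be sketched as follows.  This is an
illustrative sketch, not the patched set_blocksize: the SKETCH_* macros and
blocksize_is_valid are hypothetical names, and the power-of-two test is
taken from the property description ("A power of two between ...").

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for MIN_BLOCK_SIZE / MAX_BLOCK_SIZE after the patch */
#define SKETCH_MIN_BLOCK_SIZE 512u
#define SKETCH_MAX_BLOCK_SIZE (2u * 1024u * 1024u)   /* 2 MiB */

/* Returns true if value is acceptable as a logical/physical block size. */
bool blocksize_is_valid(uint64_t value)
{
    if (value == 0) {
        return true;                /* value of 0 means "unset" */
    }
    if (value < SKETCH_MIN_BLOCK_SIZE || value > SKETCH_MAX_BLOCK_SIZE) {
        return false;
    }
    /* a power of two has exactly one bit set */
    return (value & (value - 1)) == 0;
}
```

Under the old 32 KiB cap a value such as 64 KiB would have been rejected by
the range check; with the 2 MiB cap it passes.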

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
---
v6 -> v7:
- fix spelling in the log [Eric]

 hw/core/qdev-properties.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 8f35d494a4..d66a498d36 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -784,9 +784,12 @@ const PropertyInfo qdev_prop_size32 = {
 /* lower limit is sector size */
 #define MIN_BLOCK_SIZE  512
 #define MIN_BLOCK_SIZE_STR  "512 B"
-/* upper limit is the max power of 2 that fits in uint16_t */
-#define MAX_BLOCK_SIZE  (32 * KiB)
-#define MAX_BLOCK_SIZE_STR  "32 KiB"
+/*
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
+ * matches qcow2 cluster size limit
+ */
+#define MAX_BLOCK_SIZE  (2 * MiB)
+#define MAX_BLOCK_SIZE_STR  "2 MiB"
 
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
-- 
2.26.2

[PATCH v7 0/8] block: enhance handling of size-related BlockConf properties

2020-05-28 Thread Roman Kagan

BlockConf includes several properties counted in bytes.

Enhance their handling in some aspects, specifically

- accept common size suffixes (k, m)
- perform consistency checks on the values
- lift the upper limit on physical_block_size and logical_block_size

Also fix the accessor for opt_io_size in virtio-blk to make it consistent with
the size of the field.

History:
v6 -> v7:
- avoid overflow in min_io_size check [Eric]
- try again to perform the art form in patch splitting [Eric]

v5 -> v6:
- fix forgotten xen-block and swim
- add prop_size32 instead of going with 64bit

v4 -> v5:
- re-split the patches [Philippe]
- fix/reword error messages [Philippe, Kevin]
- do early return on failed consistency check [Philippe]
- use QEMU_IS_ALIGNED instead of open coding [Philippe]
- make all BlockConf size props support suffixes
- expand the log for virtio-blk opt_io_size [Michael]

v3 -> v4:
- add patch to fix opt_io_size width in virtio-blk
- add patch to perform consistency checks [Kevin]
- check min_io_size against truncation [Kevin]

v2 -> v3:
- mention qcow2 cluster size limit in the log and comment [Eric]

v1 -> v2:
- cap the property at 2 MiB [Eric]
- accept size suffixes

Roman Kagan (8):
  virtio-blk: store opt_io_size with correct size
  block: consolidate blocksize properties consistency checks
  qdev-properties: blocksize: use same limits in code and description
  qdev-properties: add size32 property type
  qdev-properties: make blocksize accept size suffixes
  block: make BlockConf size props 32bit and accept size suffixes
  qdev-properties: add getter for size32 and blocksize
  block: lift blocksize property limit to 2 MiB

 include/hw/block/block.h |  14 +-
 include/hw/qdev-properties.h |   5 +-
 hw/block/block.c |  41 ++-
 hw/block/fdc.c   |   5 +-
 hw/block/nvme.c  |   5 +-
 hw/block/swim.c  |   5 +-
 hw/block/virtio-blk.c|   9 +-
 hw/block/xen-block.c |   6 +-
 hw/core/qdev-properties.c|  85 +-
 hw/ide/qdev.c|   5 +-
 hw/scsi/scsi-disk.c  |  12 +-
 hw/usb/dev-storage.c |   5 +-
 tests/qemu-iotests/172.out   | 532 +--
 13 files changed, 420 insertions(+), 309 deletions(-)

-- 
2.26.2

[PATCH v7 2/8] block: consolidate blocksize properties consistency checks

2020-05-28 Thread Roman Kagan

Several block device properties related to blocksize configuration must
stand in a certain relationship to each other: the physical block size must
be no smaller than the logical block size; min_io_size, opt_io_size, and
discard_granularity must each be a multiple of the logical block size.

To ensure these requirements are met, add corresponding consistency
checks to blkconf_blocksizes, adjusting its signature to communicate
possible error to the caller.  Also remove the now redundant consistency
checks from the specific devices.
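
The relationships listed above can be sketched as a standalone function.
This is a minimal sketch under stated assumptions: struct sketch_conf and
check_blocksizes are hypothetical, the function returns a message instead of
filling an Error object, and the zero-size guard exists only because the
sketch has no defaulting step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the size-related BlockConf fields */
struct sketch_conf {
    uint32_t physical_block_size;
    uint32_t logical_block_size;
    uint32_t min_io_size;
    uint32_t opt_io_size;
    uint32_t discard_granularity;   /* UINT32_MAX means "not set" */
};

/* Returns NULL when the sizes are consistent, else a description. */
const char *check_blocksizes(const struct sketch_conf *c)
{
    if (c->logical_block_size == 0) {
        return "logical_block_size must be set";    /* sketch-only guard */
    }
    if (c->logical_block_size > c->physical_block_size) {
        return "logical_block_size > physical_block_size not supported";
    }
    if (c->min_io_size % c->logical_block_size != 0) {
        return "min_io_size must be a multiple of logical_block_size";
    }
    if (c->opt_io_size % c->logical_block_size != 0) {
        return "opt_io_size must be a multiple of logical_block_size";
    }
    if (c->discard_granularity != UINT32_MAX &&
        c->discard_granularity % c->logical_block_size != 0) {
        return "discard_granularity must be a multiple of logical_block_size";
    }
    return NULL;
}
```

Returning at the first failed check mirrors the early-return style requested
in review.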

Signed-off-by: Roman Kagan 
Reviewed-by: Eric Blake 
Reviewed-by: Paul Durrant 
---
v5 -> v6:
- fix forgotten xen-block and swim

v4 -> v5:
- fix/reword error messages [Philippe, Kevin]
- do early return on failed consistency check [Philippe]
- use QEMU_IS_ALIGNED instead of open coding [Philippe]

 include/hw/block/block.h   |  2 +-
 hw/block/block.c   | 30 +-
 hw/block/fdc.c |  5 -
 hw/block/nvme.c|  5 -
 hw/block/swim.c|  5 -
 hw/block/virtio-blk.c  |  7 +--
 hw/block/xen-block.c   |  6 +-
 hw/ide/qdev.c  |  5 -
 hw/scsi/scsi-disk.c| 12 +---
 hw/usb/dev-storage.c   |  5 -
 tests/qemu-iotests/172.out |  2 +-
 11 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index d7246f3862..784953a237 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -87,7 +87,7 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void *buf, hwaddr size,
 bool blkconf_geometry(BlockConf *conf, int *trans,
   unsigned cyls_max, unsigned heads_max, unsigned secs_max,
   Error **errp);
-void blkconf_blocksizes(BlockConf *conf);
+bool blkconf_blocksizes(BlockConf *conf, Error **errp);
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
bool resizable, Error **errp);
 
diff --git a/hw/block/block.c b/hw/block/block.c
index bf56c7612b..b22207c921 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -61,7 +61,7 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void *buf, hwaddr size,
 return true;
 }
 
-void blkconf_blocksizes(BlockConf *conf)
+bool blkconf_blocksizes(BlockConf *conf, Error **errp)
 {
 BlockBackend *blk = conf->blk;
 BlockSizes blocksizes;
@@ -83,6 +83,34 @@ void blkconf_blocksizes(BlockConf *conf)
 conf->logical_block_size = BDRV_SECTOR_SIZE;
 }
 }
+
+if (conf->logical_block_size > conf->physical_block_size) {
+error_setg(errp,
+   "logical_block_size > physical_block_size not supported");
+return false;
+}
+
+if (!QEMU_IS_ALIGNED(conf->min_io_size, conf->logical_block_size)) {
+error_setg(errp,
+   "min_io_size must be a multiple of logical_block_size");
+return false;
+}
+
+if (!QEMU_IS_ALIGNED(conf->opt_io_size, conf->logical_block_size)) {
+error_setg(errp,
+   "opt_io_size must be a multiple of logical_block_size");
+return false;
+}
+
+if (conf->discard_granularity != -1 &&
+!QEMU_IS_ALIGNED(conf->discard_granularity,
+ conf->logical_block_size)) {
+error_setg(errp, "discard_granularity must be "
+   "a multiple of logical_block_size");
+return false;
+}
+
+return true;
 }
 
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index c5fb9d6ece..8eda572ef4 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -554,7 +554,10 @@ static void floppy_drive_realize(DeviceState *qdev, Error **errp)
 read_only = !blk_bs(dev->conf.blk) || blk_is_read_only(dev->conf.blk);
 }
 
-blkconf_blocksizes(&dev->conf);
+if (!blkconf_blocksizes(&dev->conf, errp)) {
+return;
+}
+
 if (dev->conf.logical_block_size != 512 ||
 dev->conf.physical_block_size != 512)
 {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 2f3100e56c..672650e162 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1390,7 +1390,10 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 host_memory_backend_set_mapped(n->pmrdev, true);
 }
 
-blkconf_blocksizes(&n->conf);
+if (!blkconf_blocksizes(&n->conf, errp)) {
+return;
+}
+
 if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
false, errp)) {
 return;
diff --git a/hw/block/swim.c b/hw/block/swim.c
index 8f124782f4..74f56e8f46 100644
--- a/hw/block/swim.c
+++ b/hw/block/swim.c
@@ -189,7 +189,10 @@ static void swim_drive_realize(DeviceState *qdev, Error **errp)
 assert(ret == 0);
 }
 
-blkconf_blocksizes(&dev->conf);
+if (!blkconf_blocksizes(&dev->conf, errp)) {
+return;
+}
+
 if (dev->conf.logical_block_size != 512 ||

[PATCH v7 1/8] virtio-blk: store opt_io_size with correct size

2020-05-28 Thread Roman Kagan

The width of opt_io_size in virtio_blk_config is 32bit.  However, it's
written with virtio_stw_p, a 16bit store; this may truncate the value and,
on big-endian systems with legacy virtio, produce completely bogus readings
in the guest.

Use the appropriate accessor to store it.
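
The effect of a 16bit store into a 32bit field can be shown with a small
sketch.  It is not QEMU code: store16, load32 and store16_then_load32 are
hypothetical helpers that make the byte order explicit, on the assumption
that legacy virtio uses guest-native endianness for config fields.

```c
#include <stdint.h>

/* Store a 16-bit value at p in the given byte order. */
void store16(uint8_t *p, uint16_t v, int big_endian)
{
    if (big_endian) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xff);
    } else {
        p[0] = (uint8_t)(v & 0xff);
        p[1] = (uint8_t)(v >> 8);
    }
}

/* Load a 32-bit value from p in the given byte order. */
uint32_t load32(const uint8_t *p, int big_endian)
{
    if (big_endian) {
        return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
               (uint32_t)p[2] << 8 | (uint32_t)p[3];
    }
    return (uint32_t)p[3] << 24 | (uint32_t)p[2] << 16 |
           (uint32_t)p[1] << 8 | (uint32_t)p[0];
}

/* What a guest reads from a zeroed 32-bit field after a 16-bit store. */
uint32_t store16_then_load32(uint32_t value, int big_endian)
{
    uint8_t field[4] = { 0, 0, 0, 0 };

    store16(field, (uint16_t)value, big_endian);
    return load32(field, big_endian);
}
```

In little-endian layout the guest merely loses the upper 16 bits; in
big-endian layout the two stored bytes land in the most significant half of
the field, so even a small value such as 8 reads back as 0x00080000.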

Signed-off-by: Roman Kagan 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kevin Wolf 
---
v4 -> v5:
- split out into separate patch [Philippe]

 hw/block/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f5f6fc925e..413083e62f 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -918,7 +918,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 virtio_stw_p(vdev, &blkcfg.geometry.cylinders, conf->cyls);
 virtio_stl_p(vdev, &blkcfg.blk_size, blk_size);
 virtio_stw_p(vdev, &blkcfg.min_io_size, conf->min_io_size / blk_size);
-virtio_stw_p(vdev, &blkcfg.opt_io_size, conf->opt_io_size / blk_size);
+virtio_stl_p(vdev, &blkcfg.opt_io_size, conf->opt_io_size / blk_size);
 blkcfg.geometry.heads = conf->heads;
 /*
  * We must ensure that the block device capacity is a multiple of
-- 
2.26.2

[PATCH v7 4/8] qdev-properties: add size32 property type

2020-05-28 Thread Roman Kagan

Introduce a size32 property type which handles size suffixes (k, m) just
like the size property, but is uint32_t rather than uint64_t.  It's going to
be useful for properties that are byte sizes but are inherently 32bit,
like BlockConf.opt_io_size or .discard_granularity (they are switched to
this new property type in a followup commit).

The getter for size32 is left out for a separate patch as its benefit is
less obvious, and it affects test output; for now the regular uint32
getter is used.
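
The reason the setter visits a 64bit value before storing into a 32bit field
can be sketched like this.  narrow_to_size32 is a hypothetical helper, not
the set_size32 from the patch; it only illustrates the bounds check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept a parsed 64-bit size only if it fits in
 * 32 bits.  *out is left untouched on failure.
 */
bool narrow_to_size32(uint64_t value, uint32_t *out)
{
    if (value > UINT32_MAX) {
        return false;   /* e.g. 4 GiB would otherwise wrap to 0 */
    }
    *out = (uint32_t)value;
    return true;
}
```

Without the explicit comparison, assigning the 64bit value directly would
silently keep only the low 32 bits.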

Signed-off-by: Roman Kagan 
---
v6 -> v7:
- split out into separate patch [Eric]

 include/hw/qdev-properties.h |  3 +++
 hw/core/qdev-properties.c| 40 
 2 files changed, 43 insertions(+)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index f161604fb6..c03eadfad6 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -29,6 +29,7 @@ extern const PropertyInfo qdev_prop_drive;
 extern const PropertyInfo qdev_prop_drive_iothread;
 extern const PropertyInfo qdev_prop_netdev;
 extern const PropertyInfo qdev_prop_pci_devfn;
+extern const PropertyInfo qdev_prop_size32;
 extern const PropertyInfo qdev_prop_blocksize;
 extern const PropertyInfo qdev_prop_pci_host_devaddr;
 extern const PropertyInfo qdev_prop_uuid;
@@ -196,6 +197,8 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
 BlockdevOnError)
 #define DEFINE_PROP_BIOS_CHS_TRANS(_n, _s, _f, _d) \
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_bios_chs_trans, int)
+#define DEFINE_PROP_SIZE32(_n, _s, _f, _d)   \
+DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size32, uint32_t)
 #define DEFINE_PROP_BLOCKSIZE(_n, _s, _f) \
 DEFINE_PROP_UNSIGNED(_n, _s, _f, 0, qdev_prop_blocksize, uint16_t)
 #define DEFINE_PROP_PCI_HOST_DEVADDR(_n, _s, _f) \
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 249dc69bd8..d943755832 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -727,6 +727,46 @@ const PropertyInfo qdev_prop_pci_devfn = {
 .set_default_value = set_default_value_int,
 };
 
+/* --- 32bit unsigned int 'size' type --- */
+
+static void set_size32(Object *obj, Visitor *v, const char *name, void *opaque,
+   Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
+uint64_t value;
+Error *local_err = NULL;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_size(v, name, &value, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+if (value > UINT32_MAX) {
+error_setg(errp,
+   "Property %s.%s doesn't take value %" PRIu64
+   " (maximum: " stringify(UINT32_MAX) ")",
+   dev->id ? : "", name, value);
+return;
+}
+
+*ptr = value;
+}
+
+const PropertyInfo qdev_prop_size32 = {
+.name  = "size",
+.get = get_uint32,
+.set = set_size32,
+.set_default_value = set_default_value_uint,
+};
+
 /* --- blocksize --- */
 
 /* lower limit is sector size */
-- 
2.26.2

[PATCH 0/2] Add support for SEV Launch Secret Injection

2020-05-28 Thread Tobin Feldman-Fitzthum

This patchset contains two patches. The first enables QEMU
to facilitate the injection of a secret blob into the guest
memory.

The second enables QEMU to parse the guest ROM to determine
the address at which the secret should be injected.

Tobin Feldman-Fitzthum (2):
  sev: add sev-inject-launch-secret
  sev: scan guest ROM for launch secret address

 include/sysemu/sev.h   |   2 +
 qapi/misc-target.json  |  20 +++
 target/i386/monitor.c  |   8 +++
 target/i386/sev-stub.c |   5 ++
 target/i386/sev.c  | 113 +
 target/i386/sev_i386.h |  16 ++
 target/i386/trace-events   |   1 +
 tests/qtest/qmp-cmd-test.c |   6 +-
 8 files changed, 168 insertions(+), 3 deletions(-)

-- 
2.20.1 (Apple Git-117)

Re: [PATCH 1/2] sev: add sev-inject-launch-secret

2020-05-28 Thread James Bottomley

On Thu, 2020-05-28 at 16:51 -0400, Tobin Feldman-Fitzthum wrote:
> --- a/qapi/misc-target.json
> +++ b/qapi/misc-target.json
> @@ -200,6 +200,26 @@
>  { 'command': 'query-sev-capabilities', 'returns': 'SevCapability',
>'if': 'defined(TARGET_I386)' }
>  
> +##
> +# @sev-inject-launch-secret:
> +#
> +# This command injects a secret blob into memory of SEV guest.
> +#
> +# @packet-header: the launch secret packet header encoded in base64
> +#
> +# @secret: the launch secret data to be injected encoded in base64
> +#
> +# @gpa: the guest physical address where secret will be injected.
> +GPA provided here will be ignored if guest ROM specifies 
> +the a launch secret GPA.

Shouldn't we eliminate the gpa argument to this now the gpa is
extracted from OVMF?  You add it here but don't take it out in the next
patch.

> +# Since: 5.0.0
> +#
> +##
> +{ 'command': 'sev-inject-launch-secret',
> +  'data': { 'packet_hdr': 'str', 'secret': 'str', 'gpa': 'uint64' },

Java (i.e. Json) people hate underscores and abbreviations.  I bet
they'll want this to be 'packet-header'

> +  'if': 'defined(TARGET_I386)' }
> +
>  ##
>  # @dump-skeys:
>  #
> diff --git a/target/i386/monitor.c b/target/i386/monitor.c
> index 27ebfa3ad2..5c2b7d2c17 100644
> --- a/target/i386/monitor.c
> +++ b/target/i386/monitor.c
> @@ -736,3 +736,11 @@ SevCapability *qmp_query_sev_capabilities(Error
> **errp)
>  
>  return data;
>  }
> +
> +void qmp_sev_inject_launch_secret(const char *packet_hdr,
> +  const char *secret, uint64_t gpa,
> +  Error **errp)
> +{
> +if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
> +  error_setg(errp, "SEV inject secret failed");
> +}
> diff --git a/target/i386/sev-stub.c b/target/i386/sev-stub.c
> index e5ee13309c..2b8c5f1f53 100644
> --- a/target/i386/sev-stub.c
> +++ b/target/i386/sev-stub.c
> @@ -48,3 +48,8 @@ SevCapability *sev_get_capabilities(void)
>  {
>  return NULL;
>  }
> +int sev_inject_launch_secret(const char *hdr, const char *secret,
> +  uint64_t gpa)
> +{
> + return 1;
> +}
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 846018a12d..774e47d9d1 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -28,6 +28,7 @@
>  #include "sysemu/runstate.h"
>  #include "trace.h"
>  #include "migration/blocker.h"
> +#include "exec/address-spaces.h"
>  
>  #define DEFAULT_GUEST_POLICY0x1 /* disable debug */
>  #define DEFAULT_SEV_DEVICE  "/dev/sev"
> @@ -743,6 +744,88 @@ sev_encrypt_data(void *handle, uint8_t *ptr,
> uint64_t len)
>  return 0;
>  }
>  
> +
> +static void *
> +gpa2hva(hwaddr addr, uint64_t size)
> +{
> +MemoryRegionSection mrs =
> memory_region_find(get_system_memory(),
> + addr, size);
> +
> +if (!mrs.mr) {
> +error_report("No memory is mapped at address 0x%"
> HWADDR_PRIx, addr);
> +return NULL;
> +}
> +
> +if (!memory_region_is_ram(mrs.mr) &&
> !memory_region_is_romd(mrs.mr)) {
> +error_report("Memory at address 0x%" HWADDR_PRIx "is not
> RAM", addr);
> +memory_region_unref(mrs.mr);
> +return NULL;
> +}

We can still check this, but it should be like an assertion failure. 
Since the GPA is selected by the OVMF build there should be no way it
can't be mapped into the host.

[...]
> --- a/tests/qtest/qmp-cmd-test.c
> +++ b/tests/qtest/qmp-cmd-test.c
> @@ -93,10 +93,10 @@ static bool query_is_blacklisted(const char *cmd)
>  /* Success depends on target-specific build configuration:
> */
>  "query-pci",  /* CONFIG_PCI */
>  /* Success depends on launching SEV guest */
> -"query-sev-launch-measure",
> +// "query-sev-launch-measure",
>  /* Success depends on Host or Hypervisor SEV support */
> -"query-sev",
> -"query-sev-capabilities",
> +// "query-sev",
> +// "query-sev-capabilities",

We're eliminating existing tests ... is that just a stray hunk that you
forgot to remove?

James

[PATCH 1/2] sev: add sev-inject-launch-secret

2020-05-28 Thread Tobin Feldman-Fitzthum

From: Tobin Feldman-Fitzthum 

AMD SEV allows a guest owner to inject a secret blob
into the memory of a virtual machine. The secret is
encrypted with the SEV Transport Encryption Key and
integrity is guaranteed with the Transport Integrity
Key. Although QEMU facilitates the injection of the
launch secret, it cannot access the secret.

Signed-off-by: Tobin Feldman-Fitzthum 
---
 include/sysemu/sev.h   |  2 +
 qapi/misc-target.json  | 20 +
 target/i386/monitor.c  |  8 
 target/i386/sev-stub.c |  5 +++
 target/i386/sev.c  | 83 ++
 target/i386/trace-events   |  1 +
 tests/qtest/qmp-cmd-test.c |  6 +--
 7 files changed, 122 insertions(+), 3 deletions(-)

diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h
index 98c1ec8d38..313ee30fc8 100644
--- a/include/sysemu/sev.h
+++ b/include/sysemu/sev.h
@@ -18,4 +18,6 @@
 
 void *sev_guest_init(const char *id);
 int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len);
+int sev_inject_launch_secret(const char *hdr, const char *secret,
+uint64_t gpa);
 #endif
diff --git a/qapi/misc-target.json b/qapi/misc-target.json
index dee3b45930..27458b765b 100644
--- a/qapi/misc-target.json
+++ b/qapi/misc-target.json
@@ -200,6 +200,26 @@
 { 'command': 'query-sev-capabilities', 'returns': 'SevCapability',
   'if': 'defined(TARGET_I386)' }
 
+##
+# @sev-inject-launch-secret:
+#
+# This command injects a secret blob into memory of SEV guest.
+#
+# @packet-header: the launch secret packet header encoded in base64
+#
+# @secret: the launch secret data to be injected encoded in base64
+#
+# @gpa: the guest physical address where secret will be injected.
+GPA provided here will be ignored if guest ROM specifies 
+the a launch secret GPA.
+#
+# Since: 5.0.0
+#
+##
+{ 'command': 'sev-inject-launch-secret',
+  'data': { 'packet_hdr': 'str', 'secret': 'str', 'gpa': 'uint64' },
+  'if': 'defined(TARGET_I386)' }
+
 ##
 # @dump-skeys:
 #
diff --git a/target/i386/monitor.c b/target/i386/monitor.c
index 27ebfa3ad2..5c2b7d2c17 100644
--- a/target/i386/monitor.c
+++ b/target/i386/monitor.c
@@ -736,3 +736,11 @@ SevCapability *qmp_query_sev_capabilities(Error **errp)
 
 return data;
 }
+
+void qmp_sev_inject_launch_secret(const char *packet_hdr,
+  const char *secret, uint64_t gpa,
+  Error **errp)
+{
+if (sev_inject_launch_secret(packet_hdr,secret,gpa) != 0)
+  error_setg(errp, "SEV inject secret failed");
+}
diff --git a/target/i386/sev-stub.c b/target/i386/sev-stub.c
index e5ee13309c..2b8c5f1f53 100644
--- a/target/i386/sev-stub.c
+++ b/target/i386/sev-stub.c
@@ -48,3 +48,8 @@ SevCapability *sev_get_capabilities(void)
 {
 return NULL;
 }
+int sev_inject_launch_secret(const char *hdr, const char *secret,
+uint64_t gpa)
+{
+   return 1;
+}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 846018a12d..774e47d9d1 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -28,6 +28,7 @@
 #include "sysemu/runstate.h"
 #include "trace.h"
 #include "migration/blocker.h"
+#include "exec/address-spaces.h"
 
 #define DEFAULT_GUEST_POLICY0x1 /* disable debug */
 #define DEFAULT_SEV_DEVICE  "/dev/sev"
@@ -743,6 +744,88 @@ sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
 return 0;
 }
 
+
+static void *
+gpa2hva(hwaddr addr, uint64_t size)
+{
+MemoryRegionSection mrs = memory_region_find(get_system_memory(),
+ addr, size);
+
+if (!mrs.mr) {
+error_report("No memory is mapped at address 0x%" HWADDR_PRIx, addr);
+return NULL;
+}
+
+if (!memory_region_is_ram(mrs.mr) && !memory_region_is_romd(mrs.mr)) {
+error_report("Memory at address 0x%" HWADDR_PRIx "is not RAM", addr);
+memory_region_unref(mrs.mr);
+return NULL;
+}
+
+return qemu_map_ram_ptr(mrs.mr->ram_block, mrs.offset_within_region);
+}
+
+int sev_inject_launch_secret(const char *packet_hdr,
+ const char *secret, uint64_t gpa)
+{
+struct kvm_sev_launch_secret *input = NULL;
+guchar *data = NULL, *hdr = NULL;
+int error, ret = 1;
+void *hva;
+gsize hdr_sz = 0, data_sz = 0;
+
+/* secret can be inject only in this state */
+if (!sev_check_state(SEV_STATE_LAUNCH_SECRET)) {
+   error_report("Not in correct state. %x",sev_state->state);
+   return 1;
+}
+
+hdr = g_base64_decode(packet_hdr, &hdr_sz);
+if (!hdr || !hdr_sz) {
+error_report("SEV: Failed to decode sequence header");
+return 1;
+}
+
+data = g_base64_decode(secret, &data_sz);
+if (!data || !data_sz) {
+error_report("SEV: Failed to decode data");
+goto err;
+}
+
+hva = gpa2hva(gpa, data_sz);
+if (!hva) {
+goto err;
+}
+input = g_new0(struct kvm_sev_launch_secret, 1);
+
+

[PATCH 2/2] sev: scan guest ROM for launch secret address

2020-05-28 Thread Tobin Feldman-Fitzthum

From: Tobin Feldman-Fitzthum 

In addition to using QMP to provide the guest memory address
that the launch secret blob will be injected into, the
secret address can also be specified in the guest ROM. This
patch adds sev_find_secret_gpa, which scans the ROM page by
page to find a launch secret table identified by a GUID. If
the table is found, the address it contains will be used
in place of any address specified via QMP.
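
A page-by-page scan of this kind can be sketched as below.  This is an
illustration under assumptions, not the patch's sev_find_secret_gpa: the
table layout (a 16-byte GUID followed by a 32-bit base address), the names
sketch_secret_table and find_secret_base, and reading the base in host byte
order are all hypothetical.  Unlike the loop in the patch, the sketch also
examines offset 0 and copes with lengths smaller than a page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 0x1000u

/* Hypothetical table layout; the real SevROMSecretTable may differ. */
struct sketch_secret_table {
    uint8_t guid[16];
    uint32_t base;
};

/*
 * Scan rom from the last page boundary down to offset 0 for a table that
 * starts with guid.  On success store the table's base address in *base.
 */
bool find_secret_base(const uint8_t *rom, size_t len,
                      const uint8_t guid[16], uint32_t *base)
{
    size_t offset;

    if (len < sizeof(struct sketch_secret_table)) {
        return false;
    }
    /* highest page-aligned offset that still leaves room for a table */
    offset = (len - sizeof(struct sketch_secret_table)) &
             ~(size_t)(SKETCH_PAGE_SIZE - 1);
    for (;;) {
        if (memcmp(rom + offset, guid, 16) == 0) {
            /* memcpy avoids an unaligned access; host byte order assumed */
            memcpy(base, rom + offset + 16, sizeof(*base));
            return true;
        }
        if (offset == 0) {
            break;
        }
        offset -= SKETCH_PAGE_SIZE;
    }
    return false;
}
```

Testing for offset 0 before subtracting keeps the unsigned offset from
wrapping around.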

Signed-off-by: Tobin Feldman-Fitzthum 
---
 target/i386/sev.c  | 34 --
 target/i386/sev_i386.h | 16 
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 774e47d9d1..4adc56d7e3 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -706,6 +706,8 @@ sev_guest_init(const char *id)
 s->api_major = status.api_major;
 s->api_minor = status.api_minor;
 
+s->secret_gpa = 0;
+
 trace_kvm_sev_init();
 ret = sev_ioctl(s->sev_fd, KVM_SEV_INIT, NULL, &fw_error);
 if (ret) {
@@ -731,6 +733,28 @@ err:
 return NULL;
 }
 
+static void
+sev_find_secret_gpa(uint8_t *ptr, uint64_t len)
+{
+uint64_t offset;
+
+SevROMSecretTable *secret_table;
+QemuUUID secret_table_guid;
+
+qemu_uuid_parse(SEV_ROM_SECRET_GUID, &secret_table_guid);
+secret_table_guid = qemu_uuid_bswap(secret_table_guid);
+
+offset = len - 0x1000;
+while(offset > 0) {
+secret_table = (SevROMSecretTable *)(ptr + offset);
+if(qemu_uuid_is_equal(&secret_table_guid, (QemuUUID *) secret_table)){
+sev_state->secret_gpa = (long unsigned int) secret_table->base;
+break;
+}
+offset -= 0x1000;
+}
+}
+
 int
 sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
 {
@@ -738,6 +762,9 @@ sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
 
 /* if SEV is in update state then encrypt the data else do nothing */
 if (sev_check_state(SEV_STATE_LAUNCH_UPDATE)) {
+if(!sev_state->secret_gpa) {
+sev_find_secret_gpa(ptr, len);
+   }
 return sev_launch_update_data(ptr, len);
 }
 
@@ -776,8 +803,8 @@ int sev_inject_launch_secret(const char *packet_hdr,
 
 /* secret can be inject only in this state */
 if (!sev_check_state(SEV_STATE_LAUNCH_SECRET)) {
-   error_report("Not in correct state. %x",sev_state->state);
-   return 1;
+error_report("Not in correct state. %x",sev_state->state);
+return 1;
 }
 
 hdr = g_base64_decode(packet_hdr, &hdr_sz);
@@ -792,6 +819,9 @@ int sev_inject_launch_secret(const char *packet_hdr,
 goto err;
 }
 
+if(sev_state->secret_gpa)
+gpa = sev_state->secret_gpa;
+
 hva = gpa2hva(gpa, data_sz);
 if (!hva) {
 goto err;
diff --git a/target/i386/sev_i386.h b/target/i386/sev_i386.h
index 8ada9d385d..b1f9ab93bb 100644
--- a/target/i386/sev_i386.h
+++ b/target/i386/sev_i386.h
@@ -19,6 +19,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sev.h"
 #include "qemu/error-report.h"
+#include "qemu/uuid.h"
 #include "qapi/qapi-types-misc-target.h"
 
 #define SEV_POLICY_NODBG0x1
@@ -28,6 +29,8 @@
 #define SEV_POLICY_DOMAIN   0x10
 #define SEV_POLICY_SEV  0x20
 
+#define SEV_ROM_SECRET_GUID "adf956ad-e98c-484c-ae11-b51c7d336447"
+
 #define TYPE_QSEV_GUEST_INFO "sev-guest"
 #define QSEV_GUEST_INFO(obj)  \
 OBJECT_CHECK(QSevGuestInfo, (obj), TYPE_QSEV_GUEST_INFO)
@@ -42,6 +45,18 @@ extern SevCapability *sev_get_capabilities(void);
 
 typedef struct QSevGuestInfo QSevGuestInfo;
 typedef struct QSevGuestInfoClass QSevGuestInfoClass;
+typedef struct SevROMSecretTable SevROMSecretTable;
+
+/**
+ * If guest physical address for the launch secret is
+ * provided in the ROM, it should be in the following
+ * page-aligned structure.
+ */
+struct SevROMSecretTable {
+QemuUUID guid;
+unsigned int base;
+unsigned int size;
+};
 
 /**
  * QSevGuestInfo:
@@ -78,6 +93,7 @@ struct SEVState {
 uint32_t cbitpos;
 uint32_t reduced_phys_bits;
 uint32_t handle;
+uint64_t secret_gpa;
 int sev_fd;
 SevState state;
 gchar *measurement;
-- 
2.20.1 (Apple Git-117)

[PATCH Kernel v24 6/8] vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap

2020-05-28 Thread Kirti Wankhede

DMA mapped pages, including those pinned by mdev vendor drivers, might
get unpinned and unmapped while migration is active and the device is
still running. For example, during the pre-copy phase the guest driver
can still access those pages, so the host device or vendor driver can
dirty them. Such pages should be marked dirty to maintain memory
consistency for a user making use of dirty page tracking.

To get the bitmap during unmap, the user should allocate memory for the
bitmap, zero it, set the size of the allocated memory, set the page size
to be used for the bitmap, and set the flag
VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP.
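As a rough userspace illustration of the preparation steps above (allocate, zero, set size and page size), a minimal sketch could look like this. struct dirty_bitmap_desc and both helpers are hypothetical stand-ins rather than the UAPI definitions, and rounding the size up to whole 64-bit words is an assumption taken from the DIRTY_BITMAP_BYTES() macro elsewhere in this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace mirror of the bitmap descriptor in this series. */
struct dirty_bitmap_desc {
    uint64_t pgsize;   /* page size for the bitmap, in bytes */
    uint64_t size;     /* bitmap size, in bytes */
    uint64_t *data;    /* one bit per page, zeroed by the caller */
};

/* Bytes needed for one bit per page, rounded up to whole 64-bit words. */
static uint64_t dirty_bitmap_bytes(uint64_t npages)
{
    return (npages / 64 + (npages % 64 != 0)) * 8;
}

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
static int prepare_unmap_bitmap(struct dirty_bitmap_desc *b,
                                uint64_t range_size, uint64_t pgsize)
{
    assert(b);

    /* pgsize must be a nonzero power of two that divides the range */
    if (!pgsize || (pgsize & (pgsize - 1)) ||
        !range_size || (range_size % pgsize)) {
        return -1;
    }

    b->pgsize = pgsize;
    b->size = dirty_bitmap_bytes(range_size / pgsize);
    b->data = calloc(1, b->size);      /* calloc returns zeroed memory */
    return b->data ? 0 : -1;
}
```

The caller would then set VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP in the unmap request and free the buffer once the bitmap has been consumed.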

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 drivers/vfio/vfio_iommu_type1.c | 61 +
 include/uapi/linux/vfio.h   | 11 
 2 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 057614c90900..1c240d47d681 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1049,23 +1049,25 @@ static int verify_bitmap_size(uint64_t npages, uint64_t bitmap_size)
 }
 
 static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
-struct vfio_iommu_type1_dma_unmap *unmap)
+struct vfio_iommu_type1_dma_unmap *unmap,
+struct vfio_bitmap *bitmap)
 {
-   uint64_t mask;
struct vfio_dma *dma, *dma_last = NULL;
-   size_t unmapped = 0;
+   size_t unmapped = 0, pgsize;
int ret = 0, retries = 0;
+   unsigned long pgshift;
 
mutex_lock(&iommu->lock);
 
-   mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1;
+   pgshift = __ffs(iommu->pgsize_bitmap);
+   pgsize = (size_t)1 << pgshift;
 
-   if (unmap->iova & mask) {
+   if (unmap->iova & (pgsize - 1)) {
ret = -EINVAL;
goto unlock;
}
 
-   if (!unmap->size || unmap->size & mask) {
+   if (!unmap->size || unmap->size & (pgsize - 1)) {
ret = -EINVAL;
goto unlock;
}
@@ -1076,9 +1078,15 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
goto unlock;
}
 
-   WARN_ON(mask & PAGE_MASK);
-again:
+   /* When dirty tracking is enabled, allow only min supported pgsize */
+   if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
+   (!iommu->dirty_page_tracking || (bitmap->pgsize != pgsize))) {
+   ret = -EINVAL;
+   goto unlock;
+   }
 
+   WARN_ON((pgsize - 1) & PAGE_MASK);
+again:
/*
 * vfio-iommu-type1 (v1) - User mappings were coalesced together to
 * avoid tracking individual mappings.  This means that the granularity
@@ -1159,6 +1167,14 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
mutex_lock(&iommu->lock);
goto again;
}
+
+   if (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
+   ret = update_user_bitmap(bitmap->data, dma,
+unmap->iova, pgsize);
+   if (ret)
+   break;
+   }
+
unmapped += dma->size;
vfio_remove_dma(iommu, dma);
}
@@ -2497,17 +2513,40 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
} else if (cmd == VFIO_IOMMU_UNMAP_DMA) {
struct vfio_iommu_type1_dma_unmap unmap;
-   long ret;
+   struct vfio_bitmap bitmap = { 0 };
+   int ret;
 
minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size);
 
if (copy_from_user(&unmap, (void __user *)arg, minsz))
return -EFAULT;
 
-   if (unmap.argsz < minsz || unmap.flags)
+   if (unmap.argsz < minsz ||
+   unmap.flags & ~VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP)
return -EINVAL;
 
-   ret = vfio_dma_do_unmap(iommu, &unmap);
+   if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
+   unsigned long pgshift;
+
+   if (unmap.argsz < (minsz + sizeof(bitmap)))
+   return -EINVAL;
+
+   if (copy_from_user(&bitmap,
+  (void __user *)(arg + minsz),
+  sizeof(bitmap)))
+   return -EFAULT;
+
+   if (!access_ok((void __user *)bitmap.data, bitmap.size))
+   return -EINVAL;
+
+   pgshift = __ffs(bitmap.pgsize);
+   ret = verify_bitmap_size(unmap.size >> pgshift,
+bitmap.size);
+   if (ret)
+

[PATCH Kernel v24 8/8] vfio: Selective dirty page tracking if IOMMU backed device pins pages

2020-05-28 Thread Kirti Wankhede

Added a check such that only singleton IOMMU groups can pin pages.
From the point when the vendor driver pins any pages, the dirty page
scope of that IOMMU group is considered limited to pinned pages.

To avoid walking the list often, added the flag pinned_page_dirty_scope,
which indicates whether the dirty page scope of every vfio_group of every
vfio_domain in the domain_list is limited to pinned pages. This flag is
updated on the first pinned pages request for that IOMMU group and on
attaching/detaching a group.
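The aggregation behind the container-wide flag can be modelled in a few lines of standalone C. This is a simplified sketch, not the kernel code: struct group_model and container_pinned_scope() are hypothetical names, and the nested domain/group lists are flattened into one array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for a group attached to the container. */
struct group_model {
    bool pinned_page_dirty_scope;
};

/*
 * The container-wide scope is "pinned pages only" when every attached
 * group has already limited its own scope to pinned pages.
 */
static bool container_pinned_scope(const struct group_model *groups, size_t n)
{
    assert(groups || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!groups[i].pinned_page_dirty_scope) {
            return false;
        }
    }
    return true;
}
```

Caching the result in one flag means the dirty bitmap path does not need to repeat this walk on every query; the walk is only needed when a group pins pages for the first time or is attached or detached.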

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Yan Zhao 
---
 drivers/vfio/vfio.c |  13 +++--
 drivers/vfio/vfio_iommu_type1.c | 103 +---
 include/linux/vfio.h|   4 +-
 3 files changed, 109 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 765e0e5d83ed..580099afeaff 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -85,6 +85,7 @@ struct vfio_group {
atomic_topened;
wait_queue_head_t   container_q;
boolnoiommu;
+   unsigned intdev_counter;
struct kvm  *kvm;
struct blocking_notifier_head   notifier;
 };
@@ -555,6 +556,7 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
 
mutex_lock(&group->device_lock);
list_add(&device->group_next, &group->device_list);
+   group->dev_counter++;
mutex_unlock(&group->device_lock);
 
return device;
@@ -567,6 +569,7 @@ static void vfio_device_release(struct kref *kref)
struct vfio_group *group = device->group;
 
list_del(&device->group_next);
+   group->dev_counter--;
mutex_unlock(&group->device_lock);
 
dev_set_drvdata(device->dev, NULL);
@@ -1945,6 +1948,9 @@ int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage,
if (!group)
return -ENODEV;
 
+   if (group->dev_counter > 1)
+   return -EINVAL;
+
ret = vfio_group_add_container_user(group);
if (ret)
goto err_pin_pages;
@@ -1952,7 +1958,8 @@ int vfio_pin_pages(struct device *dev, unsigned long *user_pfn, int npage,
container = group->container;
driver = container->iommu_driver;
if (likely(driver && driver->ops->pin_pages))
-   ret = driver->ops->pin_pages(container->iommu_data, user_pfn,
+   ret = driver->ops->pin_pages(container->iommu_data,
+group->iommu_group, user_pfn,
 npage, prot, phys_pfn);
else
ret = -ENOTTY;
@@ -2050,8 +2057,8 @@ int vfio_group_pin_pages(struct vfio_group *group,
driver = container->iommu_driver;
if (likely(driver && driver->ops->pin_pages))
ret = driver->ops->pin_pages(container->iommu_data,
-user_iova_pfn, npage,
-prot, phys_pfn);
+group->iommu_group, user_iova_pfn,
+npage, prot, phys_pfn);
else
ret = -ENOTTY;
 
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index f27c29df6fc5..97a29bc04d5d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -73,6 +73,7 @@ struct vfio_iommu {
boolv2;
boolnesting;
booldirty_page_tracking;
+   boolpinned_page_dirty_scope;
 };
 
 struct vfio_domain {
@@ -100,6 +101,7 @@ struct vfio_group {
struct iommu_group  *iommu_group;
struct list_headnext;
boolmdev_group; /* An mdev group */
+   boolpinned_page_dirty_scope;
 };
 
 struct vfio_iova {
@@ -143,6 +145,10 @@ struct vfio_regions {
 
 static int put_pfn(unsigned long pfn, int prot);
 
+static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
+  struct iommu_group *iommu_group);
+
+static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu);
 /*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
@@ -622,11 +628,13 @@ static int vfio_unpin_page_external(struct vfio_dma *dma, dma_addr_t iova,
 }
 
 static int vfio_iommu_type1_pin_pages(void *iommu_data,
+ struct iommu_group *iommu_group,
  unsigned long *user_pfn,
  int npage, int prot,
  unsigned long *phys_pfn)
 {
struct vfio_iommu *iommu = iommu_data;
+   struct vfio_group *group;
int i, j, ret;
unsigned long remote_vaddr;

[PATCH Kernel v24 2/8] vfio iommu: Remove atomicity of ref_count of pinned pages

2020-05-28 Thread Kirti Wankhede

vfio_pfn.ref_count is always updated while holding iommu->lock, so using
an atomic variable is overkill.

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Eric Auger 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 drivers/vfio/vfio_iommu_type1.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5290c7a00bbc..fef7cd9a1747 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -112,7 +112,7 @@ struct vfio_pfn {
struct rb_node  node;
dma_addr_t  iova;   /* Device address */
unsigned long   pfn;/* Host pfn */
-   atomic_tref_count;
+   unsigned intref_count;
 };
 
 struct vfio_regions {
@@ -233,7 +233,7 @@ static int vfio_add_to_pfn_list(struct vfio_dma *dma, dma_addr_t iova,
 
vpfn->iova = iova;
vpfn->pfn = pfn;
-   atomic_set(&vpfn->ref_count, 1);
+   vpfn->ref_count = 1;
vfio_link_pfn(dma, vpfn);
return 0;
 }
@@ -251,7 +251,7 @@ static struct vfio_pfn *vfio_iova_get_vfio_pfn(struct vfio_dma *dma,
struct vfio_pfn *vpfn = vfio_find_vpfn(dma, iova);
 
if (vpfn)
-   atomic_inc(&vpfn->ref_count);
+   vpfn->ref_count++;
return vpfn;
 }
 
@@ -259,7 +259,8 @@ static int vfio_iova_put_vfio_pfn(struct vfio_dma *dma, struct vfio_pfn *vpfn)
 {
int ret = 0;
 
-   if (atomic_dec_and_test(&vpfn->ref_count)) {
+   vpfn->ref_count--;
+   if (!vpfn->ref_count) {
ret = put_pfn(vpfn->pfn, dma->prot);
vfio_remove_from_pfn_list(dma, vpfn);
}
-- 
2.7.0

[PATCH Kernel v24 5/8] vfio iommu: Implementation of ioctl for dirty pages tracking

2020-05-28 Thread Kirti Wankhede

VFIO_IOMMU_DIRTY_PAGES ioctl performs three operations:
- Start dirty pages tracking while migration is active
- Stop dirty pages tracking.
- Get dirty pages bitmap. It is the user space application's responsibility
  to copy the content of dirty pages from source to destination during
  migration.

To prevent DoS attack, memory for bitmap is allocated per vfio_dma
structure. Bitmap size is calculated considering the smallest supported
page size. Bitmap is allocated for all vfio_dmas when dirty logging is
enabled.

Bitmap is populated for already pinned pages when the bitmap is allocated
for a vfio_dma with the smallest supported page size. The bitmap is updated
from the pinning functions when tracking is enabled. When the user
application queries the bitmap, check whether the requested page size is
the same as the page size used to populate the bitmap. If it is equal, copy
the bitmap; if not, return an error.
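The size limit and the per-page bit arithmetic used by this patch can be checked with a small standalone model. It is a sketch under assumptions: helper names are made up, a 32-bit int is assumed (so INT_MAX is 2^31 - 1), and bits are numbered from the least significant bit of each 64-bit word, which matches the kernel bitmap layout on 64-bit little-endian hosts.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Largest number of pages one bitmap may describe (mirrors the patch). */
#define BITMAP_PAGES_MAX ((uint64_t)INT_MAX)

/* One bit per page, rounded up to whole 64-bit words, in bytes. */
static uint64_t bitmap_bytes(uint64_t npages)
{
    return (npages / 64 + (npages % 64 != 0)) * 8;
}

/* Bit index of the page containing iova in a mapping that starts at base. */
static uint64_t page_bit(uint64_t base, uint64_t iova, uint64_t pgsize)
{
    assert(pgsize && iova >= base);
    return (iova - base) / pgsize;
}

static void set_page_dirty(uint64_t *bitmap, uint64_t bit)
{
    bitmap[bit / 64] |= (uint64_t)1 << (bit % 64);
}

static int test_page_dirty(const uint64_t *bitmap, uint64_t bit)
{
    return (int)((bitmap[bit / 64] >> (bit % 64)) & 1);
}
```

With these definitions, bitmap_bytes(BITMAP_PAGES_MAX) evaluates to 2^28 bytes (256 MB), and BITMAP_PAGES_MAX pages of 4K cover just under 2^43 bytes (8 TB), in line with the comment in the patch.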

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Yan Zhao 

Fixed error reported by build bot by changing pgsize type from uint64_t
to size_t.
Reported-by: kbuild test robot 
---
 drivers/vfio/vfio_iommu_type1.c | 314 +++-
 1 file changed, 308 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 814c795a2543..057614c90900 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -72,6 +72,7 @@ struct vfio_iommu {
uint64_tpgsize_bitmap;
boolv2;
boolnesting;
+   booldirty_page_tracking;
 };
 
 struct vfio_domain {
@@ -92,6 +93,7 @@ struct vfio_dma {
boollock_cap;   /* capable(CAP_IPC_LOCK) */
struct task_struct  *task;
struct rb_root  pfn_list;   /* Ex-user pinned pfn list */
+   unsigned long   *bitmap;
 };
 
 struct vfio_group {
@@ -126,6 +128,19 @@ struct vfio_regions {
 #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)\
(!list_empty(&iommu->domain_list))
 
+#define DIRTY_BITMAP_BYTES(n)  (ALIGN(n, BITS_PER_TYPE(u64)) / BITS_PER_BYTE)
+
+/*
+ * Input argument of number of bits to bitmap_set() is unsigned integer, which
+ * further casts to signed integer for unaligned multi-bit operation,
+ * __bitmap_set().
+ * Then maximum bitmap size supported is 2^31 bits divided by 2^3 bits/byte,
+ * that is 2^28 (256 MB) which maps to 2^31 * 2^12 = 2^43 (8TB) on 4K page
+ * system.
+ */
+#define DIRTY_BITMAP_PAGES_MAX  ((u64)INT_MAX)
+#define DIRTY_BITMAP_SIZE_MAX   DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
+
 static int put_pfn(unsigned long pfn, int prot);
 
 /*
@@ -176,6 +191,80 @@ static void vfio_unlink_dma(struct vfio_iommu *iommu, struct vfio_dma *old)
rb_erase(&old->node, &iommu->dma_list);
 }
 
+
+static int vfio_dma_bitmap_alloc(struct vfio_dma *dma, size_t pgsize)
+{
+   uint64_t npages = dma->size / pgsize;
+
+   if (npages > DIRTY_BITMAP_PAGES_MAX)
+   return -EINVAL;
+
+   /*
+* Allocate extra 64 bits that are used to calculate shift required for
+* bitmap_shift_left() to manipulate and club unaligned number of pages
+* in adjacent vfio_dma ranges.
+*/
+   dma->bitmap = kvzalloc(DIRTY_BITMAP_BYTES(npages) + sizeof(u64),
+  GFP_KERNEL);
+   if (!dma->bitmap)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static void vfio_dma_bitmap_free(struct vfio_dma *dma)
+{
+   kvfree(dma->bitmap);
+   dma->bitmap = NULL;
+}
+
+static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
+{
+   struct rb_node *p;
+
+   for (p = rb_first(&dma->pfn_list); p; p = rb_next(p)) {
+   struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn, node);
+
+   bitmap_set(dma->bitmap, (vpfn->iova - dma->iova) / pgsize, 1);
+   }
+}
+
+static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
+{
+   struct rb_node *n;
+
+   for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+   struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+   int ret;
+
+   ret = vfio_dma_bitmap_alloc(dma, pgsize);
+   if (ret) {
+   struct rb_node *p;
+
+   for (p = rb_prev(n); p; p = rb_prev(p)) {
+   struct vfio_dma *dma = rb_entry(p,
+   struct vfio_dma, node);
+
+   vfio_dma_bitmap_free(dma);
+   }
+   return ret;
+   }
+   vfio_dma_populate_bitmap(dma, pgsize);
+   }
+   return 0;
+}
+
+static void vfio_dma_bitmap_free_all(struct vfio_iommu *iommu)
+{
+   struct rb_node *n;
+
+   for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+   struct vfio_dma *dma = rb_entry(n, struct

[PATCH Kernel v24 1/8] vfio: UAPI for migration interface for device state

2020-05-28 Thread Kirti Wankhede

- Defined MIGRATION region type and sub-type.

- Defined vfio_device_migration_info structure which will be placed at the
  0th offset of migration region to get/set VFIO device related
  information. Defined members of structure and usage on read/write access.

- Defined device states and state transition details.

- Defined sequence to be followed while saving and resuming VFIO device.
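The 3-bit device_state encoding documented in the header comment of this patch can be summarised by a small classifier. The names below (STATE_*, enum state_kind, classify_device_state(), set_device_state()) are hypothetical and are not the UAPI macros; the sketch only mirrors the state table (101b and 111b invalid, 110b error) and the read-modify-write rule for the reserved bits.

```c
#include <assert.h>
#include <stdint.h>

#define STATE_RUNNING  (1u << 0)
#define STATE_SAVING   (1u << 1)
#define STATE_RESUMING (1u << 2)
#define STATE_MASK     (STATE_RUNNING | STATE_SAVING | STATE_RESUMING)

enum state_kind { KIND_VALID, KIND_ERROR, KIND_INVALID };

/* Classify the low three bits according to the table in the patch. */
static enum state_kind classify_device_state(uint32_t device_state)
{
    switch (device_state & STATE_MASK) {
    case STATE_SAVING | STATE_RESUMING:                  /* 110b */
        return KIND_ERROR;
    case STATE_RUNNING | STATE_RESUMING:                 /* 101b */
    case STATE_RUNNING | STATE_SAVING | STATE_RESUMING:  /* 111b */
        return KIND_INVALID;
    default:                                             /* 000b..100b */
        return KIND_VALID;
    }
}

/* Read-modify-write: replace only bits 0-2 and keep the reserved bits. */
static uint32_t set_device_state(uint32_t current, uint32_t new_bits)
{
    assert((new_bits & ~STATE_MASK) == 0);
    return (current & ~STATE_MASK) | new_bits;
}
```

For example, moving from running (001b) to pre-copy (011b) would be set_device_state(cur, STATE_RUNNING | STATE_SAVING), with whatever is stored in bits 3 to 31 left untouched.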

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 include/uapi/linux/vfio.h | 228 ++
 1 file changed, 228 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 015516bcfaa3..ad9bb5af3463 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -305,6 +305,7 @@ struct vfio_region_info_cap_type {
 #define VFIO_REGION_TYPE_PCI_VENDOR_MASK   (0xffff)
 #define VFIO_REGION_TYPE_GFX(1)
 #define VFIO_REGION_TYPE_CCW   (2)
+#define VFIO_REGION_TYPE_MIGRATION  (3)
 
 /* sub-types for VFIO_REGION_TYPE_PCI_* */
 
@@ -379,6 +380,233 @@ struct vfio_region_gfx_edid {
 /* sub-types for VFIO_REGION_TYPE_CCW */
 #define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD  (1)
 
+/* sub-types for VFIO_REGION_TYPE_MIGRATION */
+#define VFIO_REGION_SUBTYPE_MIGRATION   (1)
+
+/*
+ * The structure vfio_device_migration_info is placed at the 0th offset of
+ * the VFIO_REGION_SUBTYPE_MIGRATION region to get and set VFIO device related
+ * migration information. Field accesses from this structure are only supported
+ * at their native width and alignment. Otherwise, the result is undefined and
+ * vendor drivers should return an error.
+ *
+ * device_state: (read/write)
+ *  - The user application writes to this field to inform the vendor driver
+ *about the device state to be transitioned to.
+ *  - The vendor driver should take the necessary actions to change the
+ *device state. After successful transition to a given state, the
+ *vendor driver should return success on write(device_state, state)
+ *system call. If the device state transition fails, the vendor driver
+ *should return an appropriate -errno for the fault condition.
+ *  - On the user application side, if the device state transition fails,
+ *   that is, if write(device_state, state) returns an error, read
+ *   device_state again to determine the current state of the device from
+ *   the vendor driver.
+ *  - The vendor driver should return previous state of the device unless
+ *the vendor driver has encountered an internal error, in which case
+ *the vendor driver may report the device_state VFIO_DEVICE_STATE_ERROR.
+ *  - The user application must use the device reset ioctl to recover the
+ *device from VFIO_DEVICE_STATE_ERROR state. If the device is
+ *indicated to be in a valid device state by reading device_state, the
+ *user application may attempt to transition the device to any valid
+ *state reachable from the current state or terminate itself.
+ *
+ *  device_state consists of 3 bits:
+ *  - If bit 0 is set, it indicates the _RUNNING state. If bit 0 is clear,
+ *it indicates the _STOP state. When the device state is changed to
+ *_STOP, driver should stop the device before write() returns.
+ *  - If bit 1 is set, it indicates the _SAVING state, which means that the
+ *driver should start gathering device state information that will be
+ *provided to the VFIO user application to save the device's state.
+ *  - If bit 2 is set, it indicates the _RESUMING state, which means that
+ *the driver should prepare to resume the device. Data provided through
+ *the migration region should be used to resume the device.
+ *  Bits 3 - 31 are reserved for future use. To preserve them, the user
+ *  application should perform a read-modify-write operation on this
+ *  field when modifying the specified bits.
+ *
+ *  +--- _RESUMING
+ *  |+-- _SAVING
+ *  ||+- _RUNNING
+ *  |||
+ *  000b => Device Stopped, not saving or resuming
+ *  001b => Device running, which is the default state
+ *  010b => Stop the device & save the device state, stop-and-copy state
+ *  011b => Device running and save the device state, pre-copy state
+ *  100b => Device stopped and the device state is resuming
+ *  101b => Invalid state
+ *  110b => Error state
+ *  111b => Invalid state
+ *
+ * State transitions:
+ *
+ *  _RESUMING  _RUNNINGPre-copyStop-and-copy   _STOP
+ *(100b) (001b) (011b)(010b)   (000b)
+ * 0. Running or default state
+ * |
+ *
+ * 1. Normal Shutdown (optional)
+ * |->|
+ *
+ * 2. Save the state or suspend
+ *

[PATCH Kernel v24 3/8] vfio iommu: Cache pgsize_bitmap in struct vfio_iommu

2020-05-28 Thread Kirti Wankhede

Calculate and cache pgsize_bitmap when iommu->domain_list is updated
and iommu->external_domain is set for mdev device.
Add iommu->lock protection when cached pgsize_bitmap is accessed.
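The value being cached is the intersection of the page-size bitmaps of all domains, with sub-PAGE_SIZE sizes folded into PAGE_SIZE. A userspace model of that computation, assuming a 4K host page size and using made-up helper names, might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ull                  /* assumed host page size */
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))

/* Intersect the supported page sizes of all domains in the container. */
static uint64_t combined_pgsize_bitmap(const uint64_t *domain_bitmaps,
                                       size_t n)
{
    uint64_t bitmap = UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        bitmap &= domain_bitmaps[i];
    }

    /* Sizes below the host page size are reported as PAGE_SIZE instead. */
    if (bitmap & ~MODEL_PAGE_MASK) {
        bitmap &= MODEL_PAGE_MASK;
        bitmap |= MODEL_PAGE_SIZE;
    }
    return bitmap;
}

/* Alignment mask for the smallest supported page size. */
static uint64_t min_pgsize_mask(uint64_t bitmap)
{
    assert(bitmap);
    return (bitmap & (~bitmap + 1)) - 1;   /* lowest set bit, minus one */
}
```

Computing this once per domain list change, rather than on every map and unmap, is what lets the callers read the cached value while already holding the lock.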

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 drivers/vfio/vfio_iommu_type1.c | 88 +++--
 1 file changed, 49 insertions(+), 39 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index fef7cd9a1747..814c795a2543 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -69,6 +69,7 @@ struct vfio_iommu {
struct rb_root  dma_list;
struct blocking_notifier_head notifier;
unsigned intdma_avail;
+   uint64_tpgsize_bitmap;
boolv2;
boolnesting;
 };
@@ -835,15 +836,14 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
iommu->dma_avail++;
 }
 
-static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
+static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu)
 {
struct vfio_domain *domain;
-   unsigned long bitmap = ULONG_MAX;
 
-   mutex_lock(&iommu->lock);
+   iommu->pgsize_bitmap = ULONG_MAX;
+
list_for_each_entry(domain, &iommu->domain_list, next)
-   bitmap &= domain->domain->pgsize_bitmap;
-   mutex_unlock(&iommu->lock);
+   iommu->pgsize_bitmap &= domain->domain->pgsize_bitmap;
 
/*
 * In case the IOMMU supports page sizes smaller than PAGE_SIZE
@@ -853,12 +853,10 @@ static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
 * granularity while iommu driver can use the sub-PAGE_SIZE size
 * to map the buffer.
 */
-   if (bitmap & ~PAGE_MASK) {
-   bitmap &= PAGE_MASK;
-   bitmap |= PAGE_SIZE;
+   if (iommu->pgsize_bitmap & ~PAGE_MASK) {
+   iommu->pgsize_bitmap &= PAGE_MASK;
+   iommu->pgsize_bitmap |= PAGE_SIZE;
}
-
-   return bitmap;
 }
 
 static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
@@ -869,19 +867,28 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
size_t unmapped = 0;
int ret = 0, retries = 0;
 
-   mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
+   mutex_lock(&iommu->lock);
+
+   mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1;
+
+   if (unmap->iova & mask) {
+   ret = -EINVAL;
+   goto unlock;
+   }
+
+   if (!unmap->size || unmap->size & mask) {
+   ret = -EINVAL;
+   goto unlock;
+   }
 
-   if (unmap->iova & mask)
-   return -EINVAL;
-   if (!unmap->size || unmap->size & mask)
-   return -EINVAL;
if (unmap->iova + unmap->size - 1 < unmap->iova ||
-   unmap->size > SIZE_MAX)
-   return -EINVAL;
+   unmap->size > SIZE_MAX) {
+   ret = -EINVAL;
+   goto unlock;
+   }
 
WARN_ON(mask & PAGE_MASK);
 again:
-   mutex_lock(&iommu->lock);
 
/*
 * vfio-iommu-type1 (v1) - User mappings were coalesced together to
@@ -960,6 +967,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
blocking_notifier_call_chain(&iommu->notifier,
VFIO_IOMMU_NOTIFY_DMA_UNMAP,
&nb_unmap);
+   mutex_lock(&iommu->lock);
goto again;
}
unmapped += dma->size;
@@ -1075,24 +1083,28 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
if (map->size != size || map->vaddr != vaddr || map->iova != iova)
return -EINVAL;
 
-   mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
-
-   WARN_ON(mask & PAGE_MASK);
-
/* READ/WRITE from device perspective */
if (map->flags & VFIO_DMA_MAP_FLAG_WRITE)
prot |= IOMMU_WRITE;
if (map->flags & VFIO_DMA_MAP_FLAG_READ)
prot |= IOMMU_READ;
 
-   if (!prot || !size || (size | iova | vaddr) & mask)
-   return -EINVAL;
+   mutex_lock(&iommu->lock);
 
-   /* Don't allow IOVA or virtual address wrap */
-   if (iova + size - 1 < iova || vaddr + size - 1 < vaddr)
-   return -EINVAL;
+   mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1;
 
-   mutex_lock(&iommu->lock);
+   WARN_ON(mask & PAGE_MASK);
+
+   if (!prot || !size || (size | iova | vaddr) & mask) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   /* Don't allow IOVA or virtual address wrap */
+   if (iova + size - 1 < iova || vaddr + size - 1 < vaddr) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
 
if (vfio_find_dma(iommu, iova, size)) {

[PATCH Kernel v24 7/8] vfio iommu: Add migration capability to report supported features

2020-05-28 Thread Kirti Wankhede

Added migration capability in IOMMU info chain.
User application should check the IOMMU info chain for the migration
capability before using the dirty page tracking feature provided by the
kernel module.
User application must check the supported page sizes and the maximum dirty
bitmap size returned by this capability structure for the ioctls used to
get the dirty bitmap.
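A user application could validate a planned dirty bitmap query against the values reported by this capability roughly as follows. The struct and function are hypothetical mirrors written for this sketch, not the UAPI types, and the size formula (one bit per page, rounded up to 64-bit words) is an assumption based on DIRTY_BITMAP_BYTES() from the earlier patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the values reported by the migration capability. */
struct migration_cap_model {
    uint32_t flags;
    uint64_t pgsize_bitmap;           /* supported page sizes */
    uint64_t max_dirty_bitmap_size;   /* in bytes */
};

/* Returns 1 if a query for range_size bytes at pgsize granularity fits. */
static int dirty_query_allowed(const struct migration_cap_model *cap,
                               uint64_t pgsize, uint64_t range_size)
{
    uint64_t npages, bytes;

    assert(cap);

    /* pgsize must be a single power of two that the kernel advertises */
    if (!pgsize || (pgsize & (pgsize - 1)) ||
        !(cap->pgsize_bitmap & pgsize)) {
        return 0;
    }
    if (!range_size || (range_size % pgsize)) {
        return 0;
    }

    npages = range_size / pgsize;
    bytes = (npages / 64 + (npages % 64 != 0)) * 8;
    return bytes <= cap->max_dirty_bitmap_size;
}
```

A range that is too large for one bitmap can be split by the caller into several smaller queries.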

Signed-off-by: Kirti Wankhede 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 drivers/vfio/vfio_iommu_type1.c | 23 ++-
 include/uapi/linux/vfio.h   | 23 +++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 1c240d47d681..f27c29df6fc5 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2423,6 +2423,22 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
return ret;
 }
 
+static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu,
+  struct vfio_info_cap *caps)
+{
+   struct vfio_iommu_type1_info_cap_migration cap_mig;
+
+   cap_mig.header.id = VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION;
+   cap_mig.header.version = 1;
+
+   cap_mig.flags = 0;
+   /* support minimum pgsize */
+   cap_mig.pgsize_bitmap = (size_t)1 << __ffs(iommu->pgsize_bitmap);
+   cap_mig.max_dirty_bitmap_size = DIRTY_BITMAP_SIZE_MAX;
+
+   return vfio_info_add_capability(caps, &cap_mig.header, sizeof(cap_mig));
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
 {
@@ -2469,8 +2485,13 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
info.iova_pgsizes = iommu->pgsize_bitmap;
 
-   ret = vfio_iommu_iova_build_caps(iommu, &caps);
+   ret = vfio_iommu_migration_build_caps(iommu, &caps);
+
+   if (!ret)
+   ret = vfio_iommu_iova_build_caps(iommu, &caps);
+
mutex_unlock(&iommu->lock);
+
if (ret)
return ret;
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index ff4b6706f7df..fde4692a6989 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1013,6 +1013,29 @@ struct vfio_iommu_type1_info_cap_iova_range {
struct  vfio_iova_range iova_ranges[];
 };
 
+/*
+ * The migration capability allows to report supported features for migration.
+ *
+ * The structures below define version 1 of this capability.
+ *
+ * The existence of this capability indicates that IOMMU kernel driver supports
+ * dirty page logging.
+ *
+ * pgsize_bitmap: Kernel driver returns bitmap of supported page sizes for dirty
+ * page logging.
+ * max_dirty_bitmap_size: Kernel driver returns maximum supported dirty bitmap
+ * size in bytes that can be used by user applications when getting the dirty
+ * bitmap.
+ */
+#define VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION  1
+
+struct vfio_iommu_type1_info_cap_migration {
+   struct  vfio_info_cap_header header;
+   __u32   flags;
+   __u64   pgsize_bitmap;
+   __u64   max_dirty_bitmap_size;  /* in bytes */
+};
+
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
 
 /**
-- 
2.7.0

[PATCH Kernel v24 4/8] vfio iommu: Add ioctl definition for dirty pages tracking

2020-05-28 Thread Kirti Wankhede

IOMMU container maintains a list of all pages pinned by vfio_pin_pages API.
All pages pinned by vendor driver through this API should be considered as
dirty during migration. When container consists of IOMMU capable device and
all pages are pinned and mapped, then all pages are marked dirty.
Added support to start/stop dirtied pages tracking and to get bitmap of all
dirtied pages for requested IO virtual address range.
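To illustrate how the request for the GET_BITMAP operation is laid out, here is a userspace sketch that builds the header plus the range descriptor in one buffer. All struct, macro and function names are local stand-ins chosen for this example; they only imitate the layout described in the header comment of this patch (argsz covers the header and the range descriptor, but not the bitmap memory itself).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FLAG_START      (1u << 0)
#define FLAG_STOP       (1u << 1)
#define FLAG_GET_BITMAP (1u << 2)
#define FLAG_ALL        (FLAG_START | FLAG_STOP | FLAG_GET_BITMAP)

struct dirty_bitmap_hdr {            /* stand-in for the ioctl argument */
    uint32_t argsz;
    uint32_t flags;
    uint8_t  data[];
};

struct bitmap_desc {
    uint64_t pgsize;                 /* page size for bitmap in bytes */
    uint64_t size;                   /* bitmap size in bytes */
    uint64_t *data;                  /* one bit per page */
};

struct dirty_bitmap_get {
    uint64_t iova;                   /* IO virtual address */
    uint64_t size;                   /* size of iova range */
    struct bitmap_desc bitmap;
};

/* Exactly one of START, STOP and GET_BITMAP may be set, nothing else. */
static int exactly_one_flag(uint32_t flags)
{
    return flags && !(flags & ~FLAG_ALL) && !(flags & (flags - 1));
}

/* Build a GET_BITMAP request; bits must be zeroed and bits_bytes long. */
static struct dirty_bitmap_hdr *build_get_request(uint64_t iova,
                                                  uint64_t size,
                                                  uint64_t pgsize,
                                                  uint64_t *bits,
                                                  uint64_t bits_bytes)
{
    size_t argsz = sizeof(struct dirty_bitmap_hdr) +
                   sizeof(struct dirty_bitmap_get);
    struct dirty_bitmap_get get = { iova, size,
                                    { pgsize, bits_bytes, bits } };
    struct dirty_bitmap_hdr *hdr = calloc(1, argsz);

    if (!hdr) {
        return NULL;
    }
    hdr->argsz = (uint32_t)argsz;    /* excludes the bitmap memory */
    hdr->flags = FLAG_GET_BITMAP;
    memcpy(hdr->data, &get, sizeof(get));
    return hdr;
}
```

The returned buffer is what would be passed to the ioctl; afterwards bit N of the bitmap refers to the page at iova + N * pgsize.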

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
Reviewed-by: Cornelia Huck 
Reviewed-by: Yan Zhao 
---
 include/uapi/linux/vfio.h | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index ad9bb5af3463..009a8c80079d 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1033,6 +1033,12 @@ struct vfio_iommu_type1_dma_map {
 
 #define VFIO_IOMMU_MAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 13)
 
+struct vfio_bitmap {
+   __u64pgsize;/* page size for bitmap in bytes */
+   __u64size;  /* in bytes */
+   __u64 __user *data; /* one bit per page */
+};
+
 /**
  * VFIO_IOMMU_UNMAP_DMA - _IOWR(VFIO_TYPE, VFIO_BASE + 14,
  * struct vfio_dma_unmap)
@@ -1059,6 +1065,57 @@ struct vfio_iommu_type1_dma_unmap {
 #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
 #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/**
+ * VFIO_IOMMU_DIRTY_PAGES - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
+ * struct vfio_iommu_type1_dirty_bitmap)
+ * IOCTL is used for dirty pages logging.
+ * Caller should set flag depending on which operation to perform, details as
+ * below:
+ *
+ * Calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_START flag set, instructs
+ * the IOMMU driver to log pages that are dirtied or potentially dirtied by
+ * the device; designed to be used when a migration is in progress. Dirty pages
+ * are logged until logging is disabled by user application by calling the IOCTL
+ * with VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP flag.
+ *
+ * Calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP flag set, instructs
+ * the IOMMU driver to stop logging dirtied pages.
+ *
+ * Calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP flag set
+ * returns the dirty pages bitmap for IOMMU container for a given IOVA range.
+ * The user must specify the IOVA range and the pgsize through the structure
+ * vfio_iommu_type1_dirty_bitmap_get in the data[] portion. This interface
+ * supports getting a bitmap of the smallest supported pgsize only and can be
+ * modified in future to get a bitmap of any specified supported pgsize. The
+ * user must provide a zeroed memory area for the bitmap memory and specify its
+ * size in bitmap.size. One bit is used to represent one page consecutively
+ * starting from iova offset. The user should provide page size in bitmap.pgsize
+ * field. A bit set in the bitmap indicates that the page at that offset from
+ * iova is dirty. The caller must set argsz to a value including the size of
+ * structure vfio_iommu_type1_dirty_bitmap_get, but excluding the size of the
+ * actual bitmap. If dirty pages logging is not enabled, an error will be
+ * returned.
+ *
+ * Only one of the flags _START, _STOP and _GET may be specified at a time.
+ *
+ */
+struct vfio_iommu_type1_dirty_bitmap {
+   __u32argsz;
+   __u32flags;
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_START  (1 << 0)
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP   (1 << 1)
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP (1 << 2)
+   __u8 data[];
+};
+
+struct vfio_iommu_type1_dirty_bitmap_get {
+   __u64  iova;/* IO virtual address */
+   __u64  size;/* Size of iova range */
+   struct vfio_bitmap bitmap;
+};
+
+#define VFIO_IOMMU_DIRTY_PAGES _IO(VFIO_TYPE, VFIO_BASE + 17)
+
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
 /*
-- 
2.7.0
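
For readers following the GET_BITMAP description in the comment above, here is a minimal sketch of the arithmetic a caller would probably need: sizing the bitmap, computing argsz, and testing one bit. It does not use the kernel headers. The sketch_* structures are local stand-ins for the layout quoted in the patch, the data pointer in sketch_vfio_bitmap is an assumption (the patch text only names bitmap.pgsize and bitmap.size), and both the rounding to whole 64-bit words and the bit order within a word are assumptions, not something the comment states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the UAPI layout quoted above (hypothetical names). */
struct sketch_vfio_bitmap {
    uint64_t pgsize;   /* page size the bitmap describes */
    uint64_t size;     /* bitmap size in bytes */
    uint64_t *data;    /* assumed: pointer to zeroed bitmap memory */
};

struct sketch_dirty_bitmap {
    uint32_t argsz;
    uint32_t flags;
    /* the real structure continues with a flexible data[] member */
};

struct sketch_dirty_bitmap_get {
    uint64_t iova;     /* IO virtual address */
    uint64_t size;     /* size of iova range */
    struct sketch_vfio_bitmap bitmap;
};

/*
 * Bytes needed for one bit per page over `size` bytes of IOVA space.
 * Rounded up to whole 64-bit words (an assumption of this sketch).
 */
uint64_t dirty_bitmap_bytes(uint64_t size, uint64_t pgsize)
{
    assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);
    uint64_t npages = (size + pgsize - 1) / pgsize;
    uint64_t nwords = (npages + 63) / 64;
    return nwords * sizeof(uint64_t);
}

/* argsz covers the header and the _get payload, not the bitmap memory. */
uint32_t dirty_bitmap_get_argsz(void)
{
    return (uint32_t)(sizeof(struct sketch_dirty_bitmap) +
                      sizeof(struct sketch_dirty_bitmap_get));
}

/* One bit per page, counted consecutively from the start of the range. */
int page_is_dirty(const uint64_t *bitmap, uint64_t start_iova,
                  uint64_t iova, uint64_t pgsize)
{
    uint64_t bit = (iova - start_iova) / pgsize;
    return (int)((bitmap[bit / 64] >> (bit % 64)) & 1);
}
```

With 4 KiB pages, a 1 MiB range covers 256 pages and so needs 32 bytes of zeroed bitmap memory. That value would go in bitmap.size, while argsz stays fixed regardless of the range.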

[PATCH Kernel v24 0/8] Add UAPIs to support migration for VFIO devices

2020-05-28 Thread Kirti Wankhede

Hi,

This patch set adds:
* IOCTL VFIO_IOMMU_DIRTY_PAGES to get dirty pages bitmap with
  respect to IOMMU container rather than per device. All pages pinned by
  vendor driver through vfio_pin_pages external API have to be marked as
  dirty during migration. When an IOMMU capable device is present in the
  container and all pages are pinned and mapped, then all pages are marked
  dirty.
  When there are CPU writes, CPU dirty page tracking can identify dirtied
  pages, but any page pinned by vendor driver can also be written by
  device. As of now there is no device which has hardware support for
  dirty page tracking. So all pages which are pinned should be considered
  as dirty.
  This ioctl is also used to start/stop dirty pages tracking for pinned and
  unpinned pages while migration is active.

* Updated IOCTL VFIO_IOMMU_UNMAP_DMA to get dirty pages bitmap before
  unmapping IO virtual address range.
  With vIOMMU, during pre-copy phase of migration, while CPUs are still
  running, IO virtual address unmap can happen while the device still holds
  references to guest pfns. Those pages should be reported as dirty before
  unmap, so that VFIO user space application can copy content of those
  pages from source to destination.

* Patch 8 detects whether an IOMMU capable device driver is smart enough to
  report pages to be marked dirty by pinning pages using vfio_pin_pages() API.


Yet TODO:
Since there is no device which has hardware support for system memory
dirty bitmap tracking, right now there is no other API from vendor driver
to VFIO IOMMU module to report dirty pages. In future, when such hardware
support is implemented, an API will be required so that vendor
driver can report dirty pages to VFIO module during migration phases.

v23 -> v24
- Fixed nit picks by Cornelia
- Fixed warning reported by test robot.

v22 -> v23
- Fixed issue reported by Yan
https://lore.kernel.org/kvm/97977ede-3c5b-c5a5-7858-7eecd7dd5...@nvidia.com/
- Fixed nit picks suggested by Cornelia

v21 -> v22
- Fixed issue raised by Alex :
https://lore.kernel.org/kvm/20200515163307.72951...@w520.home/

v20 -> v21
- Added checking for GET_BITMAP ioctl for vfio_dma boundaries.
- Updated unmap ioctl function - as suggested by Alex.
- Updated comments in DIRTY_TRACKING ioctl definition - as suggested by
  Cornelia.

v19 -> v20
- Fixed ioctl to get dirty bitmap to get bitmap of multiple vfio_dmas
- Fixed unmap ioctl to get dirty bitmap of multiple vfio_dmas.
- Removed flag definition from migration capability.

v18 -> v19
- Updated migration capability with supported page sizes bitmap for dirty
  page tracking and  maximum bitmap size supported by kernel module.
- Added patch to calculate and cache pgsize_bitmap when iommu->domain_list
  is updated.
- Removed extra buffers added in previous version for bitmap manipulation
  and optimised the code.

v17 -> v18
- Add migration capability to the capability chain for VFIO_IOMMU_GET_INFO
  ioctl
- Updated UNMAP_DMA ioctl to return bitmap of multiple vfio_dma

v16 -> v17
- Fixed errors reported by kbuild test robot  on i386

v15 -> v16
- Minor edits and nit picks (Auger Eric)
- On copying bitmap to user, re-populated bitmap only for pinned pages,
  excluding unmapped pages and CPU dirtied pages.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
  https://lkml.org/lkml/2020/3/12/1255

v14 -> v15
- Minor edits and nit picks.
- In the verification of user allocated bitmap memory, added check of
   maximum size.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
  https://lkml.org/lkml/2020/3/12/1255

v13 -> v14
- Added struct vfio_bitmap to kabi. updated structure
  vfio_iommu_type1_dirty_bitmap_get and vfio_iommu_type1_dma_unmap.
- All small changes suggested by Alex.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
  https://lkml.org/lkml/2020/3/12/1255

v12 -> v13
- Changed bitmap allocation in vfio_iommu_type1 to per vfio_dma
- Changed VFIO_IOMMU_DIRTY_PAGES ioctl behaviour to be per vfio_dma range.
- Changed vfio_iommu_type1_dirty_bitmap structure to have separate data
  field.

v11 -> v12
- Changed bitmap allocation in vfio_iommu_type1.
- Remove atomicity of ref_count.
- Updated comments for migration device state structure about error
  reporting.
- Nit picks from v11 reviews

v10 -> v11
- Fix pin pages API to free vpfn if it is marked as unpinned tracking page.
- Added proposal to detect if IOMMU capable device calls external pin pages
  API to mark pages dirty.
- Nit picks from v10 reviews

v9 -> v10:
- Updated existing VFIO_IOMMU_UNMAP_DMA ioctl to get dirty pages bitmap
  during unmap while migration is active
- Added flag in VFIO_IOMMU_GET_INFO to indicate driver support dirty page
  tracking.
- If iommu_mapped, mark all pages dirty.
- Added unpinned pages tracking while migration is active.
- Updated comments for migration device state structure with bit
  combination table and state transition details.

v8 -> v9:
- Split patch set in

Re: [PATCH v2 0/4] microvm: memory config tweaks

2020-05-28 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20200528134035.32025-1-kra...@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  x86_64-softmmu/softmmu/main.o
  CC  x86_64-softmmu/gdbstub-xml.o
  CC  x86_64-softmmu/trace/generated-helpers.o
/tmp/qemu-test/src/hw/i386/xen/xen-hvm.c:206:34: error: use of undeclared identifier 'X86_MACHINE_MAX_RAM_BELOW_4G'
 X86_MACHINE_MAX_RAM_BELOW_4G,
 ^
1 error generated.
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/i386/xen/xen-hvm.o] Error 1
make[1]: *** Waiting for unfinished jobs
make: *** [Makefile:527: x86_64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
sys.exit(main())
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=0bac4918513149f2a4717d0ab9d331ee', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 
'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 
'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', 
'-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-sdwodinx/src/docker-src.2020-05-28-16.59.46.1:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=0bac4918513149f2a4717d0ab9d331ee
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-sdwodinx/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    4m15.596s
user    0m7.249s


The full log is available at
http://patchew.org/logs/20200528134035.32025-1-kra...@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH Kernel v23 0/8] Add UAPIs to support migration for VFIO devices

2020-05-28 Thread Kirti Wankhede




On 5/28/2020 10:17 AM, Yan Zhao wrote:


The whole series works for us in general:
 Reviewed-by: Yan Zhao 


Thanks.

Kirti




On Wed, May 20, 2020 at 11:38:00PM +0530, Kirti Wankhede wrote:

Hi,

This patch set adds:
* IOCTL VFIO_IOMMU_DIRTY_PAGES to get dirty pages bitmap with
   respect to IOMMU container rather than per device. All pages pinned by
   vendor driver through vfio_pin_pages external API have to be marked as
   dirty during migration. When an IOMMU capable device is present in the
   container and all pages are pinned and mapped, then all pages are marked
   dirty.
   When there are CPU writes, CPU dirty page tracking can identify dirtied
   pages, but any page pinned by vendor driver can also be written by
   device. As of now there is no device which has hardware support for
   dirty page tracking. So all pages which are pinned should be considered
   as dirty.
   This ioctl is also used to start/stop dirty pages tracking for pinned and
   unpinned pages while migration is active.

* Updated IOCTL VFIO_IOMMU_UNMAP_DMA to get dirty pages bitmap before
   unmapping IO virtual address range.
   With vIOMMU, during pre-copy phase of migration, while CPUs are still
   running, IO virtual address unmap can happen while the device still holds
   references to guest pfns. Those pages should be reported as dirty before
   unmap, so that VFIO user space application can copy content of those
   pages from source to destination.

* Patch 8 detects whether an IOMMU capable device driver is smart enough to
   report pages to be marked dirty by pinning pages using vfio_pin_pages() API.


Yet TODO:
Since there is no device which has hardware support for system memory
dirty bitmap tracking, right now there is no other API from vendor driver
to VFIO IOMMU module to report dirty pages. In future, when such hardware
support is implemented, an API will be required so that vendor
driver can report dirty pages to VFIO module during migration phases.

v22 -> v23
- Fixed issue reported by Yan
https://lore.kernel.org/kvm/97977ede-3c5b-c5a5-7858-7eecd7dd5...@nvidia.com/
- Fixed nit picks suggested by Cornelia

v21 -> v22
- Fixed issue raised by Alex :
https://lore.kernel.org/kvm/20200515163307.72951...@w520.home/

v20 -> v21
- Added checking for GET_BITMAP ioctl for vfio_dma boundaries.
- Updated unmap ioctl function - as suggested by Alex.
- Updated comments in DIRTY_TRACKING ioctl definition - as suggested by
   Cornelia.

v19 -> v20
- Fixed ioctl to get dirty bitmap to get bitmap of multiple vfio_dmas
- Fixed unmap ioctl to get dirty bitmap of multiple vfio_dmas.
- Removed flag definition from migration capability.

v18 -> v19
- Updated migration capability with supported page sizes bitmap for dirty
   page tracking and  maximum bitmap size supported by kernel module.
- Added patch to calculate and cache pgsize_bitmap when iommu->domain_list
   is updated.
- Removed extra buffers added in previous version for bitmap manipulation
   and optimised the code.

v17 -> v18
- Add migration capability to the capability chain for VFIO_IOMMU_GET_INFO
   ioctl
- Updated UNMAP_DMA ioctl to return bitmap of multiple vfio_dma

v16 -> v17
- Fixed errors reported by kbuild test robot  on i386

v15 -> v16
- Minor edits and nit picks (Auger Eric)
- On copying bitmap to user, re-populated bitmap only for pinned pages,
   excluding unmapped pages and CPU dirtied pages.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
   https://lkml.org/lkml/2020/3/12/1255

v14 -> v15
- Minor edits and nit picks.
- In the verification of user allocated bitmap memory, added check of
maximum size.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
   https://lkml.org/lkml/2020/3/12/1255

v13 -> v14
- Added struct vfio_bitmap to kabi. updated structure
   vfio_iommu_type1_dirty_bitmap_get and vfio_iommu_type1_dma_unmap.
- All small changes suggested by Alex.
- Patches are on tag: next-20200318 and 1-3 patches from Yan's series
   https://lkml.org/lkml/2020/3/12/1255

v12 -> v13
- Changed bitmap allocation in vfio_iommu_type1 to per vfio_dma
- Changed VFIO_IOMMU_DIRTY_PAGES ioctl behaviour to be per vfio_dma range.
- Changed vfio_iommu_type1_dirty_bitmap structure to have separate data
   field.

v11 -> v12
- Changed bitmap allocation in vfio_iommu_type1.
- Remove atomicity of ref_count.
- Updated comments for migration device state structure about error
   reporting.
- Nit picks from v11 reviews

v10 -> v11
- Fix pin pages API to free vpfn if it is marked as unpinned tracking page.
- Added proposal to detect if IOMMU capable device calls external pin pages
   API to mark pages dirty.
- Nit picks from v10 reviews

v9 -> v10:
- Updated existing VFIO_IOMMU_UNMAP_DMA ioctl to get dirty pages bitmap
   during unmap while migration is active
- Added flag in VFIO_IOMMU_GET_INFO to indicate driver support dirty page
   tracking.
- If iommu_mapped, mark all pages dirty.
- Added unpinned pages tracking while migration

Re: [PATCH v2 0/4] microvm: memory config tweaks

2020-05-28 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20200528134035.32025-1-kra...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  x86_64-softmmu/target/i386/translate.o
  CC  aarch64-softmmu/hw/arm/omap1.o
/tmp/qemu-test/src/hw/i386/xen/xen-hvm.c: In function 'xen_ram_init':
/tmp/qemu-test/src/hw/i386/xen/xen-hvm.c:206:34: error: 'X86_MACHINE_MAX_RAM_BELOW_4G' undeclared (first use in this function)
  X86_MACHINE_MAX_RAM_BELOW_4G,
  ^
/tmp/qemu-test/src/hw/i386/xen/xen-hvm.c:206:34: note: each undeclared identifier is reported only once for each function it appears in
make[1]: *** [hw/i386/xen/xen-hvm.o] Error 1
  CC  aarch64-softmmu/hw/arm/omap2.o
make[1]: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/hw/arm/strongarm.o
---
  CC  aarch64-softmmu/target/arm/gdbstub64.o
  CC  aarch64-softmmu/target/arm/machine.o
  CC  aarch64-softmmu/target/arm/arch_dump.o
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs
  CC  aarch64-softmmu/target/arm/monitor.o
  CC  aarch64-softmmu/target/arm/arm-powerctl.o
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=74ab42ef34594fdb988f724e8efaa96d', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-fky5lt0a/src/docker-src.2020-05-28-16.55.24.5056:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=74ab42ef34594fdb988f724e8efaa96d
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-fky5lt0a/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    3m8.799s
user    0m4.488s


The full log is available at
http://patchew.org/logs/20200528134035.32025-1-kra...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: USB pass-through problems

2020-05-28 Thread BALATON Zoltan


On Thu, 28 May 2020, Gerd Hoffmann wrote:

#2  0x7f23e8bfbb13 in libusb_handle_events_timeout_completed () at /lib64/libusb-1.0.so.0
#3  0x55e09854b7da in usb_host_abort_xfers (s=0x55e09b036dd0) at hw/usb/host-libusb.c:963



Hmm, does reverting 76d0a9362c6a6a7d88aa18c84c4186c9107ecaef change
behavior?


Yes it does. Reverting that patch fixes the problem, no hang and device
reconnects without problem.


Hmm.  Looks like a libusb bug to me, it seems to not call the
completion callback for the canceled transfers (which it should do
according to the docs), so qemu waits for this to happen forever.

We can certainly add a limit here (see below), question is how to
handle the canceled but not completed transfers then.  I suspect
we have to leak them to make sure we don't get use-after-free
access from libusb ...


Also works,

Tested-by: BALATON Zoltan 

Got only one "usb_host_abort_xfers: leaking usb request" message.

Regards,
BALATON Zoltan


cheers,
 Gerd

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index e28441379d99..4c3b5b140d9d 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -944,30 +944,45 @@ fail:
libusb_close(s->dh);
s->dh = NULL;
s->dev = NULL;
}
return -1;
}

static void usb_host_abort_xfers(USBHostDevice *s)
{
USBHostRequest *r, *rtmp;
+int limit = 100;

QTAILQ_FOREACH_SAFE(r, &s->requests, next, rtmp) {
usb_host_req_abort(r);
}

while (QTAILQ_FIRST(&s->requests) != NULL) {
struct timeval tv;
memset(&tv, 0, sizeof(tv));
tv.tv_usec = 2500;
libusb_handle_events_timeout(ctx, &tv);
+if (--limit == 0) {
+/*
+ * Don't wait forever for libusb calling the complete
+ * callback (which will unlink and free the request).
+ *
+ * Leaking memory here, to make sure libusb will not
+ * access memory which we have released already.
+ */
+QTAILQ_FOREACH_SAFE(r, &s->requests, next, rtmp) {
+fprintf(stderr, "%s: leaking usb request %p\n", __func__, r);
+QTAILQ_REMOVE(&s->requests, r, next);
+}
+return;
+}
}
}

static int usb_host_close(USBHostDevice *s)
{
USBDevice *udev = USB_DEVICE(s);

if (s->dh == NULL) {
return -1;
}
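
The bounded wait in the diff above can be shown without libusb or QEMU. This is only a sketch of the control flow: poll for completions, count the attempts, and abandon whatever is still outstanding once the budget runs out. The pending_list type and poll_once() are hypothetical stand-ins for the request queue and for libusb_handle_events_timeout(). They are not part of either code base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue of cancelled USB requests. */
struct pending_list {
    int outstanding;        /* requests still waiting for a completion */
    int completes_per_poll; /* how many complete on each poll (0 = stuck) */
};

/* Stand-in for one event-handling pass: completes a few requests. */
void poll_once(struct pending_list *l)
{
    if (l->completes_per_poll >= l->outstanding) {
        l->outstanding = 0;
    } else {
        l->outstanding -= l->completes_per_poll;
    }
}

/*
 * Wait for all requests to complete, but give up after `limit` polls.
 * Returns the number of requests that were abandoned ("leaked") so the
 * caller can log them. 0 means everything completed normally.
 */
int drain_with_limit(struct pending_list *l, int limit)
{
    assert(limit > 0);
    while (l->outstanding > 0) {
        poll_once(l);
        if (--limit == 0) {
            int leaked = l->outstanding;
            /* Unlink without freeing: the library may still touch them. */
            l->outstanding = 0;
            return leaked;
        }
    }
    return 0;
}
```

As in the patch, the abandoned requests are dropped from the list but never freed, so memory is traded for safety against a late callback touching released memory.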

[PATCH 13/13] i386: hvf: Drop HVFX86EmulatorState

2020-05-28 Thread Roman Bolshakov

Signed-off-by: Roman Bolshakov 
---
 include/qemu/typedefs.h | 1 -
 target/i386/cpu.h   | 1 -
 target/i386/hvf/hvf.c   | 1 -
 target/i386/hvf/x86.h   | 4 
 4 files changed, 7 deletions(-)

diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index ecf3cde26c..6ce0356f2c 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -51,7 +51,6 @@ typedef struct FWCfgIoState FWCfgIoState;
 typedef struct FWCfgMemState FWCfgMemState;
 typedef struct FWCfgState FWCfgState;
 typedef struct HostMemoryBackend HostMemoryBackend;
-typedef struct HVFX86EmulatorState HVFX86EmulatorState;
 typedef struct I2CBus I2CBus;
 typedef struct I2SCodec I2SCodec;
 typedef struct IOMMUMemoryRegion IOMMUMemoryRegion;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index be44e19154..abf9d10d86 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1594,7 +1594,6 @@ typedef struct CPUX86State {
 #if defined(CONFIG_HVF)
 hvf_lazy_flags hvf_lflags;
 void *hvf_mmio_buf;
-HVFX86EmulatorState *hvf_emul;
 #endif
 
 uint64_t mcg_cap;
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 57696c46c7..be016b951a 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -568,7 +568,6 @@ int hvf_init_vcpu(CPUState *cpu)
 
 hvf_state->hvf_caps = g_new0(struct hvf_vcpu_caps, 1);
 env->hvf_mmio_buf = g_new(char, 4096);
-env->hvf_emul = g_new0(HVFX86EmulatorState, 1);
 
 r = hv_vcpu_create((hv_vcpuid_t *)&cpu->hvf_fd, HV_VCPU_DEFAULT);
 cpu->vcpu_dirty = 1;
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 483fcea762..bacade7b65 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -228,10 +228,6 @@ typedef struct x68_segment_selector {
 };
 } __attribute__ ((__packed__)) x68_segment_selector;
 
-/* Definition of hvf_x86_state is here */
-struct HVFX86EmulatorState {
-};
-
 /* useful register access  macros */
 #define x86_reg(cpu, reg) ((x86_register *) &cpu->regs[reg])
 
-- 
2.26.1

[PATCH 08/13] i386: hvf: Drop rflags from HVFX86EmulatorState

2020-05-28 Thread Roman Bolshakov

HVFX86EmulatorState carries its own copy of x86 flags. It can be
dropped in favor of eflags in generic CPUX86State.

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/hvf.c   |  5 ++---
 target/i386/hvf/x86.c   |  2 +-
 target/i386/hvf/x86.h   | 42 -
 target/i386/hvf/x86_emu.c   |  6 +++---
 target/i386/hvf/x86_flags.c | 24 ++---
 target/i386/hvf/x86_task.c  |  6 +++---
 target/i386/hvf/x86hvf.c|  6 +++---
 7 files changed, 24 insertions(+), 67 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 416a6fae7c..4cee496d71 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -723,8 +723,7 @@ int hvf_vcpu_exec(CPUState *cpu)
 
 hvf_store_events(cpu, ins_len, idtvec_info);
 rip = rreg(cpu->hvf_fd, HV_X86_RIP);
-RFLAGS(env) = rreg(cpu->hvf_fd, HV_X86_RFLAGS);
-env->eflags = RFLAGS(env);
+env->eflags = rreg(cpu->hvf_fd, HV_X86_RFLAGS);
 
 qemu_mutex_lock_iothread();
 
@@ -736,7 +735,7 @@ int hvf_vcpu_exec(CPUState *cpu)
 case EXIT_REASON_HLT: {
 macvm_set_rip(cpu, rip + ins_len);
 if (!((cpu->interrupt_request & CPU_INTERRUPT_HARD) &&
-(EFLAGS(env) & IF_MASK))
+(env->eflags & IF_MASK))
 && !(cpu->interrupt_request & CPU_INTERRUPT_NMI) &&
 !(idtvec_info & VMCS_IDT_VEC_VALID)) {
 cpu->halted = 1;
diff --git a/target/i386/hvf/x86.c b/target/i386/hvf/x86.c
index 3afcedc7fc..7ebb5b45bd 100644
--- a/target/i386/hvf/x86.c
+++ b/target/i386/hvf/x86.c
@@ -131,7 +131,7 @@ bool x86_is_v8086(struct CPUState *cpu)
 {
 X86CPU *x86_cpu = X86_CPU(cpu);
 CPUX86State *env = &x86_cpu->env;
-return x86_is_protected(cpu) && (RFLAGS(env) & RFLAGS_VM);
+return x86_is_protected(cpu) && (env->eflags & RFLAGS_VM);
 }
 
 bool x86_is_long_mode(struct CPUState *cpu)
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 411e4b6599..e309b8f203 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -62,44 +62,6 @@ typedef enum x86_rflags {
 RFLAGS_ID   = (1L << 21),
 } x86_rflags;
 
-/* rflags register */
-typedef struct x86_reg_flags {
-union {
-struct {
-uint64_t rflags;
-};
-struct {
-uint32_t eflags;
-uint32_t hi32_unused1;
-};
-struct {
-uint32_t cf:1;
-uint32_t unused1:1;
-uint32_t pf:1;
-uint32_t unused2:1;
-uint32_t af:1;
-uint32_t unused3:1;
-uint32_t zf:1;
-uint32_t sf:1;
-uint32_t tf:1;
-uint32_t ief:1;
-uint32_t df:1;
-uint32_t of:1;
-uint32_t iopl:2;
-uint32_t nt:1;
-uint32_t unused4:1;
-uint32_t rf:1;
-uint32_t vm:1;
-uint32_t ac:1;
-uint32_t vif:1;
-uint32_t vip:1;
-uint32_t id:1;
-uint32_t unused5:10;
-uint32_t hi32_unused2;
-};
-};
-} __attribute__ ((__packed__)) x86_reg_flags;
-
 typedef enum x86_reg_cr0 {
 CR0_PE =(1L << 0),
 CR0_MP =(1L << 1),
@@ -294,15 +256,11 @@ typedef struct lazy_flags {
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
 struct x86_register regs[16];
-struct x86_reg_flags   rflags;
 struct lazy_flags   lflags;
 uint8_t mmio_buf[4096];
 };
 
 /* useful register access  macros */
-#define RFLAGS(cpu) (cpu->hvf_emul->rflags.rflags)
-#define EFLAGS(cpu) (cpu->hvf_emul->rflags.eflags)
-
 #define RRX(cpu, reg) (cpu->hvf_emul->regs[reg].rrx)
 #define RAX(cpu)        RRX(cpu, R_EAX)
 #define RCX(cpu)        RRX(cpu, R_ECX)
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 0efd9f9928..04fac64e72 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -459,7 +459,7 @@ static inline void string_increment_reg(struct CPUX86State *env, int reg,
 struct x86_decode *decode)
 {
 target_ulong val = read_reg(env, reg, decode->addressing_size);
-if (env->hvf_emul->rflags.df) {
+if (env->eflags & DF_MASK) {
 val -= decode->operand_size;
 } else {
 val += decode->operand_size;
@@ -1432,7 +1432,7 @@ void load_regs(struct CPUState *cpu)
 RRX(env, i) = rreg(cpu->hvf_fd, HV_X86_RAX + i);
 }
 
-RFLAGS(env) = rreg(cpu->hvf_fd, HV_X86_RFLAGS);
+env->eflags = rreg(cpu->hvf_fd, HV_X86_RFLAGS);
 rflags_to_lflags(env);
 env->eip = rreg(cpu->hvf_fd, HV_X86_RIP);
 }
@@ -1456,7 +1456,7 @@ void store_regs(struct CPUState *cpu)
 }
 
 lflags_to_rflags(env);
-wreg(cpu->hvf_fd, HV_X86_RFLAGS, RFLAGS(env));
+wreg(cpu->hvf_fd, HV_X86_RFLAGS, env->eflags);
 macvm_set_rip(cpu, env->eip);
 }
 
diff --git a/target/i386/hvf/x86_flags.c

[PATCH 09/13] i386: hvf: Drop copy of RFLAGS defines

2020-05-28 Thread Roman Bolshakov

Use the ones provided in target/i386/cpu.h instead.

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/x86.c|  2 +-
 target/i386/hvf/x86.h| 20 
 target/i386/hvf/x86_decode.c | 16 +++-
 target/i386/hvf/x86_task.c   |  2 +-
 4 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/target/i386/hvf/x86.c b/target/i386/hvf/x86.c
index 7ebb5b45bd..fdb11c8db9 100644
--- a/target/i386/hvf/x86.c
+++ b/target/i386/hvf/x86.c
@@ -131,7 +131,7 @@ bool x86_is_v8086(struct CPUState *cpu)
 {
 X86CPU *x86_cpu = X86_CPU(cpu);
 CPUX86State *env = &x86_cpu->env;
-return x86_is_protected(cpu) && (env->eflags & RFLAGS_VM);
+return x86_is_protected(cpu) && (env->eflags & VM_MASK);
 }
 
 bool x86_is_long_mode(struct CPUState *cpu)
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index e309b8f203..f0d03faff9 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -42,26 +42,6 @@ typedef struct x86_register {
 };
 } __attribute__ ((__packed__)) x86_register;
 
-typedef enum x86_rflags {
-RFLAGS_CF   = (1L << 0),
-RFLAGS_PF   = (1L << 2),
-RFLAGS_AF   = (1L << 4),
-RFLAGS_ZF   = (1L << 6),
-RFLAGS_SF   = (1L << 7),
-RFLAGS_TF   = (1L << 8),
-RFLAGS_IF   = (1L << 9),
-RFLAGS_DF   = (1L << 10),
-RFLAGS_OF   = (1L << 11),
-RFLAGS_IOPL = (3L << 12),
-RFLAGS_NT   = (1L << 14),
-RFLAGS_RF   = (1L << 16),
-RFLAGS_VM   = (1L << 17),
-RFLAGS_AC   = (1L << 18),
-RFLAGS_VIF  = (1L << 19),
-RFLAGS_VIP  = (1L << 20),
-RFLAGS_ID   = (1L << 21),
-} x86_rflags;
-
 typedef enum x86_reg_cr0 {
 CR0_PE =(1L << 0),
 CR0_MP =(1L << 1),
diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index d881542181..34c5e3006c 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -697,15 +697,13 @@ static void decode_db_4(CPUX86State *env, struct x86_decode *decode)
 
 
 #define RFLAGS_MASK_NONE    0
-#define RFLAGS_MASK_OSZAPC  (RFLAGS_OF | RFLAGS_SF | RFLAGS_ZF | RFLAGS_AF | \
- RFLAGS_PF | RFLAGS_CF)
-#define RFLAGS_MASK_LAHF    (RFLAGS_SF | RFLAGS_ZF | RFLAGS_AF | RFLAGS_PF | \
- RFLAGS_CF)
-#define RFLAGS_MASK_CF  (RFLAGS_CF)
-#define RFLAGS_MASK_IF  (RFLAGS_IF)
-#define RFLAGS_MASK_TF  (RFLAGS_TF)
-#define RFLAGS_MASK_DF  (RFLAGS_DF)
-#define RFLAGS_MASK_ZF  (RFLAGS_ZF)
+#define RFLAGS_MASK_OSZAPC  (CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C)
+#define RFLAGS_MASK_LAHF    (CC_S | CC_Z | CC_A | CC_P | CC_C)
+#define RFLAGS_MASK_CF  (CC_C)
+#define RFLAGS_MASK_IF  (IF_MASK)
+#define RFLAGS_MASK_TF  (TF_MASK)
+#define RFLAGS_MASK_DF  (DF_MASK)
+#define RFLAGS_MASK_ZF  (CC_Z)
 
 struct decode_tbl _1op_inst[] = {
 {0x0, X86_DECODE_CMD_ADD, 1, true, decode_modrm_rm, decode_modrm_reg, NULL,
diff --git a/target/i386/hvf/x86_task.c b/target/i386/hvf/x86_task.c
index 6ea8508946..6f04478b3a 100644
--- a/target/i386/hvf/x86_task.c
+++ b/target/i386/hvf/x86_task.c
@@ -158,7 +158,7 @@ void vmx_handle_task_switch(CPUState *cpu, x68_segment_selector tss_sel, int rea
 }
 
 if (reason == TSR_IRET)
-env->eflags &= ~RFLAGS_NT;
+env->eflags &= ~NT_MASK;
 
 if (reason != TSR_CALL && reason != TSR_IDT_GATE)
 old_tss_sel.sel = 0x;
-- 
2.26.1
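
To illustrate what patches 08 and 09 amount to, here is a small standalone sketch of flag testing with plain masks on a single eflags word instead of a private bitfield copy. The mask names follow the ones used in the diff, but they are defined locally here. Their values are taken from the bit positions in the removed x86_rflags enum, on the assumption that the definitions in target/i386/cpu.h use the same bits. The helper string_step() is hypothetical and only mirrors the shape of string_increment_reg().

```c
#include <assert.h>
#include <stdint.h>

/* Local definitions for the sketch; bit positions as in the removed enum. */
#define CC_C     (1u << 0)   /* carry */
#define CC_P     (1u << 2)   /* parity */
#define CC_A     (1u << 4)   /* auxiliary carry */
#define CC_Z     (1u << 6)   /* zero */
#define CC_S     (1u << 7)   /* sign */
#define TF_MASK  (1u << 8)   /* trap */
#define IF_MASK  (1u << 9)   /* interrupt enable */
#define DF_MASK  (1u << 10)  /* direction */
#define CC_O     (1u << 11)  /* overflow */

#define RFLAGS_MASK_OSZAPC  (CC_O | CC_S | CC_Z | CC_A | CC_P | CC_C)
#define RFLAGS_MASK_LAHF    (CC_S | CC_Z | CC_A | CC_P | CC_C)

/*
 * Advance a string-instruction index register: the direction flag selects
 * decrement, otherwise increment, by the operand size in bytes.
 */
uint64_t string_step(uint64_t reg, uint64_t eflags, unsigned operand_size)
{
    assert(operand_size == 1 || operand_size == 2 ||
           operand_size == 4 || operand_size == 8);
    if (eflags & DF_MASK) {
        return reg - operand_size;
    }
    return reg + operand_size;
}

/* Keep only the six arithmetic status flags of an eflags value. */
uint64_t arithmetic_flags(uint64_t eflags)
{
    return eflags & RFLAGS_MASK_OSZAPC;
}
```

Keeping one eflags word and testing it with masks avoids having to synchronise a second, bitfield-based copy of the flags. That is the motivation given in the commit message of patch 08.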

[PATCH 12/13] i386: hvf: Move mmio_buf into CPUX86State

2020-05-28 Thread Roman Bolshakov

There's no similar field in CPUX86State, but it's needed for MMIO traps.

Signed-off-by: Roman Bolshakov 
---
 target/i386/cpu.h |  1 +
 target/i386/hvf/hvf.c |  5 +
 target/i386/hvf/x86.h |  1 -
 target/i386/hvf/x86_emu.c | 12 ++--
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7e6566565a..be44e19154 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1593,6 +1593,7 @@ typedef struct CPUX86State {
 #endif
 #if defined(CONFIG_HVF)
 hvf_lazy_flags hvf_lflags;
+void *hvf_mmio_buf;
 HVFX86EmulatorState *hvf_emul;
 #endif
 
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 4cee496d71..57696c46c7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -533,7 +533,11 @@ void hvf_reset_vcpu(CPUState *cpu) {
 
 void hvf_vcpu_destroy(CPUState *cpu)
 {
+X86CPU *x86_cpu = X86_CPU(cpu);
+CPUX86State *env = &x86_cpu->env;
+
 hv_return_t ret = hv_vcpu_destroy((hv_vcpuid_t)cpu->hvf_fd);
+g_free(env->hvf_mmio_buf);
 assert_hvf_ok(ret);
 }
 
@@ -563,6 +567,7 @@ int hvf_init_vcpu(CPUState *cpu)
 init_decoder();
 
 hvf_state->hvf_caps = g_new0(struct hvf_vcpu_caps, 1);
+env->hvf_mmio_buf = g_new(char, 4096);
 env->hvf_emul = g_new0(HVFX86EmulatorState, 1);
 
 r = hv_vcpu_create((hv_vcpuid_t *)&cpu->hvf_fd, HV_VCPU_DEFAULT);
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 2363616c07..483fcea762 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -230,7 +230,6 @@ typedef struct x68_segment_selector {
 
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-uint8_t mmio_buf[4096];
 };
 
 /* useful register access  macros */
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 1ad2c30e16..d3e289ed87 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -187,8 +187,8 @@ void write_val_ext(struct CPUX86State *env, target_ulong ptr, target_ulong val,
 
 uint8_t *read_mmio(struct CPUX86State *env, target_ulong ptr, int bytes)
 {
-vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, ptr, bytes);
-return env->hvf_emul->mmio_buf;
+vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, ptr, bytes);
+return env->hvf_mmio_buf;
 }
 
 
@@ -489,9 +489,9 @@ static void exec_ins_single(struct CPUX86State *env, struct x86_decode *decode)
 target_ulong addr = linear_addr_size(env_cpu(env), RDI(env),
  decode->addressing_size, R_ES);
 
-hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 0,
+hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 0,
   decode->operand_size, 1);
-vmx_write_mem(env_cpu(env), addr, env->hvf_emul->mmio_buf,
+vmx_write_mem(env_cpu(env), addr, env->hvf_mmio_buf,
   decode->operand_size);
 
 string_increment_reg(env, R_EDI, decode);
@@ -512,9 +512,9 @@ static void exec_outs_single(struct CPUX86State *env, struct x86_decode *decode)
 {
 target_ulong addr = decode_linear_addr(env, decode, RSI(env), R_DS);
 
-vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, addr,
+vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, addr,
  decode->operand_size);
-hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 1,
+hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 1,
   decode->operand_size, 1);
 
 string_increment_reg(env, R_ESI, decode);
-- 
2.26.1

[PATCH 11/13] i386: hvf: Move lazy_flags into CPUX86State

2020-05-28 Thread Roman Bolshakov

The lazy flags are still needed for instruction decoder.

Signed-off-by: Roman Bolshakov 
---
 include/sysemu/hvf.h|  7 +
 target/i386/cpu.h   |  2 ++
 target/i386/hvf/x86.h   |  6 
 target/i386/hvf/x86_flags.c | 57 ++---
 4 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h
index cf579e1592..41f5470c96 100644
--- a/include/sysemu/hvf.h
+++ b/include/sysemu/hvf.h
@@ -15,9 +15,16 @@
 
 extern bool hvf_allowed;
 #ifdef CONFIG_HVF
+#include "exec/cpu-defs.h"
+
 uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
  int reg);
 #define hvf_enabled() (hvf_allowed)
+
+typedef struct hvf_lazy_flags {
+target_ulong result;
+target_ulong auxbits;
+} hvf_lazy_flags;
 #else
 #define hvf_enabled() 0
 #define hvf_get_supported_cpuid(func, idx, reg) 0
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 408392dbf6..7e6566565a 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -20,6 +20,7 @@
 #ifndef I386_CPU_H
 #define I386_CPU_H
 
+#include "sysemu/hvf.h"
 #include "sysemu/tcg.h"
 #include "cpu-qom.h"
 #include "hyperv-proto.h"
@@ -1591,6 +1592,7 @@ typedef struct CPUX86State {
 struct kvm_nested_state *nested_state;
 #endif
 #if defined(CONFIG_HVF)
+hvf_lazy_flags hvf_lflags;
 HVFX86EmulatorState *hvf_emul;
 #endif
 
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 6048b5cc74..2363616c07 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -228,14 +228,8 @@ typedef struct x68_segment_selector {
 };
 } __attribute__ ((__packed__)) x68_segment_selector;
 
-typedef struct lazy_flags {
-target_ulong result;
-target_ulong auxbits;
-} lazy_flags;
-
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-struct lazy_flags   lflags;
 uint8_t mmio_buf[4096];
 };
 
diff --git a/target/i386/hvf/x86_flags.c b/target/i386/hvf/x86_flags.c
index 1152cd7234..5ca4f41f5c 100644
--- a/target/i386/hvf/x86_flags.c
+++ b/target/i386/hvf/x86_flags.c
@@ -63,7 +63,7 @@
 #define SET_FLAGS_OSZAPC_SIZE(size, lf_carries, lf_result) { \
 target_ulong temp = ((lf_carries) & (LF_MASK_AF)) | \
 (((lf_carries) >> (size - 2)) << LF_BIT_PO); \
-env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
 if ((size) == 32) { \
 temp = ((lf_carries) & ~(LF_MASK_PDB | LF_MASK_SD)); \
 } else if ((size) == 16) { \
@@ -73,7 +73,7 @@
 } else { \
 VM_PANIC("unimplemented");  \
 } \
-env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)temp; \
+env->hvf_lflags.auxbits = (target_ulong)(uint32_t)temp; \
 }
 
 /* carries, result */
@@ -100,10 +100,10 @@
 } else { \
 VM_PANIC("unimplemented");  \
 } \
-env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
-target_ulong delta_c = (env->hvf_emul->lflags.auxbits ^ temp) & LF_MASK_CF; \
+env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+target_ulong delta_c = (env->hvf_lflags.auxbits ^ temp) & LF_MASK_CF; \
 delta_c ^= (delta_c >> 1); \
-env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
+env->hvf_lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
 }
 
 /* carries, result */
@@ -117,9 +117,8 @@
 void SET_FLAGS_OC(CPUX86State *env, uint32_t new_of, uint32_t new_cf)
 {
 uint32_t temp_po = new_of ^ new_cf;
-env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
-env->hvf_emul->lflags.auxbits |= (temp_po << LF_BIT_PO) |
- (new_cf << LF_BIT_CF);
+env->hvf_lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
+env->hvf_lflags.auxbits |= (temp_po << LF_BIT_PO) | (new_cf << LF_BIT_CF);
 }
 
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
@@ -215,27 +214,27 @@ void SET_FLAGS_OSZAPC_LOGIC8(CPUX86State *env, uint8_t v1, uint8_t v2,
 
 bool get_PF(CPUX86State *env)
 {
-uint32_t temp = (255 & env->hvf_emul->lflags.result);
-temp = temp ^ (255 & (env->hvf_emul->lflags.auxbits >> LF_BIT_PDB));
+uint32_t temp = (255 & env->hvf_lflags.result);
+temp = temp ^ (255 & (env->hvf_lflags.auxbits >> LF_BIT_PDB));
 temp = (temp ^ (temp >> 4)) & 0x0F;
 return (0x9669U >> temp) & 1;
 }
 
 void set_PF(CPUX86State *env, bool val)
 {
-uint32_t temp = (255 & env->hvf_emul->lflags.result) ^ (!val);
-env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PDB);
-env->hvf_emul->lflags.auxbits |= (temp << LF_BIT_PDB);
+uint32_t temp = (255 & env->hvf_lflags.result) ^ (!val);
+env->hvf_lflags.auxbits &= ~(LF_MASK_PDB);
+env->hvf_lflags.auxbits |= (temp << LF_BIT_PDB);
 }
 
 bool get_OF(CPUX86State *env)
 {
-return ((env->hvf_emul->lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
+return

[PATCH 10/13] i386: hvf: Drop regs in HVFX86EmulatorState

2020-05-28 Thread Roman Bolshakov

HVFX86EmulatorState carries its own copy of x86 registers. It can be
dropped in favor of regs in generic CPUX86State.

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/x86.h | 13 +++--
 target/i386/hvf/x86_emu.c | 18 +-
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index f0d03faff9..6048b5cc74 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -235,13 +235,14 @@ typedef struct lazy_flags {
 
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-struct x86_register regs[16];
 struct lazy_flags   lflags;
 uint8_t mmio_buf[4096];
 };
 
 /* useful register access  macros */
-#define RRX(cpu, reg) (cpu->hvf_emul->regs[reg].rrx)
+#define x86_reg(cpu, reg) ((x86_register *) &cpu->regs[reg])
+
+#define RRX(cpu, reg)   (x86_reg(cpu, reg)->rrx)
 #define RAX(cpu)RRX(cpu, R_EAX)
 #define RCX(cpu)RRX(cpu, R_ECX)
 #define RDX(cpu)RRX(cpu, R_EDX)
@@ -259,7 +260,7 @@ struct HVFX86EmulatorState {
 #define R14(cpu)RRX(cpu, R_R14)
 #define R15(cpu)RRX(cpu, R_R15)
 
-#define ERX(cpu, reg)   (cpu->hvf_emul->regs[reg].erx)
+#define ERX(cpu, reg)   (x86_reg(cpu, reg)->erx)
 #define EAX(cpu)ERX(cpu, R_EAX)
 #define ECX(cpu)ERX(cpu, R_ECX)
 #define EDX(cpu)ERX(cpu, R_EDX)
@@ -269,7 +270,7 @@ struct HVFX86EmulatorState {
 #define ESI(cpu)ERX(cpu, R_ESI)
 #define EDI(cpu)ERX(cpu, R_EDI)
 
-#define RX(cpu, reg)   (cpu->hvf_emul->regs[reg].rx)
+#define RX(cpu, reg)   (x86_reg(cpu, reg)->rx)
 #define AX(cpu)RX(cpu, R_EAX)
 #define CX(cpu)RX(cpu, R_ECX)
 #define DX(cpu)RX(cpu, R_EDX)
@@ -279,13 +280,13 @@ struct HVFX86EmulatorState {
 #define SI(cpu)RX(cpu, R_ESI)
 #define DI(cpu)RX(cpu, R_EDI)
 
-#define RL(cpu, reg)   (cpu->hvf_emul->regs[reg].lx)
+#define RL(cpu, reg)   (x86_reg(cpu, reg)->lx)
 #define AL(cpu)RL(cpu, R_EAX)
 #define CL(cpu)RL(cpu, R_ECX)
 #define DL(cpu)RL(cpu, R_EDX)
 #define BL(cpu)RL(cpu, R_EBX)
 
-#define RH(cpu, reg)   (cpu->hvf_emul->regs[reg].hx)
+#define RH(cpu, reg)   (x86_reg(cpu, reg)->hx)
 #define AH(cpu)RH(cpu, R_EAX)
 #define CH(cpu)RH(cpu, R_ECX)
 #define DH(cpu)RH(cpu, R_EDX)
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 04fac64e72..1ad2c30e16 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -95,13 +95,13 @@ target_ulong read_reg(CPUX86State *env, int reg, int size)
 {
 switch (size) {
 case 1:
-return env->hvf_emul->regs[reg].lx;
+return x86_reg(env, reg)->lx;
 case 2:
-return env->hvf_emul->regs[reg].rx;
+return x86_reg(env, reg)->rx;
 case 4:
-return env->hvf_emul->regs[reg].erx;
+return x86_reg(env, reg)->erx;
 case 8:
-return env->hvf_emul->regs[reg].rrx;
+return x86_reg(env, reg)->rrx;
 default:
 abort();
 }
@@ -112,16 +112,16 @@ void write_reg(CPUX86State *env, int reg, target_ulong val, int size)
 {
 switch (size) {
 case 1:
-env->hvf_emul->regs[reg].lx = val;
+x86_reg(env, reg)->lx = val;
 break;
 case 2:
-env->hvf_emul->regs[reg].rx = val;
+x86_reg(env, reg)->rx = val;
 break;
 case 4:
-env->hvf_emul->regs[reg].rrx = (uint32_t)val;
+x86_reg(env, reg)->rrx = (uint32_t)val;
 break;
 case 8:
-env->hvf_emul->regs[reg].rrx = val;
+x86_reg(env, reg)->rrx = val;
 break;
 default:
 abort();
@@ -173,7 +173,7 @@ void write_val_to_reg(target_ulong reg_ptr, target_ulong val, int size)
 
 static bool is_host_reg(struct CPUX86State *env, target_ulong ptr)
 {
-return (ptr - (target_ulong)&env->hvf_emul->regs[0]) < sizeof(env->hvf_emul->regs);
+return (ptr - (target_ulong)&env->regs[0]) < sizeof(env->regs);
 }
 
 void write_val_ext(struct CPUX86State *env, target_ulong ptr, target_ulong val, int size)
-- 
2.26.1
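
The register macros in the patch above reinterpret each slot of the generic
regs array through the x86_register view. A minimal sketch of that aliasing
idea follows, assuming a little-endian host and C11 anonymous structs.
reg_view, cpu_state, REG_VIEW and write_reg32 are hypothetical names used for
illustration only; they are not the QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for x86_register: overlapping views of one
 * 64-bit slot. The field offsets assume a little-endian host. */
typedef union reg_view {
    uint64_t rrx;               /* full 64-bit register */
    uint32_t erx;               /* low 32 bits */
    uint16_t rx;                /* low 16 bits */
    struct {
        uint8_t lx;             /* bits 0..7 */
        uint8_t hx;             /* bits 8..15 */
    };
} reg_view;

/* Hypothetical stand-in for the generic CPU state. */
struct cpu_state {
    uint64_t regs[16];
};

/* View one slot of the plain array through the union. */
#define REG_VIEW(cpu, reg) ((reg_view *)&(cpu)->regs[reg])

/* 32-bit write that clears the upper half, like case 4 of write_reg(). */
static inline void write_reg32(struct cpu_state *cpu, int reg, uint64_t val)
{
    assert(reg >= 0 && reg < 16);
    REG_VIEW(cpu, reg)->rrx = (uint32_t)val;
}
```

With regs[0] set to 0x1122334455667788, REG_VIEW(&s, 0)->rx reads 0x7788 and
REG_VIEW(&s, 0)->hx reads 0x77 on such a host, so no second copy of the
registers is needed to get the sub-register views.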

[PATCH 05/13] i386: hvf: Use ins_len to advance IP

2020-05-28 Thread Roman Bolshakov

There's no need to read VMCS twice; the instruction length is already
available in ins_len.

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/hvf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 9ccdb7e7c7..8ff1d25521 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -871,7 +871,7 @@ int hvf_vcpu_exec(CPUState *cpu)
 } else {
 simulate_wrmsr(cpu);
 }
-RIP(env) += rvmcs(cpu->hvf_fd, VMCS_EXIT_INSTRUCTION_LENGTH);
+RIP(env) += ins_len;
 store_regs(cpu);
 break;
 }
-- 
2.26.1

[PATCH 04/13] i386: hvf: Drop unused variable

2020-05-28 Thread Roman Bolshakov

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/x86.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index c95d5b2116..56fcde13c6 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -293,7 +293,6 @@ typedef struct lazy_flags {
 
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-int interruptable;
 uint64_t fetch_rip;
 uint64_t rip;
 struct x86_register regs[16];
-- 
2.26.1

[PATCH 07/13] i386: hvf: Drop fetch_rip from HVFX86EmulatorState

2020-05-28 Thread Roman Bolshakov

The field is used to print the address of instructions that have no parser
in decode_invalid(). RIP from VMCS is saved into fetch_rip before
decoding starts, but it's also saved into env->eip in load_regs().
Therefore env->eip can be used instead of fetch_rip.

While at it, correct the address printed in decode_invalid(). It used to
print an address before the unknown instruction.

Signed-off-by: Roman Bolshakov 
---
 target/i386/hvf/hvf.c| 6 --
 target/i386/hvf/x86.h| 1 -
 target/i386/hvf/x86_decode.c | 3 +--
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 45ae55dd27..416a6fae7c 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -767,8 +767,6 @@ int hvf_vcpu_exec(CPUState *cpu)
 struct x86_decode decode;
 
 load_regs(cpu);
-env->hvf_emul->fetch_rip = rip;
-
 decode_instruction(env, &decode);
 exec_instruction(env, &decode);
 store_regs(cpu);
@@ -809,8 +807,6 @@ int hvf_vcpu_exec(CPUState *cpu)
 struct x86_decode decode;
 
 load_regs(cpu);
-env->hvf_emul->fetch_rip = rip;
-
 decode_instruction(env, &decode);
 assert(ins_len == decode.len);
 exec_instruction(env, &decode);
@@ -915,8 +911,6 @@ int hvf_vcpu_exec(CPUState *cpu)
 struct x86_decode decode;
 
 load_regs(cpu);
-env->hvf_emul->fetch_rip = rip;
-
 decode_instruction(env, &decode);
 exec_instruction(env, &decode);
 store_regs(cpu);
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index e3ab7c5137..411e4b6599 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -293,7 +293,6 @@ typedef struct lazy_flags {
 
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-uint64_t fetch_rip;
 struct x86_register regs[16];
 struct x86_reg_flags   rflags;
 struct lazy_flags   lflags;
diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index a590088f54..d881542181 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -29,8 +29,7 @@
 
 static void decode_invalid(CPUX86State *env, struct x86_decode *decode)
 {
-printf("%llx: failed to decode instruction ", env->hvf_emul->fetch_rip -
-   decode->len);
+printf("%llx: failed to decode instruction ", env->eip);
 for (int i = 0; i < decode->opcode_len; i++) {
 printf("%x ", decode->opcode[i]);
 }
-- 
2.26.1
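
The address fix in this patch can be checked with simple arithmetic. The
sketch below assumes, as the commit message describes, that fetch_rip and
env->eip both hold the address where decoding started. The helper names are
hypothetical and exist only to compare the two expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Address printed before the patch: start of the instruction minus the
 * number of bytes decoded so far. */
static uint64_t reported_addr_old(uint64_t insn_start, uint32_t decoded_len)
{
    assert(insn_start >= decoded_len);
    return insn_start - decoded_len;
}

/* Address printed after the patch: the start of the instruction itself. */
static uint64_t reported_addr_new(uint64_t insn_start)
{
    return insn_start;
}
```

For an instruction starting at 0x1000 with 3 bytes decoded, the old
expression yields 0xffd while the new one yields 0x1000.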

[PATCH 00/13] i386: hvf: Remove HVFX86EmulatorState

2020-05-28 Thread Roman Bolshakov

Hi,

This is a cleanup series for HVF accel.

HVF is using two emulator states, CPUX86State and HVFX86EmulatorState,
simultaneously. HVFX86EmulatorState is used for instruction emulation.
CPUX86State is used in all other places. Sometimes the states are in
sync, sometimes they're not. This complicates reasoning about emulator
behaviour, given that there's a third state: VMCS.

The series tries to leverage CPUX86State for instruction decoding and
removes HVFX86EmulatorState. I had to add two new hvf-specific fields to
CPUX86State: lazy_flags and mmio_buf. It's likely that cc_op, cc_dst,
etc could be reused for lazy_flags but it'd require major rework of flag
processing during instruction emulation. Hopefully that'll happen too in
the future.
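
As a rough illustration of what lazy flags mean here: instead of computing
every flag after each emulated instruction, the emulator keeps the result
and derives a flag on demand. The sketch below shows only the parity lookup
that get_PF() in x86_flags.c uses (the 0x9669 nibble table). The struct and
function names are hypothetical, and the auxbits correction that the real
code applies is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified lazy state: only the last result is kept. */
struct lazy_result {
    uint64_t result;
};

/* PF is set when the low byte of the result has an even number of 1 bits.
 * Folding the byte down to a nibble and indexing the 16-bit constant
 * 0x9669 gives that answer without a loop. */
static bool lazy_get_pf(const struct lazy_result *lf)
{
    assert(lf != NULL);
    uint32_t temp = 255 & lf->result;
    temp = (temp ^ (temp >> 4)) & 0x0F;
    return (0x9669U >> temp) & 1;
}
```

A result of 0x03 has two set bits in its low byte, so lazy_get_pf() returns
true; a result of 0x01 returns false.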

I tried to include sysemu/hvf.h into target/i386/cpu.h to add the definition
of hvf lazy flags but couldn't do that at first because it introduced a
circular dependency between the existing sysemu/hvf.h and cpu.h. The first
three patches untangle and prune sysemu/hvf.h to the bare minimum to
allow inclusion of sysemu/hvf.h into target/i386/cpu.h.

This might conflict with [1], but merge/rebase should be trivial.

1. https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg07449.html

Thanks,
Roman

Roman Bolshakov (13):
  i386: hvf: Move HVFState definition into hvf
  i386: hvf: Drop useless declarations in sysemu
  i386: hvf: Clean stray includes in sysemu
  i386: hvf: Drop unused variable
  i386: hvf: Use ins_len to advance IP
  i386: hvf: Use IP from CPUX86State
  i386: hvf: Drop fetch_rip from HVFX86EmulatorState
  i386: hvf: Drop rflags from HVFX86EmulatorState
  i386: hvf: Drop copy of RFLAGS defines
  i386: hvf: Drop regs in HVFX86EmulatorState
  i386: hvf: Move lazy_flags into CPUX86State
  i386: hvf: Move mmio_buf into CPUX86State
  i386: hvf: Drop HVFX86EmulatorState

 include/qemu/typedefs.h  |   1 -
 include/sysemu/hvf.h |  73 ++---
 target/i386/cpu.h|   4 +-
 target/i386/hvf/hvf-i386.h   |  35 ++
 target/i386/hvf/hvf.c|  30 -
 target/i386/hvf/x86.c|   2 +-
 target/i386/hvf/x86.h|  89 ++---
 target/i386/hvf/x86_decode.c |  25 ---
 target/i386/hvf/x86_emu.c| 122 +--
 target/i386/hvf/x86_flags.c  |  81 ---
 target/i386/hvf/x86_task.c   |  10 +--
 target/i386/hvf/x86hvf.c |   6 +-
 12 files changed, 186 insertions(+), 292 deletions(-)

-- 
2.26.1
