Re: [PATCH 09/24] macio: Fix to realize "mos6522-cuda" and "mos6522-pmu" devices

2020-05-27 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 27/05/20 17:05, Peter Maydell wrote:
>> I disagree with these. We're in a realize function, the API
>> says "on errors, report them via the Error* you got passed",
>> so we should do that, not blow up. &error_abort only makes
>> sense if (a) we have no better way to report errors than
>> to abort (which isn't the case here) or (b) if we can guarantee
>> that in fact the thing we're doing won't ever fail
>> (which we can't here without knowing more about the internal
>> implementation details of the MOS6522 device than we
>> really ought to).
>
> Note however that before replacing &error_abort with error propagation
> you need to make sure that you are "un-realizing" yourself properly.  So
> it may be better to have inferior (but clearly visible) error
> propagation behavior, than untested (and perhaps untestable) buggy code
> that looks great on the surface.

This is exactly why I have to stop at &error_abort in this series.  It's
24 patches of fixes to enable 50+ patches of refactoring, with more in
the pipeline.  If I stray even deeper into the weeds, my pipeline is
going to explode %-}




Re: [PATCH 7/7] linux-user: limit check to HOST_LONG_BITS < TARGET_ABI_BITS

2020-05-27 Thread Thomas Huth
On 27/05/2020 18.36, Alex Bennée wrote:
> 
> Thomas Huth  writes:
> 
>> On 27/05/2020 16.44, Laurent Vivier wrote:
>>> Le 25/05/2020 à 15:18, Thomas Huth a écrit :
 From: Alex Bennée 

 Newer clangs rightly spot that you can never exceed the full address
 space of a 64-bit host with:

   linux-user/elfload.c:2076:41: error: result of comparison 'unsigned
   long' > 18446744073709551615 is always false
   [-Werror,-Wtautological-type-limit-compare]
       if ((guest_hiaddr - guest_base) > ~(uintptr_t)0) {
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~
   1 error generated.

 So let's limit the check to 32-bit hosts only.

 Fixes: ee94743034bf
 Reported-by: Thomas Huth 
 Signed-off-by: Alex Bennée 
 [thuth: Use HOST_LONG_BITS < TARGET_ABI_BITS instead of HOST_LONG_BITS == 
 32]
 Signed-off-by: Thomas Huth 
 ---
  linux-user/elfload.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/linux-user/elfload.c b/linux-user/elfload.c
 index 01a9323a63..ebc663ea0b 100644
 --- a/linux-user/elfload.c
 +++ b/linux-user/elfload.c
 @@ -2073,12 +2073,14 @@ static void pgb_have_guest_base(const char 
 *image_name, abi_ulong guest_loaddr,
  exit(EXIT_FAILURE);
  }
  } else {
 +#if HOST_LONG_BITS < TARGET_ABI_BITS
  if ((guest_hiaddr - guest_base) > ~(uintptr_t)0) {
  error_report("%s: requires more virtual address space "
   "than the host can provide (0x%" PRIx64 ")",
   image_name, (uint64_t)guest_hiaddr - guest_base);
  exit(EXIT_FAILURE);
  }
 +#endif
  }
  
  /*

>>>
>>> Philippe sent the same patch:
>>>
>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg699796.html
>>
>> Indeed, but looking more closely, he's using slightly different
>> locations for the #if and #endif ... not sure what's better though...?
> 
> Richard was more inclined to suppress the warning:
> 
>   Subject: Re: [PATCH v2] linux-user: limit check to HOST_LONG_BITS == 32
>   From: Richard Henderson 
>   Message-ID: <3069bc1b-115d-f361-8271-c775bf695...@linaro.org>
>   Date: Thu, 21 May 2020 20:15:51 -0700
> 
> One reason I dropped the f32 patch from my last PR was because this
> wasn't the only warning the latest clang picks up.

... but this is currently the only spot that is required to get the
gitlab CI going again, so I think we should include this patch until we
have a final decision on whether to disable the warning (and we can
still revert this patch once the warning is disabled). Ok?

 Thomas




Re: [PATCH v2 0/3] account for NVDIMM nodes during SRAT generation

2020-05-27 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200528054807.21278-1-vishal.l.ve...@intel.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200528054807.21278-1-vishal.l.ve...@intel.com
Subject: [PATCH v2 0/3] account for NVDIMM nodes during SRAT generation
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
92f6e3c tests/acpi: update expected SRAT files
325a20a hw/acpi-build: account for NVDIMM numa nodes in SRAT
ff8f589 diffs-allowed: add the SRAT AML to diffs-allowed

=== OUTPUT BEGIN ===
1/3 Checking commit ff8f5897d948 (diffs-allowed: add the SRAT AML to 
diffs-allowed)
2/3 Checking commit 325a20aae003 (hw/acpi-build: account for NVDIMM numa nodes 
in SRAT)
3/3 Checking commit 92f6e3cdac4c (tests/acpi: update expected SRAT files)
ERROR: Do not add expected files together with tests, follow instructions in 
tests/qtest/bios-tables-test.c: both tests/data/acpi/q35/SRAT.dimmpxm and 
tests/qtest/bios-tables-test-allowed-diff.h found

ERROR: Do not add expected files together with tests, follow instructions in 
tests/qtest/bios-tables-test.c: both tests/data/acpi/q35/SRAT.dimmpxm and 
tests/qtest/bios-tables-test-allowed-diff.h found

total: 2 errors, 0 warnings, 1 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200528054807.21278-1-vishal.l.ve...@intel.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: USB pass-through problems

2020-05-27 Thread Gerd Hoffmann
On Wed, May 27, 2020 at 09:44:54PM +0200, BALATON Zoltan wrote:
> Hello,
> 
> I've seen a case where QEMU hangs with a passed-through USB device. This is
> with -device usb-ehci and pass-through with usb-host. This works until the
> attached USB device reboots (so it likely disconnects and reconnects), at
> which point QEMU hangs and needs to be SIGKILL-ed to end (which is a bit
> hard to do with the mouse and keyboard grabbed). I've got this stack trace:
> 
> #0  0x7f23e7bd4949 in poll () at /lib64/libc.so.6
> #1  0x7f23e8bfa9a5 in  () at /lib64/libusb-1.0.so.0
> #2  0x7f23e8bfbb13 in libusb_handle_events_timeout_completed () at 
> /lib64/libusb-1.0.so.0
> #3  0x55e09854b7da in usb_host_abort_xfers (s=0x55e09b036dd0) at 
> hw/usb/host-libusb.c:963
> #4  0x55e09854b87a in usb_host_close (s=0x55e09b036dd0) at 
> hw/usb/host-libusb.c:977
> #5  0x55e09854b92e in usb_host_nodev_bh (opaque=0x55e09b036dd0) at 
> hw/usb/host-libusb.c:998
> #6  0x55e098757500 in aio_bh_call (bh=0x55e099ad9cc0) at util/async.c:136
> #7  0x55e09875760a in aio_bh_poll (ctx=0x55e0996c2620) at util/async.c:164
> #8  0x55e09875cb2a in aio_dispatch (ctx=0x55e0996c2620) at 
> util/aio-posix.c:380
> #9  0x55e098757a3d in aio_ctx_dispatch (source=0x55e0996c2620, 
> callback=0x0, user_data=0x0) at util/async.c:306
> #10 0x7f23e8c59665 in g_main_context_dispatch () at 
> /lib64/libglib-2.0.so.0
> #11 0x55e09875b0a9 in glib_pollfds_poll () at util/main-loop.c:219
> #12 0x55e09875b123 in os_host_main_loop_wait (timeout=0) at 
> util/main-loop.c:242
> #13 0x55e09875b228 in main_loop_wait (nonblocking=0) at 
> util/main-loop.c:518
> #14 0x55e0982c91f8 in qemu_main_loop () at softmmu/vl.c:1664
> #15 0x55e098162e7e in main (argc=, argv=, 
> envp=) at softmmu/main.c:49
> 
> so the problem may be in libusb but QEMU should not hang completely. The
> host is Linux with libusb 1.0.22.

Hmm, does reverting 76d0a9362c6a6a7d88aa18c84c4186c9107ecaef change
behavior?

cheers,
  Gerd




Re: [PATCH v6 1/3] memory: drop guest writes to read-only ram device regions

2020-05-27 Thread Yan Zhao
On Thu, May 28, 2020 at 07:10:46AM +0200, Paolo Bonzini wrote:
> On 28/05/20 06:35, Yan Zhao wrote:
> > On Tue, May 26, 2020 at 10:26:35AM +0100, Peter Maydell wrote:
> >> On Mon, 25 May 2020 at 11:20, Paolo Bonzini  wrote:
> >>> Not all of them, only those that need to return MEMTX_ERROR.  I would
> >>> like some guidance from Peter as to whether (or when) reads from ROMs
> >>> should return MEMTX_ERROR.  This way, we can use that information to
> >>> decide what the read-only ram-device regions should do.
> >>
> >> In general I think writes to ROMs (and indeed reads from ROMs) should
> >> not return MEMTX_ERROR. I think that in real hardware you could have
> >> a ROM that behaved either way; so our default behaviour should probably
> >> be to do what we've always done and not report a MEMTX_ERROR. (If we
> >> needed to I suppose we should implement a MEMTX_ERROR-reporting ROM,
> >> but to be honest there aren't really many real ROMs in systems these
> >> days: it's more often flash, whose response to writes is defined
> >> by the spec and is I think to ignore writes which aren't the
> >> magic "shift to program-the-flash-mode" sequence.)
> >>
> > then should I just drop the writes to read-only ram-device regions and
> > vfio regions without returning MEMTX_ERROR?
> > Do you think that's good?
> 
> I am not really sure, I have to think more about it.  I think read-only
> RAMD regions are slightly different because the guest can expect "magic"
> behavior from RAMD regions (e.g. registers that trigger I/O on writes)
> that are simply not there for ROM.  So I'm still inclined to queue your
> v6 patch series.
> 
ok. thank you Paolo. :) 



Re: [PATCH v5 0/7] coroutines: generate wrapper code

2020-05-27 Thread Vladimir Sementsov-Ogievskiy

28.05.2020 00:57, Eric Blake wrote:

On 5/27/20 4:46 PM, no-re...@patchew.org wrote:

Patchew URL: 
https://patchew.org/QEMU/20200527203733.16129-1-vsement...@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

TypeError: __init__() got an unexpected keyword argument 'capture_output'
   CC  /tmp/qemu-test/build/slirp/src/bootp.o
   GEN ui/input-keymap-usb-to-qcode.c
make: *** [block/block-gen.c] Error 1
make: *** Deleting file `block/block-gen.c'
make: *** Waiting for unfinished jobs
   GEN ui/input-keymap-win32-to-qcode.c


The more interesting part of the failure:

   File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 173, in 
     print(gen_wrappers_file(sys.stdin.read()))
   File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 169, in 
gen_wrappers_file
     return prettify(res)  # prettify to wrap long lines
   File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 40, in prettify
     encoding='utf-8', input=code, capture_output=True)
   File "/usr/lib64/python3.6/subprocess.py", line 423, in run
     with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'

which indeed looks like a bug in the patch.



Ah, yes, capture_output only exists since Python 3.7.

So, s/capture_output=True/stdout=subprocess.PIPE/ .

--
Best regards,
Vladimir



Re: [Bug 1881004] [NEW] fpu/softfloat.c: error: bitwise negation of a boolean expression

2020-05-27 Thread Thomas Huth
On 27/05/2020 23.54, Eric Blake wrote:
> On 5/27/20 4:40 PM, Peter Maydell wrote:
>> On Wed, 27 May 2020 at 20:21, Philippe Mathieu-Daudé
>> <1881...@bugs.launchpad.net> wrote:
>>>
>>> Public bug reported:
>>>
>>> Last time I built QEMU was on commit
>>> d5c75ec500d96f1d93447f990cd5a4ef5ba27fae,
>>> I just pulled to fea8f3ed739536fca027cf56af7f5576f37ef9cd and now get:
>>>
>>>    CC  lm32-softmmu/fpu/softfloat.o
>>> fpu/softfloat.c:3365:13: error: bitwise negation of a boolean
>>> expression; did you mean logical negation? [-Werror,-Wbool-operation]
>>>  absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
>>>          ^~
>>>          !
>>
>>
>> "(x & y)" is not a boolean expression, so we should report this to clang
>> as a bug (I assume what they actually are trying to complain about is
>> bitwise AND with a boolean expression).
> 
> We have:
> 
> uint64_t &= ~ ( ( ( int8_t ^ int ) == int ) & bool )
> 
> which is
> 
> uint64_t &= ~ ( bool & bool )
> 
> which is then
> 
> uint64_t &= ~ ( int )
> 
> resulting in one of:
> 
> uint64_t &= 0xffffffffffffffff
> uint64_t &= 0xfffffffffffffffe
> 
> It is a very odd way of stating that 'if this condition is true, mask
> out the least-significant-bit'.  In general, 'bool & bool' is used where
> the side-effect-skipping 'bool && bool' is inappropriate; I'm a bit
> surprised that clang is not questioning whether we meant '&&' instead of
> '&' (the two operators give the same effect in this case).
> 
> You are right that clang is fishy for calling it logical negation of a
> bool, when it is really logical negation of an int, but we are also
> fishy in that we are using bitwise AND of two bools as an int in the
> first place.
> 
> Regardless of whether clang changes, would it be better to write the
> code as:
> 
> if (((roundBits ^ 0x40) == 0) && roundNearestEven) {
>     absZ &= ~1;
> }

I agree, that's also much better to read.
And FWIW, WinUAE fixed a similar problem in the same way recently:

 https://github.com/tonioni/WinUAE/commit/51f5e7bfc39cf37daf7283

So I think this is the right way to go. Could you send your suggestion
as a proper patch?

 Thanks,
  Thomas




Re: [PATCH v2 11/11] tests/acceptance: Linux boot test for record/replay

2020-05-27 Thread Pavel Dovgalyuk



On 27.05.2020 19:44, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


This patch adds a test for record/replay, which boots a Linux
image from disk and interacts with the network.
The idea and code of this test are borrowed from boot_linux.py.
However, record/replay currently works only for x86_64,
therefore other tests were excluded.

Each test consists of the following phases:
  - downloading the disk image
  - recording the execution
  - replaying the execution

Replay does not validate the output, but waits until QEMU
finishes the execution. This is reasonable, because
QEMU usually hangs when replay goes wrong.

Two things:

  - We need to tag these tests as slow so they aren't run by default


Which flag is responsible for this?


  - 1800s is a long timeout to wait for to know it's a problem


Right, I just doubled the boot_linux timeout. I think it could be reduced.



Looking at the log shows my test is still running? Maybe we can check
the output as we go?


What would this look like?




Signed-off-by: Pavel Dovgalyuk 
---
  0 files changed

diff --git a/MAINTAINERS b/MAINTAINERS
index e9a9ce4f66..97f066a9b2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2498,6 +2498,7 @@ F: include/sysemu/replay.h
  F: docs/replay.txt
  F: stubs/replay.c
  F: tests/acceptance/replay_kernel.py
+F: tests/acceptance/replay_linux.py
  
  IOVA Tree

  M: Peter Xu 
diff --git a/tests/acceptance/replay_linux.py b/tests/acceptance/replay_linux.py
new file mode 100644
index 00..7c5971f156
--- /dev/null
+++ b/tests/acceptance/replay_linux.py
@@ -0,0 +1,97 @@
+# Record/replay test that boots a complete Linux system via a cloud image
+#
+# Copyright (c) 2020 ISP RAS
+#
+# Author:
+#  Pavel Dovgalyuk 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+import os
+
+from avocado.utils import cloudinit
+from avocado.utils import network
+from avocado.utils import vmimage
+from avocado.utils import datadrainer
+from avocado.utils.path import find_command
+from boot_linux import BootLinuxBase
+
+class ReplayLinux(BootLinuxBase):
+    """
+    Boots a Linux system, checking for a successful initialization
+    """
+
+    timeout = 1800
+    chksum = None
+    hdd = 'ide-hd'
+    cd = 'ide-cd'
+    bus = 'ide'
+
+    def setUp(self):
+        super(ReplayLinux, self).setUp()
+        self.boot_path = self.download_boot()
+        self.cloudinit_path = self.download_cloudinit()
+
+    def vm_add_disk(self, vm, path, id, device):
+        bus_string = ''
+        if self.bus:
+            bus_string = ',bus=%s.%d' % (self.bus, id,)
+        vm.add_args('-drive', 'file=%s,snapshot,id=disk%s,if=none' % (path, id))
+        vm.add_args('-drive',
+                    'driver=blkreplay,id=disk%s-rr,if=none,image=disk%s' % (id, id))
+        vm.add_args('-device', '%s,drive=disk%s-rr%s' % (device, id, bus_string))
+
+    def launch_and_wait(self, record, args, shift):
+        vm = self.get_vm()
+        vm.add_args('-smp', '1')
+        vm.add_args('-m', '1024')
+        vm.add_args('-object', 'filter-replay,id=replay,netdev=hub0port0')
+        if args:
+            vm.add_args(*args)
+        self.vm_add_disk(vm, self.boot_path, 0, self.hdd)
+        self.vm_add_disk(vm, self.cloudinit_path, 1, self.cd)
+        if record:
+            mode = 'record'
+        else:
+            mode = 'replay'
+        vm.add_args('-icount', 'shift=%s,rr=%s,rrfile=%s' %
+                    (shift, mode, os.path.join(self.workdir, 'replay.bin')))
+
+        vm.set_console()
+        vm.launch()
+        console_drainer = datadrainer.LineLogger(vm.console_socket.fileno(),
+                                                 logger=self.log.getChild('console'),
+                                                 stop_check=(lambda : not vm.is_running()))
+        console_drainer.start()
+        if record:
+            self.log.info('VM launched, waiting for boot confirmation from guest')
+            cloudinit.wait_for_phone_home(('0.0.0.0', self.phone_home_port), self.name)
+            vm.shutdown()
+        else:
+            self.log.info('VM launched, waiting the recorded execution to be replayed')
+            vm.wait()
+
+    def run_rr(self, args=None, shift=7):
+        self.launch_and_wait(True, args, shift)
+        self.launch_and_wait(False, args, shift)
+
+class ReplayLinuxX8664(ReplayLinux):
+    """
+    :avocado: tags=arch:x86_64
+    """
+
+    chksum = 'e3c1b309d9203604922d6e255c2c5d098a309c2d46215d8fc026954f3c5c27a0'
+
+    def test_pc_i440fx(self):
+        """
+        :avocado: tags=machine:pc
+        :avocado: tags=accel:tcg
+        """
+        self.run_rr(shift=1)
+
+    def test_pc_q35(self):
+        """
+        :avocado: tags=machine:q35
+        :avocado: tags=accel:tcg
+        """
+        self.run_rr(shift=3)






Re: [PATCH v2 04/11] tests/acceptance: add kernel record/replay test for x86_64

2020-05-27 Thread Pavel Dovgalyuk



On 27.05.2020 18:41, Alex Bennée wrote:

Pavel Dovgalyuk  writes:


This patch adds a test for record/replay of an x86_64 machine execution.
The execution scenario includes a simple kernel boot, which allows testing
basic hardware interaction in RR mode.

Signed-off-by: Pavel Dovgalyuk 
---
  0 files changed

diff --git a/tests/acceptance/replay_kernel.py 
b/tests/acceptance/replay_kernel.py
index b8b277ad2f..c7526f1aba 100644
--- a/tests/acceptance/replay_kernel.py
+++ b/tests/acceptance/replay_kernel.py
@@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
                     True, shift, args)
         self.run_vm(kernel_path, kernel_command_line, console_pattern,
                     False, shift, args)
+
+    def test_x86_64_pc(self):
+        """
+        :avocado: tags=arch:x86_64
+        :avocado: tags=machine:pc
+        """
+        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
+                      '/vmlinuz')
+        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
+        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+
+        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
+        console_pattern = 'Kernel command line: %s' % kernel_command_line
+
+        self.run_rr(kernel_path, kernel_command_line, console_pattern)

This test fails for me on the replay:


Have you applied the latest RR patches?




   2020-05-27 16:22:21,658 machine  L0326 DEBUG| VM launch command: 
'x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev 
socket,id=mon,path=/var/tmp/tmp4n_geosi/qemu-9516-monitor.sock -mon 
chardev=mon,mode=control -machine pc -chardev 
socket,id=console,path=/var/tmp/tmp4n_geosi/qemu-9516-console.sock,server,nowait
 -serial chardev:console -icount 
shift=7,rr=replay,rrfile=/var/tmp/avocado_b85h3ycg/avocado_job_8xrxksgj/1-._tests_acceptance_replay_kernel.py_ReplayKernel.test_x86_64_pc/replay.bin
 -kernel 
/home/alex/avocado/data/cache/by_location/df533120a0e0ffda2626bed6e8a975d3b07e3f05/vmlinuz
 -append printk.time=0 console=ttyS0 -net none'
   2020-05-27 16:22:21,725 qmp  L0194 DEBUG| >>> {'execute': 
'qmp_capabilities'}
   2020-05-27 16:22:21,736 qmp  L0202 DEBUG| <<< {'return': {}}
   2020-05-27 16:23:49,372 stacktrace   L0039 ERROR|
   2020-05-27 16:23:49,372 stacktrace   L0042 ERROR| Reproduced traceback 
from: 
/home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/core/test.py:860
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR| Traceback (most recent 
call last):
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 
73, in test_x86_64_pc
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR| 
self.run_rr(kernel_path, kernel_command_line, console_pattern)
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 
57, in run_rr
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR| False, shift, args)
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 
46, in run_vm
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR| 
self.wait_for_console_pattern(console_pattern, vm)
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/boot_linux_console.py", 
line 37, in wait_for_console_pattern
   2020-05-27 16:23:49,373 stacktrace   L0045 ERROR| vm=vm)
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py",
 line 131, in wait_for_console_pattern
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR| 
_console_interaction(test, success_message, failure_message, None, vm=vm)
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py",
 line 83, in _console_interaction
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR| msg = 
console.readline().strip()
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR|   File 
"/usr/lib/python3.7/socket.py", line 589, in readinto
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR| return 
self._sock.recv_into(b)
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR|   File 
"/home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/plugins/runner.py",
 line 89, in sigterm_handler
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR| raise 
RuntimeError("Test interrupted by SIGTERM")
   2020-05-27 16:23:49,374 stacktrace   L0045 ERROR| RuntimeError: Test 
interrupted by SIGTERM






Re: [PATCH v3 0/3] RISC-V: Remove deprecated ISA, CPUs and machines

2020-05-27 Thread Thomas Huth
On 27/05/2020 19.35, Alistair Francis wrote:
> On Wed, May 27, 2020 at 12:17 AM Thomas Huth  wrote:
>>
>> On 27/05/2020 00.47, Alistair Francis wrote:
>>>
>>>  include/hw/riscv/spike.h  |   6 +-
>>>  target/riscv/cpu.h|   8 -
>>>  hw/riscv/spike.c  | 217 --
>>>  target/riscv/cpu.c|  30 ---
>>>  target/riscv/cpu_helper.c |  82 +++
>>>  target/riscv/csr.c| 118 ++
>>>  .../riscv/insn_trans/trans_privileged.inc.c   |  18 +-
>>>  target/riscv/monitor.c|   5 -
>>>  target/riscv/op_helper.c  |  17 +-
>>>  tests/qtest/machine-none-test.c   |   4 +-
>>>  10 files changed, 60 insertions(+), 445 deletions(-)
>>
>>  Hi,
>>
>> looking at the diffstat, I think you should also edit
>> ./docs/system/deprecated.rst according to your changes?
> 
> I'm not sure what I should edit there. These are already marked as
> deprecated, do I need to make a change when they are removed?

Yes, you should move the features to the "Recently removed features"
section at the end of the file. See e.g. commit b4983c570c7a5848c9df.

 Thomas




[PATCH v2 3/3] tests/acpi: update expected SRAT files

2020-05-27 Thread Vishal Verma
Update the expected SRAT files for the change to account for NVDIMM numa
nodes in the SRAT.

AML Diff:
  --- /tmp/asl-V49YJ0.dsl   2020-04-27 18:50:52.680043327 -0600
  +++ /tmp/asl-48AZJ0.dsl   2020-04-27 18:50:52.679043344 -0600
  @@ -3,7 +3,7 @@
* AML/ASL+ Disassembler version 20190509 (64-bit version)
* Copyright (c) 2000 - 2019 Intel Corporation
*
  - * Disassembly of tests/data/acpi/pc/SRAT.dimmpxm, Mon Apr 27 18:50:52 2020
  + * Disassembly of /tmp/aml-U3BZJ0, Mon Apr 27 18:50:52 2020
*
* ACPI Data Table [SRAT]
*
  @@ -13,7 +13,7 @@
   [000h    4]Signature : "SRAT"[System Resource 
Affinity Table]
   [004h 0004   4] Table Length : 0188
   [008h 0008   1] Revision : 01
  -[009h 0009   1] Checksum : 80
  +[009h 0009   1] Checksum : 68
   [00Ah 0010   6]   Oem ID : "BOCHS "
   [010h 0016   8] Oem Table ID : "BXPCSRAT"
   [018h 0024   4] Oem Revision : 0001
  @@ -140,15 +140,15 @@
   [138h 0312   1]Subtable Type : 01 [Memory Affinity]
   [139h 0313   1]   Length : 28

  -[13Ah 0314   4] Proximity Domain : 
  +[13Ah 0314   4] Proximity Domain : 0002
   [13Eh 0318   2]Reserved1 : 
  -[140h 0320   8] Base Address : 
  -[148h 0328   8]   Address Length : 
  +[140h 0320   8] Base Address : 00010800
  +[148h 0328   8]   Address Length : 0800
   [150h 0336   4]Reserved2 : 
  -[154h 0340   4]Flags (decoded below) : 
  - Enabled : 0
  +[154h 0340   4]Flags (decoded below) : 0005
  + Enabled : 1
  Hot Pluggable : 0
  -Non-Volatile : 0
  +Non-Volatile : 1
   [158h 0344   8]Reserved3 : 

   [160h 0352   1]Subtable Type : 01 [Memory Affinity]
  @@ -167,7 +167,7 @@

   Raw Table Data: Length 392 (0x188)

  -: 53 52 41 54 88 01 00 00 01 80 42 4F 43 48 53 20  // SRAT..BOCHS
  +: 53 52 41 54 88 01 00 00 01 68 42 4F 43 48 53 20  // SRAT.hBOCHS
   0010: 42 58 50 43 53 52 41 54 01 00 00 00 42 58 50 43  // 
BXPCSRATBXPC
   0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // 

   0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // 

  @@ -186,9 +186,9 @@
   0100: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  // 

   0110: 01 28 03 00 00 00 00 00 00 00 00 06 00 00 00 00  // 
.(..
   0120: 00 00 00 02 00 00 00 00 00 00 00 00 01 00 00 00  // 

  -0130: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00  // 
.(..
  -0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // 

  -0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // 

  +0130: 00 00 00 00 00 00 00 00 01 28 02 00 00 00 00 00  // 
.(..
  +0140: 00 00 00 08 01 00 00 00 00 00 00 08 00 00 00 00  // 

  +0150: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00  // 

   0160: 01 28 03 00 00 00 00 00 00 00 00 00 01 00 00 00  // 
.(..
   0170: 00 00 00 F8 00 00 00 00 00 00 00 00 03 00 00 00  // 

   0180: 00 00 00 00 00 00 00 00  // 

Signed-off-by: Vishal Verma 
---
 tests/data/acpi/pc/SRAT.dimmpxm | Bin 392 -> 392 bytes
 tests/data/acpi/q35/SRAT.dimmpxm| Bin 392 -> 392 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   1 -
 3 files changed, 1 deletion(-)

diff --git a/tests/data/acpi/pc/SRAT.dimmpxm b/tests/data/acpi/pc/SRAT.dimmpxm
index 
f5c0267ea24bb404b6b4e687390140378fbdc3f1..5a13c61b9041c6045c29643bf93a111fb1c0c76a
 100644
GIT binary patch
delta 51
scmeBR?qKE$4ss0XU}Rum%-G0fz$nec00kUCF%aN@Pz(&LlS3Je0lmQmhyVZp

delta 51
icmeBR?qKE$4ss0XU}RumY}m+Uz$ndt8%z#mGzI{_tp$hx

diff --git a/tests/data/acpi/q35/SRAT.dimmpxm b/tests/data/acpi/q35/SRAT.dimmpxm
index 
f5c0267ea24bb404b6b4e687390140378fbdc3f1..5a13c61b9041c6045c29643bf93a111fb1c0c76a
 100644
GIT binary patch
delta 51
scmeBR?qKE$4ss0XU}Rum%-G0fz$nec00kUCF%aN@Pz(&LlS3Je0lmQmhyVZp

delta 51
icmeBR?qKE$4ss0XU}RumY}m+Uz$ndt8%z#mGzI{_tp$hx

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index 83d3ea5032..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/SRAT.dimmpxm",
-- 
2.26.2




[PATCH v2 0/3] account for NVDIMM nodes during SRAT generation

2020-05-27 Thread Vishal Verma
Changes since v1:
- Use error_abort for getters (Igor)
- Free the device list (Igor)
- Refactor the NVDIMM related portion into hw/acpi/nvdimm.c (Igor)
- Rebase onto latest master
- Add Jingqi's Reviewed-by

On the command line, one can specify a NUMA node for NVDIMM devices. If
we set up the topology to give NVDIMMs their own nodes, i.e. not
containing any CPUs or regular memory, qemu doesn't populate SRAT memory
affinity structures for these nodes. However the NFIT does reference
those proximity domains.

As a result, Linux, while parsing the SRAT, fails to initialize node
related structures for these nodes, and they never end up in the
nodes_possible map. When these are onlined at a later point (via
hotplug), this causes problems.

I've followed the instructions in bios-tables-test.c to update the
expected SRAT binary, and the tests (make check) pass. Patches 1 and 3
are the relevant ones for the binary update.

Patch 2 is the main patch which changes SRAT generation.

Vishal Verma (3):
  diffs-allowed: add the SRAT AML to diffs-allowed
  hw/acpi-build: account for NVDIMM numa nodes in SRAT
  tests/acpi: update expected SRAT files

 hw/acpi/nvdimm.c |  26 ++
 hw/i386/acpi-build.c |  10 ++
 include/hw/mem/nvdimm.h  |   1 +
 tests/data/acpi/pc/SRAT.dimmpxm  | Bin 392 -> 392 bytes
 tests/data/acpi/q35/SRAT.dimmpxm | Bin 392 -> 392 bytes
 5 files changed, 37 insertions(+)

-- 
2.26.2




[PATCH v2 1/3] diffs-allowed: add the SRAT AML to diffs-allowed

2020-05-27 Thread Vishal Verma
In anticipation of a change to the SRAT generation in qemu, add the AML
file to diffs-allowed.

Signed-off-by: Vishal Verma 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..83d3ea5032 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/SRAT.dimmpxm",
-- 
2.26.2




[PATCH v2 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT

2020-05-27 Thread Vishal Verma
NVDIMMs can belong to their own proximity domains, as described by the
NFIT. In such cases, the SRAT needs to contain Memory Affinity structures
for these NVDIMMs; otherwise Linux doesn't populate node data structures
properly during NUMA initialization. See the following for an example
failure case.

https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.ve...@intel.com/

Fix this by adding device address range and node information from
NVDIMMs to the SRAT in build_srat().

The relevant command line options to exercise this are below. Nodes 0-1
contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
space.

  -numa node,nodeid=0,mem=2048M,
  -numa node,nodeid=1,mem=2048M,
  -numa node,nodeid=2,mem=0,
  -object 
memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
  -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
  -numa node,nodeid=3,mem=0,
  -object 
memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
  -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3

Cc: Jingqi Liu 
Cc: Michael S. Tsirkin 
Reviewed-by: Jingqi Liu 
Signed-off-by: Vishal Verma 
---
 hw/acpi/nvdimm.c| 26 ++
 hw/i386/acpi-build.c| 10 ++
 include/hw/mem/nvdimm.h |  1 +
 3 files changed, 37 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 9316d12b70..d322c6a7a7 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -28,6 +28,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/uuid.h"
+#include "qapi/error.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -1334,6 +1335,31 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
     free_aml_allocator();
 }
 
+void *nvdimm_build_srat(GArray *table_data)
+{
+    AcpiSratMemoryAffinity *numamem = NULL;
+    GSList *device_list = nvdimm_get_device_list();
+
+    for (; device_list; device_list = device_list->next) {
+        DeviceState *dev = device_list->data;
+        uint64_t addr, size;
+        int node;
+
+        node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
+                                       &error_abort);
+        addr = object_property_get_uint(OBJECT(dev), PC_DIMM_ADDR_PROP,
+                                        &error_abort);
+        size = object_property_get_uint(OBJECT(dev), PC_DIMM_SIZE_PROP,
+                                        &error_abort);
+
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+        build_srat_memory(numamem, addr, size, node,
+                          MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
+    }
+    g_slist_free(device_list);
+    return numamem;
+}
+
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
BIOSLinker *linker, NVDIMMState *state,
uint32_t ram_slots)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2e15f6848e..1461d8a718 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2428,6 +2428,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
MachineState *machine)
   MEM_AFFINITY_ENABLED);
 }
 }
+
+if (machine->nvdimms_state->is_enabled) {
+void *ret;
+
+ret = nvdimm_build_srat(table_data);
+if (ret != NULL) {
+numamem = ret;
+}
+}
+
 slots = (table_data->len - numa_start) / sizeof *numamem;
 for (; slots < pcms->numa_nodes + 2; slots++) {
 numamem = acpi_data_push(table_data, sizeof *numamem);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index a3c08955e8..fbe56509b8 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -155,6 +155,7 @@ typedef struct NVDIMMState NVDIMMState;
 void nvdimm_init_acpi_state(NVDIMMState *state, MemoryRegion *io,
 struct AcpiGenericAddress dsm_io,
 FWCfgState *fw_cfg, Object *owner);
+void *nvdimm_build_srat(GArray *table_data);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
BIOSLinker *linker, NVDIMMState *state,
uint32_t ram_slots);
-- 
2.26.2




[Bug 1878259] Re: Null-pointer dereference in megasas_handle_frame

2020-05-27 Thread P J P
Upstream patch
  -> https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg07313.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1878259

Title:
  Null-pointer dereference in megasas_handle_frame

Status in QEMU:
  New

Bug description:
  Hello,
  While fuzzing, I found an input that triggers a null-pointer dereference in 
megasas_handle_frame:

  ==1595==ERROR: AddressSanitizer: SEGV on unknown address 0x (pc 
0x55e3e83e6e08 bp 0x7ffdb04c63b0 sp 0x7ffd
  ==1595==The signal is caused by a READ memory access.
  ==1595==Hint: address points to the zero page.
  #0 0x55e3e83e6e08 in megasas_handle_frame 
/home/alxndr/Development/qemu/hw/scsi/megasas.c:1952:36
  #1 0x55e3e83e6e08 in megasas_mmio_write 
/home/alxndr/Development/qemu/hw/scsi/megasas.c:2122:9
  #2 0x55e3e7d088d6 in memory_region_write_accessor 
/home/alxndr/Development/qemu/memory.c:483:5
  #3 0x55e3e7d0827f in access_with_adjusted_size 
/home/alxndr/Development/qemu/memory.c:544:18
  #4 0x55e3e7d0827f in memory_region_dispatch_write 
/home/alxndr/Development/qemu/memory.c:1476:16
  #5 0x55e3e7c1d1d3 in flatview_write_continue 
/home/alxndr/Development/qemu/exec.c:3137:23
  #6 0x55e3e7c15b97 in flatview_write 
/home/alxndr/Development/qemu/exec.c:3177:14
  #7 0x55e3e7c15b97 in address_space_write 
/home/alxndr/Development/qemu/exec.c:3268:18
  #8 0x55e3e7d03bc4 in qtest_process_command 
/home/alxndr/Development/qemu/qtest.c:567:9
  #9 0x55e3e7cfe74d in qtest_process_inbuf 
/home/alxndr/Development/qemu/qtest.c:710:9
  #10 0x55e3e8804cad in fd_chr_read 
/home/alxndr/Development/qemu/chardev/char-fd.c:68:9
  #11 0x7f602ef2a897 in g_main_context_dispatch 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e897)
  #12 0x55e3e8948384 in glib_pollfds_poll 
/home/alxndr/Development/qemu/util/main-loop.c:219:9
  #13 0x55e3e8948384 in os_host_main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:242:5
  #14 0x55e3e8948384 in main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:518:11
  #15 0x55e3e7f27676 in qemu_main_loop 
/home/alxndr/Development/qemu/softmmu/vl.c:1664:9
  #16 0x55e3e8851c6a in main 
/home/alxndr/Development/qemu/softmmu/main.c:49:5
  #17 0x7f602dadae0a in __libc_start_main 
/build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16
  #18 0x55e3e7b5c7b9 in _start 
(/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x9027b9)

  I can reproduce it in qemu 5.0 using:

  cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -qtest 
stdio -nographic -monitor none -serial none -M q35 -device megasas
  outl 0xcf8 0x80001814
  outl 0xcfc 0xc021
  outl 0xcf8 0x80001818
  outl 0xcf8 0x80001804
  outw 0xcfc 0x7
  outl 0xcf8 0x80001810
  outl 0xcfc 0xe10c
  outl 0xcf8 0x8000f810
  outl 0xcf8 0x8000fa24
  outl 0xcfc 0xe10c4000
  outl 0xcf8 0x8000fa04
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fb20
  write 0xe10c4384 0x15 0x838383838383838383838383838383838383838383
  write 0xc021e10c00c0 0x4 0x082c04dd
  EOF

  I also attached the commands to this launchpad report, in case the
  formatting is broken:

  qemu-system-i386 -qtest stdio -nographic -monitor none -serial none -M
  q35 -device megasas < attachment

  Please let me know if I can provide any further info.
  -Alex

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1878259/+subscriptions



Re: [PATCH v4 5/6] i386: Hyper-V VMBus ACPI DSDT entry

2020-05-27 Thread Jon Doron

On 28/05/2020, Jon Doron wrote:

On 22/05/2020, Igor Mammedov wrote:

On Thu, 21 May 2020 18:02:07 +0200
Paolo Bonzini  wrote:


On 13/05/20 17:34, Igor Mammedov wrote:

I'd rather avoid using random IRQ numbers (considering we are
dealing with a black box here). So if it's really necessary to have
the IRQ described here, I'd suggest implementing them in the device
model so they would be reserved and QEMU would error out in a sane
way if an IRQ conflict is detected.


We don't generally detect ISA IRQ conflicts though, do we?


That I don't know; that's why I'm not suggesting how to do it.
The point is that hard-coding random IRQs in AML is not the right thing
to do (especially with the lack of any spec); at minimum, AML should
pull the IRQ from the device model, and that probably should be
configurable and set by the board.

The other thing is:
I haven't looked at the VMBus device model in detail, but the DSDT part
doesn't match the device (the device model is not an ISA device, hence
the AML part shouldn't be in the ISA scope); where to put it is an open
question. There were other issues with the AML code that I've commented
on, so I was waiting on a respin with the comments addressed.
I don't think that this patch is good enough for merging.




But it seems like the current patch does match what Microsoft Hyper-V 
is publishing in its ACPI tables.


I don't think it's correct for us to "fix" Microsoft's emulation even if 
it's wrong, since that's what Windows probably expects to see...


I tried looking at where Microsoft uses the ACPI tables to identify the 
VMBus, in order to understand how flexible a change could be while 
still letting the OS detect the VMBus device, but without much luck. In 
general, though, I think "correcting" something that is emulated 1:1 
because there is no spec is the right way.




Bah, sorry, I meant to say:
In general, correcting some virtual emulated device which currently 
matches 1:1 is wrong; I think the right way to handle this type of 
thing is to copy them 1:1 and not try to "fix" or "correct" them, since 
that's what Windows expects.




Paolo







Re: [PATCH v4 5/6] i386: Hyper-V VMBus ACPI DSDT entry

2020-05-27 Thread Jon Doron

On 22/05/2020, Igor Mammedov wrote:

On Thu, 21 May 2020 18:02:07 +0200
Paolo Bonzini  wrote:


On 13/05/20 17:34, Igor Mammedov wrote:
> I'd rather avoid using random IRQ numbers (considering we are
> dealing with a black box here). So if it's really necessary to have
> the IRQ described here, I'd suggest implementing them in the device
> model so they would be reserved and QEMU would error out in a sane
> way if an IRQ conflict is detected.

We don't generally detect ISA IRQ conflicts though, do we?


That I don't know; that's why I'm not suggesting how to do it.
The point is that hard-coding random IRQs in AML is not the right thing
to do (especially with the lack of any spec); at minimum, AML should
pull the IRQ from the device model, and that probably should be
configurable and set by the board.

The other thing is:
I haven't looked at the VMBus device model in detail, but the DSDT part
doesn't match the device (the device model is not an ISA device, hence
the AML part shouldn't be in the ISA scope); where to put it is an open
question. There were other issues with the AML code that I've commented
on, so I was waiting on a respin with the comments addressed.
I don't think that this patch is good enough for merging.




But it seems like the current patch does match what Microsoft Hyper-V 
is publishing in its ACPI tables.


I don't think it's correct for us to "fix" Microsoft's emulation even if 
it's wrong, since that's what Windows probably expects to see...


I tried looking at where Microsoft uses the ACPI tables to identify the 
VMBus, in order to understand how flexible a change could be while 
still letting the OS detect the VMBus device, but without much luck. In 
general, though, I think "correcting" something that is emulated 1:1 
because there is no spec is the right way.




Paolo







Re: [PATCH v6 1/3] memory: drop guest writes to read-only ram device regions

2020-05-27 Thread Paolo Bonzini
On 28/05/20 06:35, Yan Zhao wrote:
> On Tue, May 26, 2020 at 10:26:35AM +0100, Peter Maydell wrote:
>> On Mon, 25 May 2020 at 11:20, Paolo Bonzini  wrote:
>>> Not all of them, only those that need to return MEMTX_ERROR.  I would
>>> like some guidance from Peter as to whether (or when) reads from ROMs
>>> should return MEMTX_ERROR.  This way, we can use that information to
>>> decide what the read-only ram-device regions should do.
>>
>> In general I think writes to ROMs (and indeed reads from ROMs) should
>> not return MEMTX_ERROR. I think that in real hardware you could have
>> a ROM that behaved either way; so our default behaviour should probably
>> be to do what we've always done and not report a MEMTX_ERROR. (If we
>> needed to I suppose we should implement a MEMTX_ERROR-reporting ROM,
>> but to be honest there aren't really many real ROMs in systems these
>> days: it's more often flash, whose response to writes is defined
>> by the spec and is I think to ignore writes which aren't the
>> magic "shift to program-the-flash-mode" sequence.)
>>
> Then should I just drop the writes to read-only ram-device regions and
> VFIO regions without returning MEMTX_ERROR?
> Do you think that's good?

I am not really sure, I have to think more about it.  I think read-only
RAMD regions are slightly different because the guest can expect "magic"
behavior from RAMD regions (e.g. registers that trigger I/O on writes)
that are simply not there for ROM.  So I'm still inclined to queue your
v6 patch series.

Thanks,

Paolo




Re: [PATCH Kernel v23 0/8] Add UAPIs to support migration for VFIO devices

2020-05-27 Thread Yan Zhao


The whole series works for us in general:
Reviewed-by: Yan Zhao 

On Wed, May 20, 2020 at 11:38:00PM +0530, Kirti Wankhede wrote:
> Hi,
> 
> This patch set adds:
> * IOCTL VFIO_IOMMU_DIRTY_PAGES to get dirty pages bitmap with
>   respect to IOMMU container rather than per device. All pages pinned by
>   vendor driver through vfio_pin_pages external API has to be marked as
>   dirty during migration. When an IOMMU-capable device is present in the
>   container and all pages are pinned and mapped, then all pages are marked
>   dirty.
>   When there are CPU writes, CPU dirty page tracking can identify dirtied
>   pages, but any page pinned by vendor driver can also be written by
>   device. As of now there is no device which has hardware support for
>   dirty page tracking. So all pages which are pinned should be considered
>   as dirty.
>   This ioctl is also used to start/stop dirty pages tracking for pinned and
>   unpinned pages while migration is active.
> 
> * Updated IOCTL VFIO_IOMMU_UNMAP_DMA to get dirty pages bitmap before
>   unmapping IO virtual address range.
>   With vIOMMU, during pre-copy phase of migration, while CPUs are still
>   running, IO virtual address unmap can happen while device still keeping
>   reference of guest pfns. Those pages should be reported as dirty before
>   unmap, so that VFIO user space application can copy content of those
>   pages from source to destination.
> 
> * Patch 8 detect if IOMMU capable device driver is smart to report pages
>   to be marked dirty by pinning pages using vfio_pin_pages() API.
> 
> 
> Yet TODO:
> Since there is no device which has hardware support for system memmory
> dirty bitmap tracking, right now there is no other API from vendor driver
> to VFIO IOMMU module to report dirty pages. In future, when such hardware
> support will be implemented, an API will be required such that vendor
> driver could report dirty pages to VFIO module during migration phases.
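As a rough illustration of the per-range dirty-bitmap sizing these ioctls imply (one bit per page, padded to 64-bit words; this is my reading of the series, not text quoted from it):

```python
def dirty_bitmap_bytes(range_size, page_size=4096):
    """Bytes needed to track one dirty bit per page, padded so the
    bitmap occupies a whole number of 64-bit words."""
    pages = (range_size + page_size - 1) // page_size
    words = (pages + 63) // 64
    return words * 8

# A 1 GiB mapping at 4 KiB pages needs a 32 KiB bitmap.
print(dirty_bitmap_bytes(1 << 30))  # 32768
```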
> 
> v22 -> v23
> - Fixed issue reported by Yan
> https://lore.kernel.org/kvm/97977ede-3c5b-c5a5-7858-7eecd7dd5...@nvidia.com/
> - Fixed nit picks suggested by Cornelia
> 
> v21 -> v22
> - Fixed issue raised by Alex :
> https://lore.kernel.org/kvm/20200515163307.72951...@w520.home/
> 
> v20 -> v21
> - Added checkin for GET_BITMAP ioctl for vfio_dma boundaries.
> - Updated unmap ioctl function - as suggested by Alex.
> - Updated comments in DIRTY_TRACKING ioctl definition - as suggested by
>   Cornelia.
> 
> v19 -> v20
> - Fixed ioctl to get dirty bitmap to get bitmap of multiple vfio_dmas
> - Fixed unmap ioctl to get dirty bitmap of multiple vfio_dmas.
> - Removed flag definition from migration capability.
> 
> v18 -> v19
> - Updated migration capability with supported page sizes bitmap for dirty
>   page tracking and  maximum bitmap size supported by kernel module.
> - Added patch to calculate and cache pgsize_bitmap when iommu->domain_list
>   is updated.
> - Removed extra buffers added in previous version for bitmap manipulation
>   and optimised the code.
> 
> v17 -> v18
> - Add migration capability to the capability chain for VFIO_IOMMU_GET_INFO
>   ioctl
> - Updated UMAP_DMA ioctl to return bitmap of multiple vfio_dma
> 
> v16 -> v17
> - Fixed errors reported by kbuild test robot  on i386
> 
> v15 -> v16
> - Minor edits and nit picks (Auger Eric)
> - On copying bitmap to user, re-populated bitmap only for pinned pages,
>   excluding unmapped pages and CPU dirtied pages.
> - Patches are on tag: next-20200318 and 1-3 patches from Yan's series
>   https://lkml.org/lkml/2020/3/12/1255
> 
> v14 -> v15
> - Minor edits and nit picks.
> - In the verification of user allocated bitmap memory, added check of
>maximum size.
> - Patches are on tag: next-20200318 and 1-3 patches from Yan's series
>   https://lkml.org/lkml/2020/3/12/1255
> 
> v13 -> v14
> - Added struct vfio_bitmap to kabi. updated structure
>   vfio_iommu_type1_dirty_bitmap_get and vfio_iommu_type1_dma_unmap.
> - All small changes suggested by Alex.
> - Patches are on tag: next-20200318 and 1-3 patches from Yan's series
>   https://lkml.org/lkml/2020/3/12/1255
> 
> v12 -> v13
> - Changed bitmap allocation in vfio_iommu_type1 to per vfio_dma
> - Changed VFIO_IOMMU_DIRTY_PAGES ioctl behaviour to be per vfio_dma range.
> - Changed vfio_iommu_type1_dirty_bitmap structure to have separate data
>   field.
> 
> v11 -> v12
> - Changed bitmap allocation in vfio_iommu_type1.
> - Remove atomicity of ref_count.
> - Updated comments for migration device state structure about error
>   reporting.
> - Nit picks from v11 reviews
> 
> v10 -> v11
> - Fix pin pages API to free vpfn if it is marked as unpinned tracking page.
> - Added proposal to detect if IOMMU capable device calls external pin pages
>   API to mark pages dirty.
> - Nit picks from v10 reviews
> 
> v9 -> v10:
> - Updated existing VFIO_IOMMU_UNMAP_DMA ioctl to get dirty pages bitmap
>   during unmap while migration is active
> - Added flag in VFIO_IOMMU_GET_INFO to indi

RE: [PATCH v3 4/4] hw/riscv: virt: Allow creating multiple sockets

2020-05-27 Thread Anup Patel


> -Original Message-
> From: Atish Patra 
> Sent: 28 May 2020 07:14
> To: Anup Patel 
> Cc: Peter Maydell ; Palmer Dabbelt
> ; Alistair Francis ; Sagar
> Karandikar ; Atish Patra ;
> qemu-ri...@nongnu.org; qemu-devel@nongnu.org; Anup Patel
> 
> Subject: Re: [PATCH v3 4/4] hw/riscv: virt: Allow creating multiple sockets
> 
> On Wed, May 27, 2020 at 5:14 AM Anup Patel  wrote:
> >
> > We extend RISC-V virt machine to allow creating a multi-socket machine.
> > Each RISC-V virt machine socket is a set of HARTs, a CLINT instance,
> > and a PLIC instance. Other peripherals are shared between all RISC-V
> > virt machine sockets. We also update RISC-V virt machine device tree
> > to treat each socket as a NUMA node.
> >
> 
> But this patch doesn't enable NUMA options and doesn't allow configuring the
> virt machine as a NUMA machine (with different memory nodes). IIRC, neither
> OpenSBI nor the PLIC patches require the "numa-node-id" either.
> I think it is better to add the "numa-node-id" along with NUMA support for
> the virt machine.

Since we are moving towards a separate "multi-socket" command-line option,
I think it's good to explore the "-numa" options and implement those instead.

Thanks for the suggestion.

> 
> > By default, multi-socket support is disabled for RISC-V virt machine.
> > To enable multi-socket support, users can pass "multi-socket=on"
> > option in machine name.
> >
> > The number of sockets in RISC-V virt machine can be specified using
> > the "sockets=" sub-option of QEMU "-smp" command-line option.
> >
> > Currently, we only allow creating up to a maximum of 4 sockets, but this
> > limit can be changed in future.
> >
> > Signed-off-by: Anup Patel 
> > ---
> >  hw/riscv/virt.c | 523 +++-
> >  include/hw/riscv/virt.h |  11 +-
> >  2 files changed, 311 insertions(+), 223 deletions(-)
> >
> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c index
> > 421815081d..40a6185b1c 100644
> > --- a/hw/riscv/virt.c
> > +++ b/hw/riscv/virt.c
> > @@ -60,7 +60,7 @@ static const struct MemmapEntry {
> >  [VIRT_TEST] ={   0x10,0x1000 },
> >  [VIRT_RTC] = {   0x101000,0x1000 },
> >  [VIRT_CLINT] =   {  0x200,   0x1 },
> > -[VIRT_PLIC] ={  0xc00, 0x400 },
> > +[VIRT_PLIC] ={  0xc00, VIRT_PLIC_SIZE(VIRT_CPUS_MAX * 2) },
> >  [VIRT_UART0] =   { 0x1000, 0x100 },
> >  [VIRT_VIRTIO] =  { 0x10001000,0x1000 },
> >  [VIRT_FLASH] =   { 0x2000, 0x400 },
> > @@ -182,10 +182,15 @@ static void create_fdt(RISCVVirtState *s, const
> struct MemmapEntry *memmap,
> >  uint64_t mem_size, const char *cmdline)  {
> >  void *fdt;
> > -int cpu, i;
> > -uint32_t *cells;
> > -char *nodename;
> > -uint32_t plic_phandle, test_phandle, phandle = 1;
> > +int i, cpu, socket;
> > +uint32_t *clint_cells, *plic_cells;
> > +unsigned long clint_addr, plic_addr;
> > +uint32_t plic_phandle[VIRT_SOCKETS_MAX];
> > +uint32_t cpu_phandle, intc_phandle, test_phandle;
> > +uint32_t phandle = 1, plic_mmio_phandle = 1;
> > +uint32_t plic_pcie_phandle = 1, plic_virtio_phandle = 1;
> > +char *name, *cpu_name, *core_name, *intc_name;
> > +char *clint_name, *plic_name, *clust_name;
> >  hwaddr flashsize = virt_memmap[VIRT_FLASH].size / 2;
> >  hwaddr flashbase = virt_memmap[VIRT_FLASH].base;
> >
> > @@ -206,231 +211,235 @@ static void create_fdt(RISCVVirtState *s, const
> struct MemmapEntry *memmap,
> >  qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x2);
> >  qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x2);
> >
> > -nodename = g_strdup_printf("/memory@%lx",
> > +name = g_strdup_printf("/memory@%lx",
> >  (long)memmap[VIRT_DRAM].base);
> > -qemu_fdt_add_subnode(fdt, nodename);
> > -qemu_fdt_setprop_cells(fdt, nodename, "reg",
> > +qemu_fdt_add_subnode(fdt, name);
> > +qemu_fdt_setprop_cells(fdt, name, "reg",
> >  memmap[VIRT_DRAM].base >> 32, memmap[VIRT_DRAM].base,
> >  mem_size >> 32, mem_size);
> > -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> > -g_free(nodename);
> > +qemu_fdt_setprop_string(fdt, name, "device_type", "memory");
> > +g_free(name);
> >
> 
> As all other
> 
> >  qemu_fdt_add_subnode(fdt, "/cpus");
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "timebase-frequency",
> >SIFIVE_CLINT_TIMEBASE_FREQ);
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
> > +qemu_fdt_add_subnode(fdt, "/cpus/cpu-map");
> > +
> > +for (socket = (s->num_socs - 1); socket >= 0; socket--) {
> > +clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
> > +qemu_fdt_add_subnode(fdt, clust_name);
> > +
> > +plic_cells = g_new0(uint32_t, s->soc[

Re: [PATCH v6 1/3] memory: drop guest writes to read-only ram device regions

2020-05-27 Thread Yan Zhao
On Tue, May 26, 2020 at 10:26:35AM +0100, Peter Maydell wrote:
> On Mon, 25 May 2020 at 11:20, Paolo Bonzini  wrote:
> > Not all of them, only those that need to return MEMTX_ERROR.  I would
> > like some guidance from Peter as to whether (or when) reads from ROMs
> > should return MEMTX_ERROR.  This way, we can use that information to
> > decide what the read-only ram-device regions should do.
> 
> In general I think writes to ROMs (and indeed reads from ROMs) should
> not return MEMTX_ERROR. I think that in real hardware you could have
> a ROM that behaved either way; so our default behaviour should probably
> be to do what we've always done and not report a MEMTX_ERROR. (If we
> needed to I suppose we should implement a MEMTX_ERROR-reporting ROM,
> but to be honest there aren't really many real ROMs in systems these
> days: it's more often flash, whose response to writes is defined
> by the spec and is I think to ignore writes which aren't the
> magic "shift to program-the-flash-mode" sequence.)
>
Then should I just drop the writes to read-only ram-device regions and
VFIO regions without returning MEMTX_ERROR?
Do you think that's good?

Thanks
Yan



[PATCH v3 1/2] PCI: vmd: Filter resource type bits from shadow register

2020-05-27 Thread Jon Derrick
Versions of VMD with the Host Physical Address shadow register use this
register to calculate the bus address offset needed to do guest
passthrough of the domain. This register shadows the Host Physical
Address registers including the resource type bits. After calculating
the offset, the extra resource type bits lead to the VMD resources being
over-provisioned at the front and under-provisioned at the back.

Example:
pci 1:80:02.0: reg 0x10: [mem 0xf801fffc-0xf803fffb 64bit]

Expected:
pci 1:80:02.0: reg 0x10: [mem 0xf802-0xf803 64bit]

If other devices are mapped in the over-provisioned front, it could lead
to resource conflict issues with VMD or those devices.

Fixes: a1a30170138c9 ("PCI: vmd: Fix shadow offsets to reflect spec changes")
Signed-off-by: Jon Derrick 
---
 drivers/pci/controller/vmd.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index dac91d6..e386d4e 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -445,9 +445,11 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned 
long features)
if (!membar2)
return -ENOMEM;
offset[0] = vmd->dev->resource[VMD_MEMBAR1].start -
-   readq(membar2 + MB2_SHADOW_OFFSET);
+   (readq(membar2 + MB2_SHADOW_OFFSET) &
+PCI_BASE_ADDRESS_MEM_MASK);
offset[1] = vmd->dev->resource[VMD_MEMBAR2].start -
-   readq(membar2 + MB2_SHADOW_OFFSET + 8);
+   (readq(membar2 + MB2_SHADOW_OFFSET + 8) 
&
+PCI_BASE_ADDRESS_MEM_MASK);
pci_iounmap(vmd->dev, membar2);
}
}
-- 
1.8.3.1




[PATCH v3 2/2] PCI: vmd: Use Shadow MEMBAR registers for QEMU/KVM guests

2020-05-27 Thread Jon Derrick
VMD device 28C0 natively assists guest passthrough of the VMD endpoint
through the use of shadow registers that provide Host Physical Addresses
to correctly assign bridge windows. These shadow registers are only
available if VMD config space register 0x70, bit 1 is set.

In order to support this mode in existing VMD devices which don't
natively support the shadow register, it was decided that the hypervisor
could offer the shadow registers in a vendor-specific PCI capability.

QEMU has been modified to create this vendor-specific capability and
supply the shadow membar registers for VMDs which don't natively support
this feature. This patch adds this mode and updates the supported device
list to allow this feature to be used on these VMDs.
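The guest-side decoding added by this patch can be sketched as follows (a rough Python illustration; the 24-byte layout and the 0x53484457 "SHDW" signature come from the diff below, while the register values are hypothetical):

```python
import struct

VMD_SHADOW_SIG = 0x53484457  # ASCII "SHDW", read back as a LE config dword

def parse_shadow_cap(cap_bytes):
    """Parse the vendor-specific shadow capability: a signature dword at
    offset 4, then two little-endian 64-bit shadow MEMBAR addresses at
    offsets 8 and 16. Returns None if the signature doesn't match."""
    if struct.unpack_from("<I", cap_bytes, 4)[0] != VMD_SHADOW_SIG:
        return None
    bar1 = struct.unpack_from("<Q", cap_bytes, 8)[0]
    bar2 = struct.unpack_from("<Q", cap_bytes, 16)[0]
    return bar1, bar2
```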

Signed-off-by: Jon Derrick 
---
 drivers/pci/controller/vmd.c | 44 ++--
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index e386d4e..76d8acb 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -40,13 +40,19 @@ enum vmd_features {
 * membars, in order to allow proper address translation during
 * resource assignment to enable guest virtualization
 */
-   VMD_FEAT_HAS_MEMBAR_SHADOW  = (1 << 0),
+   VMD_FEAT_HAS_MEMBAR_SHADOW  = (1 << 0),
 
/*
 * Device may provide root port configuration information which limits
 * bus numbering
 */
-   VMD_FEAT_HAS_BUS_RESTRICTIONS   = (1 << 1),
+   VMD_FEAT_HAS_BUS_RESTRICTIONS   = (1 << 1),
+
+   /*
+* Device contains physical location shadow registers in
+* vendor-specific capability space
+*/
+   VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP= (1 << 2),
 };
 
 /*
@@ -454,6 +460,28 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned 
long features)
}
}
 
+   if (features & VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP) {
+   int pos = pci_find_capability(vmd->dev, PCI_CAP_ID_VNDR);
+   u32 reg, regu;
+
+   pci_read_config_dword(vmd->dev, pos + 4, ®);
+
+   /* "SHDW" */
+   if (pos && reg == 0x53484457) {
+   pci_read_config_dword(vmd->dev, pos + 8, ®);
+   pci_read_config_dword(vmd->dev, pos + 12, ®u);
+   offset[0] = vmd->dev->resource[VMD_MEMBAR1].start -
+   (((u64) regu << 32 | reg) &
+PCI_BASE_ADDRESS_MEM_MASK);
+
+   pci_read_config_dword(vmd->dev, pos + 16, ®);
+   pci_read_config_dword(vmd->dev, pos + 20, ®u);
+   offset[1] = vmd->dev->resource[VMD_MEMBAR2].start -
+   (((u64) regu << 32 | reg) &
+PCI_BASE_ADDRESS_MEM_MASK);
+   }
+   }
+
/*
 * Certain VMD devices may have a root port configuration option which
 * limits the bus range to between 0-127, 128-255, or 224-255
@@ -716,16 +744,20 @@ static int vmd_resume(struct device *dev)
 static SIMPLE_DEV_PM_OPS(vmd_dev_pm_ops, vmd_suspend, vmd_resume);
 
 static const struct pci_device_id vmd_ids[] = {
-   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D),},
+   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D),
+   .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP,},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0),
.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW |
VMD_FEAT_HAS_BUS_RESTRICTIONS,},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
-   .driver_data = VMD_FEAT_HAS_BUS_RESTRICTIONS,},
+   .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
+   VMD_FEAT_HAS_BUS_RESTRICTIONS,},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4c3d),
-   .driver_data = VMD_FEAT_HAS_BUS_RESTRICTIONS,},
+   .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
+   VMD_FEAT_HAS_BUS_RESTRICTIONS,},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
-   .driver_data = VMD_FEAT_HAS_BUS_RESTRICTIONS,},
+   .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
+   VMD_FEAT_HAS_BUS_RESTRICTIONS,},
{0,}
 };
 MODULE_DEVICE_TABLE(pci, vmd_ids);
-- 
1.8.3.1




[PATCH v3 0/2] VMD endpoint passthrough support

2020-05-27 Thread Jon Derrick
This set contains 2 patches for Linux and 1 for QEMU. VMD device
8086:28C0 contains information in registers to assist with direct
assignment passthrough. Several other VMD devices don't have this
information, but hypervisors can easily provide the guest with this
information through various means.

The QEMU patch provides the information in an emulated vendor-specific
PCI capability. Existing VMD devices don't conflict with the offset
chosen for the capability.

The Linux patch allows guest kernels to use the passthrough information
emulated by the QEMU patch, by matching against the vendor-specific PCI
capability if it exists.

V2 Ref:
https://lore.kernel.org/linux-pci/20200511190129.9313-1-jonathan.derr...@intel.com/

Changes from v2:
Uses vendor-specific PCI capability rather than emulating the 28C0
MEMBAR/VMLOCK modes.

Changes from v1:
v1 changed the VMD Subsystem ID to QEMU's so that the guest driver could
match against it. This was unnecessary as the VMLOCK register and shadow
membar registers could be safely emulated. Future VMDs will be aligned
on these register bits.

Jon Derrick (2):
  PCI: vmd: Filter resource type bits from shadow register
  PCI: vmd: Use Shadow MEMBAR registers for QEMU/KVM guests

 drivers/pci/controller/vmd.c | 50 +---
 1 file changed, 42 insertions(+), 8 deletions(-)

-- 
1.8.3.1




[PATCH v3 FOR QEMU v3] hw/vfio: Add VMD Passthrough Quirk

2020-05-27 Thread Jon Derrick
The VMD endpoint provides a real PCIe domain to the guest, including
bridges and endpoints. Because the VMD domain is enumerated by the guest
kernel, the guest kernel will assign Guest Physical Addresses to the
downstream endpoint BARs and bridge windows.

When the guest kernel performs MMIO to VMD sub-devices, MMU will
translate from the guest address space to the physical address space.
Because the bridges have been programmed with guest addresses, the
bridges will reject the transaction containing physical addresses.

VMD device 28C0 natively assists passthrough by providing the Host
Physical Address in shadow registers accessible to the guest for bridge
window assignment. The shadow registers are valid if bit 1 is set in VMD
VMLOCK config register 0x70.

In order to support existing VMDs, this quirk provides the shadow
registers in a vendor-specific PCI capability to the vfio-passthrough
device for all VMD device ids which don't natively assist with
passthrough. The Linux VMD driver is updated to check for this new
vendor-specific capability.
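The 24-byte capability image the quirk assembles at config offset 0xE8 can be sketched as follows (a rough Python illustration of the layout in the code below; the BAR values are hypothetical):

```python
import struct

PCI_CAP_ID_VNDR = 0x09

def build_vmd_shadow_cap(membar_phys, next_cap=0x00):
    """Assemble the 24-byte vendor-specific capability: cap ID, next
    pointer, length (24), version (1), the 'SHDW' signature dword, then
    the 16 raw bytes read from the host device's BAR2..BAR5 registers."""
    assert len(membar_phys) == 16
    header = struct.pack("<BBBBI", PCI_CAP_ID_VNDR, next_cap, 24, 1, 0x53484457)
    return header + membar_phys

cap = build_vmd_shadow_cap(struct.pack("<QQ", 0xF8000000, 0xF8020000))
```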

Signed-off-by: Jon Derrick 
---
 hw/vfio/pci-quirks.c | 84 +---
 1 file changed, 72 insertions(+), 12 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 3bd05fed12..6b8c1edfd5 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1567,18 +1567,6 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice 
*vdev, Error **errp)
 return 0;
 }
 
-int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
-{
-int ret;
-
-ret = vfio_add_nv_gpudirect_cap(vdev, errp);
-if (ret) {
-return ret;
-}
-
-return 0;
-}
-
 static void vfio_pci_nvlink2_get_tgt(Object *obj, Visitor *v,
  const char *name,
  void *opaque, Error **errp)
@@ -1709,3 +1697,75 @@ free_exit:
 
 return ret;
 }
+
+/*
+ * The VMD endpoint provides a real PCIe domain to the guest and the guest
+ * kernel performs enumeration of the VMD sub-device domain. Guest transactions
+ * to VMD sub-devices go through MMU translation from guest addresses to
+ * physical addresses. When MMIO goes to an endpoint after being translated to
+ * physical addresses, the bridge rejects the transaction because the window
+ * has been programmed with guest addresses.
+ *
+ * VMD can use the Host Physical Address in order to correctly program the
+ * bridge windows in its PCIe domain. VMD device 28C0 has HPA shadow registers
+ * located at offset 0x2000 in MEMBAR2 (BAR 4). This quirk provides the HPA
+ * shadow registers in a vendor-specific capability register for devices
+ * without native support. The position of 0xE8-0xFF is in the reserved range
+ * of the VMD device capability space following the Power Management
+ * Capability.
+ */
+#define VMD_SHADOW_CAP_VER 1
+#define VMD_SHADOW_CAP_LEN 24
+static int vfio_add_vmd_shadow_cap(VFIOPCIDevice *vdev, Error **errp)
+{
+uint8_t membar_phys[16];
+int ret, pos = 0xE8;
+
+if (!(vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x201D) ||
+  vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x467F) ||
+  vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x4C3D) ||
+  vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x9A0B))) {
+return 0;
+}
+
+ret = pread(vdev->vbasedev.fd, membar_phys, 16,
+vdev->config_offset + PCI_BASE_ADDRESS_2);
+if (ret != 16) {
+error_report("VMD %s cannot read MEMBARs (%d)",
+ vdev->vbasedev.name, ret);
+return -EFAULT;
+}
+
+ret = pci_add_capability(&vdev->pdev, PCI_CAP_ID_VNDR, pos,
+ VMD_SHADOW_CAP_LEN, errp);
+if (ret < 0) {
+error_prepend(errp, "Failed to add VMD MEMBAR Shadow cap: ");
+return ret;
+}
+
+memset(vdev->emulated_config_bits + pos, 0xFF, VMD_SHADOW_CAP_LEN);
+pos += PCI_CAP_FLAGS;
+pci_set_byte(vdev->pdev.config + pos++, VMD_SHADOW_CAP_LEN);
+pci_set_byte(vdev->pdev.config + pos++, VMD_SHADOW_CAP_VER);
+pci_set_long(vdev->pdev.config + pos, 0x53484457); /* SHDW */
+memcpy(vdev->pdev.config + pos + 4, membar_phys, 16);
+
+return 0;
+}
+
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
+{
+int ret;
+
+ret = vfio_add_nv_gpudirect_cap(vdev, errp);
+if (ret) {
+return ret;
+}
+
+ret = vfio_add_vmd_shadow_cap(vdev, errp);
+if (ret) {
+return ret;
+}
+
+return 0;
+}
-- 
2.18.1
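For reference, the capability layout the quirk emits (id/next/length/version, then the "SHDW" signature, then 16 bytes of shadowed MEMBAR values) can be decoded from the guest side along these lines. This is a standalone sketch over a mock config space — `find_shdw_cap` and the test fixture are illustrative, not the Linux VMD driver's code — and it assumes a little-endian host for the memcpy-based reads:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCI_CAP_ID_VNDR  0x09
#define PCI_CAP_LIST_PTR 0x34

/*
 * Walk the PCI capability chain looking for the vendor-specific MEMBAR
 * shadow capability laid out by the quirk above:
 *   cfg[pos+0] = cap id, cfg[pos+1] = next pointer, cfg[pos+2] = length,
 *   cfg[pos+3] = version, cfg[pos+4..7] = "SHDW" signature (0x53484457,
 *   little-endian in config space), cfg[pos+8..23] = shadowed MEMBARs.
 */
static int find_shdw_cap(const uint8_t *cfg, uint64_t membar[2])
{
    uint8_t pos = cfg[PCI_CAP_LIST_PTR];

    while (pos) {
        if (cfg[pos] == PCI_CAP_ID_VNDR && cfg[pos + 3] == 1 /* version */) {
            uint32_t sig;
            memcpy(&sig, cfg + pos + 4, 4);
            if (sig == 0x53484457) { /* "SHDW" */
                memcpy(membar, cfg + pos + 8, 16);
                return 0;
            }
        }
        pos = cfg[pos + 1]; /* follow the chain */
    }
    return -1; /* no shadow capability: fall back to guest addresses */
}
```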




Re: [PATCH v4 10/10] target/riscv: Use a smaller guess size for no-MMU PMP

2020-05-27 Thread Bin Meng
On Thu, May 28, 2020 at 12:59 AM Alistair Francis
 wrote:
>
> Signed-off-by: Alistair Francis 
> ---
>  target/riscv/pmp.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
>

Reviewed-by: Bin Meng 



Re: [PATCH v4 03/10] target/riscv: Disable the MMU correctly

2020-05-27 Thread Bin Meng
On Thu, May 28, 2020 at 12:58 AM Alistair Francis
 wrote:
>
> Previously if we didn't enable the MMU it would be enabled in the
> realize() function anyway. Let's ensure that if we don't want the MMU we
> disable it. We also don't need to enable the MMU as it will be enabled
> in realize() by default.
>

I think we should do the same for the PMP feature as the logic is the
same as MMU: PMP is always enabled in the realize() function

> Signed-off-by: Alistair Francis 
> ---
>  target/riscv/cpu.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 5eb3c02735..8deba3d16d 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -142,7 +142,6 @@ static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
>  set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
>  set_priv_version(env, PRIV_VERSION_1_09_1);
>  set_resetvec(env, DEFAULT_RSTVEC);
> -set_feature(env, RISCV_FEATURE_MMU);
>  set_feature(env, RISCV_FEATURE_PMP);
>  }
>
> @@ -152,7 +151,6 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
>  set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
> -set_feature(env, RISCV_FEATURE_MMU);
>  set_feature(env, RISCV_FEATURE_PMP);
>  }
>
> @@ -163,6 +161,7 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
>  set_feature(env, RISCV_FEATURE_PMP);
> +qdev_prop_set_bit(DEVICE(obj), "mmu", false);
>  }
>
>  static void rv32imafcu_nommu_cpu_init(Object *obj)
> @@ -172,6 +171,7 @@ static void rv32imafcu_nommu_cpu_init(Object *obj)
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
>  set_feature(env, RISCV_FEATURE_PMP);
> +qdev_prop_set_bit(DEVICE(obj), "mmu", false);
>  }
>
>  #elif defined(TARGET_RISCV64)
> @@ -190,7 +190,6 @@ static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
>  set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
>  set_priv_version(env, PRIV_VERSION_1_09_1);
>  set_resetvec(env, DEFAULT_RSTVEC);
> -set_feature(env, RISCV_FEATURE_MMU);
>  set_feature(env, RISCV_FEATURE_PMP);
>  }
>
> @@ -200,7 +199,6 @@ static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
>  set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
> -set_feature(env, RISCV_FEATURE_MMU);
>  set_feature(env, RISCV_FEATURE_PMP);
>  }
>
> @@ -211,6 +209,7 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
>  set_feature(env, RISCV_FEATURE_PMP);
> +qdev_prop_set_bit(DEVICE(obj), "mmu", false);
>  }
>

Regards,
Bin



Re: [PATCH v3 4/4] hw/riscv: virt: Allow creating multiple sockets

2020-05-27 Thread Atish Patra
On Wed, May 27, 2020 at 5:14 AM Anup Patel  wrote:
>
> We extend RISC-V virt machine to allow creating a multi-socket machine.
> Each RISC-V virt machine socket is a set of HARTs, a CLINT instance,
> and a PLIC instance. Other peripherals are shared between all RISC-V
> virt machine sockets. We also update RISC-V virt machine device tree
> to treat each socket as a NUMA node.
>

But this patch doesn't enable numa options and doesn't allow configuring the
virt machine as a numa machine (with different memory nodes). IIRC, neither
OpenSBI nor the plic patches require the "numa-node-id" either.
I think it is better to add the "numa-node-id" along with numa support
for the virt machine.

> By default, multi-socket support is disabled for RISC-V virt machine.
> To enable multi-socket support, users can pass "multi-socket=on" option
> in machine name.
>
> The number of sockets in RISC-V virt machine can be specified using
> the "sockets=" sub-option of QEMU "-smp" command-line option.
>
> Currently, we only allow creating up to a maximum of 4 sockets, but this
> limit can be changed in the future.
>
> Signed-off-by: Anup Patel 
> ---
>  hw/riscv/virt.c | 523 +++-
>  include/hw/riscv/virt.h |  11 +-
>  2 files changed, 311 insertions(+), 223 deletions(-)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 421815081d..40a6185b1c 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -60,7 +60,7 @@ static const struct MemmapEntry {
>  [VIRT_TEST] ={   0x10,0x1000 },
>  [VIRT_RTC] = {   0x101000,0x1000 },
>  [VIRT_CLINT] =   {  0x200,   0x1 },
> -[VIRT_PLIC] ={  0xc00, 0x400 },
> +[VIRT_PLIC] ={  0xc00, VIRT_PLIC_SIZE(VIRT_CPUS_MAX * 2) },
>  [VIRT_UART0] =   { 0x1000, 0x100 },
>  [VIRT_VIRTIO] =  { 0x10001000,0x1000 },
>  [VIRT_FLASH] =   { 0x2000, 0x400 },
> @@ -182,10 +182,15 @@ static void create_fdt(RISCVVirtState *s, const struct 
> MemmapEntry *memmap,
>  uint64_t mem_size, const char *cmdline)
>  {
>  void *fdt;
> -int cpu, i;
> -uint32_t *cells;
> -char *nodename;
> -uint32_t plic_phandle, test_phandle, phandle = 1;
> +int i, cpu, socket;
> +uint32_t *clint_cells, *plic_cells;
> +unsigned long clint_addr, plic_addr;
> +uint32_t plic_phandle[VIRT_SOCKETS_MAX];
> +uint32_t cpu_phandle, intc_phandle, test_phandle;
> +uint32_t phandle = 1, plic_mmio_phandle = 1;
> +uint32_t plic_pcie_phandle = 1, plic_virtio_phandle = 1;
> +char *name, *cpu_name, *core_name, *intc_name;
> +char *clint_name, *plic_name, *clust_name;
>  hwaddr flashsize = virt_memmap[VIRT_FLASH].size / 2;
>  hwaddr flashbase = virt_memmap[VIRT_FLASH].base;
>
> @@ -206,231 +211,235 @@ static void create_fdt(RISCVVirtState *s, const 
> struct MemmapEntry *memmap,
>  qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x2);
>  qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x2);
>
> -nodename = g_strdup_printf("/memory@%lx",
> +name = g_strdup_printf("/memory@%lx",
>  (long)memmap[VIRT_DRAM].base);
> -qemu_fdt_add_subnode(fdt, nodename);
> -qemu_fdt_setprop_cells(fdt, nodename, "reg",
> +qemu_fdt_add_subnode(fdt, name);
> +qemu_fdt_setprop_cells(fdt, name, "reg",
>  memmap[VIRT_DRAM].base >> 32, memmap[VIRT_DRAM].base,
>  mem_size >> 32, mem_size);
> -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -g_free(nodename);
> +qemu_fdt_setprop_string(fdt, name, "device_type", "memory");
> +g_free(name);
>

As all other

>  qemu_fdt_add_subnode(fdt, "/cpus");
>  qemu_fdt_setprop_cell(fdt, "/cpus", "timebase-frequency",
>SIFIVE_CLINT_TIMEBASE_FREQ);
>  qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
>  qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
> +qemu_fdt_add_subnode(fdt, "/cpus/cpu-map");
> +
> +for (socket = (s->num_socs - 1); socket >= 0; socket--) {
> +clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
> +qemu_fdt_add_subnode(fdt, clust_name);
> +
> +plic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
> +clint_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
>
> -for (cpu = s->soc.num_harts - 1; cpu >= 0; cpu--) {
> -int cpu_phandle = phandle++;
> -int intc_phandle;
> -nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
> -char *intc = g_strdup_printf("/cpus/cpu@%d/interrupt-controller", 
> cpu);
> -char *isa = riscv_isa_string(&s->soc.harts[cpu]);
> -qemu_fdt_add_subnode(fdt, nodename);
> +for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
> +cpu_phandle = phandle++;
> +
> +cpu_name = g_strdup_printf("/cpus/cpu@%d",
> +s->soc[socket].hartid_base + cpu);

Re: [PATCH 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT

2020-05-27 Thread Verma, Vishal L
On Thu, 2020-05-21 at 17:16 +0200, Igor Mammedov wrote:

Hi Igor, Thanks for the review.

[..]
> >  
> > @@ -2429,6 +2430,25 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
> > MachineState *machine)
> >MEM_AFFINITY_ENABLED);
> >  }
> >  }
> > +
> > +if (machine->nvdimms_state->is_enabled) {
> > +GSList *device_list = nvdimm_get_device_list();
> > +
> > +for (; device_list; device_list = device_list->next) {
> > +DeviceState *dev = device_list->data;
> > +int node = object_property_get_int(OBJECT(dev), 
> > PC_DIMM_NODE_PROP,
> > +   NULL);
> > +uint64_t addr = object_property_get_uint(OBJECT(dev),
> > + PC_DIMM_ADDR_PROP, 
> > NULL);
> > +uint64_t size = object_property_get_uint(OBJECT(dev),
> > + PC_DIMM_SIZE_PROP, 
> > NULL);
> > +
> suggest to use error_abort in getters

Yep, fixed in v2.

> 
> > +numamem = acpi_data_push(table_data, sizeof *numamem);
> > +build_srat_memory(numamem, addr, size, node,
> > +  MEM_AFFINITY_ENABLED | 
> > MEM_AFFINITY_NON_VOLATILE);
> > +}
> who is in charge of freeing device_list ?

Thanks, also fixed in v2.

> 
> > +}
> 
> There is ARM version of build_srat(),
> I suggest to put this NVDIMM specific part in helper function within 
> hw/acpi/nvdimm.c
> and use it from both build_srat() functions.

Splitting the work out into a helper function in nvdimm.c does make
sense, and I've done that. However, looking at the arm version of
build_srat and generally in virt-acpi-build.c, I don't see any NVDIMM
support, so unless I'm mistaken, it wouldn't make sense to actually call
this from the arm version of build_srat.

I'll send a v2 with the above fixes.

> 
> > +
> >  slots = (table_data->len - numa_start) / sizeof *numamem;
> >  for (; slots < pcms->numa_nodes + 2; slots++) {
> >  numamem = acpi_data_push(table_data, sizeof *numamem);
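As background for the helper being discussed, the table entry involved is the ACPI SRAT Memory Affinity structure (type 1, 40 bytes). A standalone sketch of how such an entry is packed — illustrative only, not QEMU's actual `build_srat_memory()` (which fills an in-memory struct), and assuming a little-endian host for the memcpy-based stores:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_AFFINITY_ENABLED      (1u << 0)
#define MEM_AFFINITY_NON_VOLATILE (1u << 2)

/*
 * Pack one ACPI SRAT Memory Affinity structure (type 1, 40 bytes):
 * offset 0: type, 1: length, 2-5: proximity domain (NUMA node),
 * 8-15: base address, 16-23: range length, 28-31: flags; remaining
 * bytes are reserved and stay zero.
 */
static void pack_srat_memory(uint8_t *p, uint64_t base, uint64_t len,
                             uint32_t node, uint32_t flags)
{
    memset(p, 0, 40);
    p[0] = 1;   /* SRAT structure type: memory affinity */
    p[1] = 40;  /* structure length */
    memcpy(p + 2, &node, 4);
    memcpy(p + 8, &base, 8);
    memcpy(p + 16, &len, 8);
    memcpy(p + 28, &flags, 4);
}
```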


Re: GDB get wrong debug infos on TI DSP architecture extension

2020-05-27 Thread casmac
Hi,
   Thank you for forwarding my question to developers and sharing the 
C6x implementation.
   Perhaps I should follow up with another problem I encountered. The 
scenario is that the emulator keeps running even though the program it
emulates has already exited, and it keeps fetching instructions that are
all zero (0x).

   It looks to me that in function cpu_exec(CPUState *cpu), the 
following loop never terminates:
   while (!cpu_handle_exception(cpu, &ret)) {
       TranslationBlock *last_tb = NULL;
       int tb_exit = 0;
       while (!cpu_handle_interrupt(cpu, &last_tb)) { ... }
   Is it because cpu->exit_request remains 0 ?

   At what point should we make cpu->exit_request=1 ?
   Thanks again!!


regards
xiaolei
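For what it's worth, the loop quoted above only exits once cpu_handle_exception() sees an exception index set; cpu->exit_request is normally raised from outside the execution loop via cpu_exit() (e.g. by a timer or the gdbstub). A target usually breaks out of the all-zeroes case by raising an exception from its decoder rather than setting exit_request. A toy model of that pattern — ToyCPU and its helpers are made up for illustration, not QEMU API:

```c
#include <assert.h>
#include <stdint.h>

enum { EXCP_NONE = -1, EXCP_ILLEGAL = 1 };

typedef struct {
    const uint32_t *mem;      /* "guest" memory, one word per instruction */
    unsigned pc, size;
    int exception_index;
} ToyCPU;

/* Decode stand-in: an all-zero word (or running off the end of memory)
 * is treated as an illegal instruction and raises an exception instead
 * of letting the loop spin forever. */
static void toy_step(ToyCPU *cpu)
{
    if (cpu->pc >= cpu->size || cpu->mem[cpu->pc] == 0) {
        cpu->exception_index = EXCP_ILLEGAL;
        return;
    }
    cpu->pc++;  /* "execute" the instruction */
}

/* Analogue of the cpu_exec() inner loop: runs until an exception is set. */
static int toy_exec(ToyCPU *cpu)
{
    while (cpu->exception_index == EXCP_NONE) {
        toy_step(cpu);
    }
    return cpu->exception_index;
}
```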



-- Original --
From: "Philippe Mathieu-Daudé"
> FYI you can find the TI TMS320C6x target implemented here:
> https://github.com/philmd/qemu/releases/tag/target-c6x-2.4

I started rebasing it to QEMU 4.2 but then got distracted.

> Now, we are
> adding GDB debugging features.
>    We have done the following, but not sure we are on the right 
track :
>    - add a xml description file in gdb-xml, without 
understanding the
> purpose of the file, why some architectures don't provide such xml file?
>    - add ***_cpu_gdb_read_register(), 
***_cpu_gdb_write_register();
>    - added  dsp_cpu_get_phys_page_attrs_debug(), but 
uncertain about
> what to return
>      dsp_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr 
addr,
> MemTxAttrs *attrs)
>      {
>         return addr & TARGET_PAGE_MASK;  
>      }
>   
>    We run QEMU with the these arguments
>    qemu-system-dsp ... -kernel filename.out -S -s
>   
>    It turns out that gdb reads incorrect register values, and 
complains
> : "warning: Target-supplied registers are not supported by the current
> architecture".
>   
>    Something is missing here, or we do it in a wrong way.  
Any advise
> would be helpful to us.
>   
>    Thanks.
>    
> xiaolei
> 
>    - ti_dsp.xml  -
>   
>    

Re: [PATCH v5 0/7] coroutines: generate wrapper code

2020-05-27 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200527203733.16129-1-vsement...@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'
  CC  /tmp/qemu-test/build/slirp/src/arp_table.o
make: *** [block/block-gen.c] Error 1
make: *** Deleting file `block/block-gen.c'
  CC  /tmp/qemu-test/build/slirp/src/util.o
make: *** Waiting for unfinished jobs
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=e9801db6c81946b8b7007cd3179a6d18', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-eja8hzsv/src/docker-src.2020-05-27-18.23.44.9751:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=e9801db6c81946b8b7007cd3179a6d18
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-eja8hzsv/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m1.393s
user    0m7.847s


The full log is available at
http://patchew.org/logs/20200527203733.16129-1-vsement...@virtuozzo.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v5 1/7] block: return error-code from bdrv_invalidate_cache

2020-05-27 Thread Eric Blake

On 5/27/20 3:37 PM, Vladimir Sementsov-Ogievskiy wrote:

This is the only coroutine wrapper from block.c and block/io.c which
doesn't return value, so let's convert it to the common behavior, to


s/value/a value/


simplify moving to generated coroutine wrappers in further commit.


s/in/in a/



Also, bdrv_invalidate_cache is a void function, returning error only
through **errp parameter, which considered to be bad practice, as it


s/which/which is/


forces callers to define and propagate local_err variable, so
conversion is good anyway.

This patch leaves convertion of .bdrv_co_invalidate_cache() driver


s/convertion/the conversion/


callbacks and bdrv_invalidate_cache_all() for another-day refactoring.


s/another-day refactoring/another day/



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block.h |  2 +-
  block.c   | 32 ++--
  2 files changed, 19 insertions(+), 15 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v5 0/7] coroutines: generate wrapper code

2020-05-27 Thread Eric Blake

On 5/27/20 4:46 PM, no-re...@patchew.org wrote:

Patchew URL: 
https://patchew.org/QEMU/20200527203733.16129-1-vsement...@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

TypeError: __init__() got an unexpected keyword argument 'capture_output'
   CC  /tmp/qemu-test/build/slirp/src/bootp.o
   GEN ui/input-keymap-usb-to-qcode.c
make: *** [block/block-gen.c] Error 1
make: *** Deleting file `block/block-gen.c'
make: *** Waiting for unfinished jobs
   GEN ui/input-keymap-win32-to-qcode.c


The more interesting part of the failure:

  File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 173, in <module>
    print(gen_wrappers_file(sys.stdin.read()))
  File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 169, in gen_wrappers_file
    return prettify(res)  # prettify to wrap long lines
  File "/tmp/qemu-test/src/scripts/coroutine-wrapper.py", line 40, in prettify
    encoding='utf-8', input=code, capture_output=True)
  File "/usr/lib64/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'

which indeed looks like a bug in the patch.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [Bug 1881004] [NEW] fpu/softfloat.c: error: bitwise negation of a boolean expression

2020-05-27 Thread Eric Blake

On 5/27/20 4:40 PM, Peter Maydell wrote:

On Wed, 27 May 2020 at 20:21, Philippe Mathieu-Daudé
<1881...@bugs.launchpad.net> wrote:


Public bug reported:

Last time I built QEMU was on commit d5c75ec500d96f1d93447f990cd5a4ef5ba27fae,
I just pulled to fea8f3ed739536fca027cf56af7f5576f37ef9cd and now get:

   CC  lm32-softmmu/fpu/softfloat.o
fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
 absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         !



"(x & y)" is not a boolean expression, so we should report this to clang
as a bug (I assume what they actually are trying to complain about is
bitwise AND with a boolean expression).


We have:

uint64_t &= ~ ( ( ( int8_t ^ int ) == int ) & bool )

which is

uint64_t &= ~ ( bool & bool )

which is then

uint64_t &= ~ ( int )

resulting in one of:

uint64_t &= 0xffffffffffffffff
uint64_t &= 0xfffffffffffffffe

It is a very odd way of stating that 'if this condition is true, mask 
out the least-significant-bit'.  In general, 'bool & bool' is used where 
the side-effect-skipping 'bool && bool' is inappropriate; I'm a bit 
surprised that clang is not questioning whether we meant '&&' instead of 
'&' (the two operators give the same effect in this case).


You are right that clang is fishy for calling it logical negation of a 
bool, when it is really logical negation of an int, but we are also 
fishy in that we are using bitwise AND of two bools as an int in the 
first place.


Regardless of whether clang changes, would it be better to write the 
code as:


if (((roundBits ^ 0x40) == 0) && roundNearestEven) {
absZ &= ~1;
}

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
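Eric's promotion argument is easy to check in standalone C. Both spellings below clear the low bit exactly when the rounding condition holds — a sketch for illustration, extracted from the softfloat idiom under discussion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original softfloat idiom: (bool & bool) promotes to int 0 or 1, whose
 * bitwise negation is ~0 (keep every bit) or ~1 (clear the LSB). */
static uint64_t round_mask_original(uint64_t absZ, int8_t roundBits,
                                    bool roundNearestEven)
{
    absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
    return absZ;
}

/* The equivalent, clang-friendly spelling suggested above. */
static uint64_t round_mask_if(uint64_t absZ, int8_t roundBits,
                              bool roundNearestEven)
{
    if (((roundBits ^ 0x40) == 0) && roundNearestEven) {
        absZ &= ~1;
    }
    return absZ;
}
```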




Re: [PATCH v5 0/7] coroutines: generate wrapper code

2020-05-27 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200527203733.16129-1-vsement...@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

TypeError: __init__() got an unexpected keyword argument 'capture_output'
  CC  /tmp/qemu-test/build/slirp/src/bootp.o
  GEN ui/input-keymap-usb-to-qcode.c
make: *** [block/block-gen.c] Error 1
make: *** Deleting file `block/block-gen.c'
make: *** Waiting for unfinished jobs
  GEN ui/input-keymap-win32-to-qcode.c
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=7a4c9c87bb7b4b61ae99142b8ccd4c12', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-vw5w7k6z/src/docker-src.2020-05-27-17.44.40.573:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=7a4c9c87bb7b4b61ae99142b8ccd4c12
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-vw5w7k6z/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    1m57.906s
user    0m8.475s


The full log is available at
http://patchew.org/logs/20200527203733.16129-1-vsement...@virtuozzo.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

RE: GDB get wrong debug infos on TI DSP architecture extension

2020-05-27 Thread Taylor Simpson
For Hexagon, we have LLDB, not GDB.  I tinkered with getting LLDB to talk to 
qemu but never got it fully functional.  I'm planning to get back to it at some 
point.

With that caveat, I'll try to answer Xiaolei's questions
- The xml file is returned from qemu to gdb in response to the 
Xfer:features:read command.  Providing it should be optional unless your 
debugger requires the target to support that command.  If the target doesn't 
support this command, the debugger will generally use the qRegisterInfo command.
- I don't think get_phys_page_debug is used for gdb debugging.  Which mode are 
you implementing?  In linux-user mode, it's not needed.  In softmmu mode, it is 
used when you use "-d in_asm" to find the memory to disassemble.  If you have 
an MMU, you need to map the virtual address passed in to the physical address 
or return -1 if there is no mapping.  If there isn't a MMU, return the virtual 
address.
- The error you are getting from gdb sounds like a mismatch between the version 
of the processor that qemu is emulating and gdb thinks it is debugging.  In 
other words, qemu thinks there is a register that gdb doesn't know about.  To see 
what's going on, try adding "-d 
trace:gdbstub_io_command,trace:gdbstub_io_reply" to your qemu command line.  
This will show you the commands from gdb and qemu's response.  Look for the 
commands I described above and see if qemu is giving a register that doesn't 
exist.

HTH,
Taylor
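To illustrate the softmmu contract Taylor describes — return the translated physical address, or -1 when there is no mapping, or just the page-aligned virtual address when there is no MMU — here is a toy single-level model. The page-table shape and names are hypothetical, not QEMU's API:

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK (~(uint32_t)((1u << TARGET_PAGE_BITS) - 1))

/*
 * Toy model of the get_phys_page_debug contract: with no MMU, return the
 * page-aligned virtual address; with a (hypothetical) one-level page table
 * of physical page numbers, translate or return -1 when unmapped.
 */
static int64_t toy_get_phys_page_debug(const uint32_t *pt, unsigned entries,
                                       uint32_t addr, int mmu_enabled)
{
    if (!mmu_enabled) {
        return addr & TARGET_PAGE_MASK;
    }
    unsigned idx = addr >> TARGET_PAGE_BITS;
    if (idx >= entries || pt[idx] == UINT32_MAX) {
        return -1; /* no mapping: the debugger's memory access fails cleanly */
    }
    return (int64_t)pt[idx] << TARGET_PAGE_BITS;
}
```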


> -Original Message-
> From: Philippe Mathieu-Daudé  On
> Behalf Of Philippe Mathieu-Daudé
> Sent: Wednesday, May 27, 2020 2:20 AM
> To: casmac <1482995...@qq.com>; qemu-devel <qemu-de...@nongnu.org>
> Cc: Luc Michel ; Alex Bennée
> ; Taylor Simpson 
> Subject: Re: GDB get wrong debug infos on TI DSP architecture extension
>
>
> Hi Xiaolei,
>
> Cc'ing more developers who might answer you.
>
> On 5/27/20 8:48 AM, casmac wrote:
> > Hi all,
> >I am working on a TI DSP architecture extension for QEMU.
>
> FYI you can find the TI TMS320C6x target implemented here:
> https://github.com/philmd/qemu/releases/tag/target-c6x-2.4
>
> I started rebasing it to QEMU 4.2 but then got distracted.
>
> > Now, we are
> > adding GDB debugging features.
> >We have done the following, but not sure we are on the right track :
> >- add a xml description file in gdb-xml, without understanding the
> > purpose of the file, why some architectures don't provide such xml file?
> >- add ***_cpu_gdb_read_register(), ***_cpu_gdb_write_register();
> >- added  dsp_cpu_get_phys_page_attrs_debug(), but uncertain about
> > what to return
> >  dsp_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
> > MemTxAttrs *attrs)
> >  {
> > return addr & TARGET_PAGE_MASK;
> >  }
> >
> >We run QEMU with the these arguments
> >qemu-system-dsp ... -kernel filename.out -S -s
> >
> >It turns out that gdb reads incorrect register values, and complains
> > : "warning: Target-supplied registers are not supported by the current
> > architecture".
> >
> >Something is missing here, or we do it in a wrong way.  Any advise
> > would be helpful to us.
> >
> >Thanks.
> >
> > xiaolei
> >
> >- ti_dsp.xml  -
> >
> >
> > 
> > 
> > 
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> > 
> >
>




Re: [Bug 1881004] [NEW] fpu/softfloat.c: error: bitwise negation of a boolean expression

2020-05-27 Thread Peter Maydell
On Wed, 27 May 2020 at 20:21, Philippe Mathieu-Daudé
<1881...@bugs.launchpad.net> wrote:
>
> Public bug reported:
>
> Last time I built QEMU was on commit d5c75ec500d96f1d93447f990cd5a4ef5ba27fae,
> I just pulled to fea8f3ed739536fca027cf56af7f5576f37ef9cd and now get:
>
>   CC  lm32-softmmu/fpu/softfloat.o
> fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did 
> you mean logical negation? [-Werror,-Wbool-operation]
> absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
>         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>         !


"(x & y)" is not a boolean expression, so we should report this to clang
as a bug (I assume what they actually are trying to complain about is
bitwise AND with a boolean expression).

thanks
-- PMM



[PULL v2 00/11] bitmaps patches for 2020-05-26

2020-05-27 Thread Eric Blake
The following changes since commit 06539ebc76b8625587aa78d646a9d8d5fddf84f3:

  Merge remote-tracking branch 
'remotes/philmd-gitlab/tags/mips-hw-next-20200526' into staging (2020-05-26 
20:25:06 +0100)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-bitmaps-2020-05-26-v2

for you to fetch changes up to 958796e47d3af10ece329294a7bb41d5853667db:

  iotests: Add test 291 to for qemu-img bitmap coverage (2020-05-27 16:19:59 
-0500)

v2: fix iotest 190 to not be as sensitive to different sparseness of
qcow2 file on various filesystems, such as FreeBSD (sending only the
changed patch)


bitmaps patches for 2020-05-26

- fix non-blockdev migration of bitmaps when mirror job is in use
- add bitmap sizing to 'qemu-img measure'
- add 'qemu-img convert --bitmaps'


Eric Blake (5):
  iotests: Fix test 178
  qcow2: Expose bitmaps' size during measure
  qemu-img: Factor out code for merging bitmaps
  qemu-img: Add convert --bitmaps option
  iotests: Add test 291 to for qemu-img bitmap coverage

Vladimir Sementsov-Ogievskiy (6):
  migration: refactor init_dirty_bitmap_migration
  block/dirty-bitmap: add bdrv_has_named_bitmaps helper
  migration: fix bitmaps pre-blockdev migration with mirror job
  iotests: 194: test also migration of dirty bitmap
  migration: add_bitmaps_to_list: check disk name once
  migration: forbid bitmap migration by generated node-name

 docs/tools/qemu-img.rst  |  13 +++-
 qapi/block-core.json |  16 +++--
 block/qcow2.h|   2 +
 include/block/dirty-bitmap.h |   1 +
 block/crypto.c   |   2 +-
 block/dirty-bitmap.c |  13 
 block/qcow2-bitmap.c |  36 ++
 block/qcow2.c|  14 +++-
 block/raw-format.c   |   2 +-
 migration/block-dirty-bitmap.c   | 142 ---
 qemu-img.c   | 107 -
 qemu-img-cmds.hx |   4 +-
 tests/qemu-iotests/178.out.qcow2 |  18 -
 tests/qemu-iotests/178.out.raw   |   2 +-
 tests/qemu-iotests/190   |  47 -
 tests/qemu-iotests/190.out   |  27 +++-
 tests/qemu-iotests/194   |  14 ++--
 tests/qemu-iotests/194.out   |   6 ++
 tests/qemu-iotests/291   | 112 ++
 tests/qemu-iotests/291.out   |  80 ++
 tests/qemu-iotests/group |   1 +
 21 files changed, 582 insertions(+), 77 deletions(-)
 create mode 100755 tests/qemu-iotests/291
 create mode 100644 tests/qemu-iotests/291.out

-- 
2.26.2




[PULL v2 08/11] qcow2: Expose bitmaps' size during measure

2020-05-27 Thread Eric Blake
It's useful to know how much space can be occupied by qcow2 persistent
bitmaps, even though such metadata is unrelated to the guest-visible
data.  Report this value as an additional QMP field, present when the
image being measured and the output format both support bitmaps.
Update iotests 178 and 190 for the updated output, as well as new
coverage in 190 demonstrating non-zero values made possible with the
recently-added qemu-img bitmap command (see 3b51ab4b).

The new 'bitmaps size:' field is displayed automatically as part of
'qemu-img measure' any time it is present in QMP (that is, any time
both the source image being measured and destination format support
bitmaps, even if the measurement is 0 because there are no bitmaps
present).  If the field is absent, it means that no bitmaps can be
copied (source, destination, or both lack bitmaps, including when
measuring based on size rather than on a source image).  This behavior
is compatible with an upcoming patch adding 'qemu-img convert
--bitmaps': that command will fail in the same situations where this
patch omits the field.

The addition of a new field demonstrates why we should always
zero-initialize qapi C structs; while the qcow2 driver still fully
populates all fields, the raw and crypto drivers had to be tweaked to
avoid uninitialized data.

Consideration was also given towards having a 'qemu-img measure
--bitmaps' which errors out when bitmaps are not possible, and
otherwise sums the bitmaps into the existing allocation totals rather
than displaying as a separate field, as a potential convenience
factor.  But this was ultimately decided to be more complexity than
necessary when the QMP interface was sufficient enough with bitmaps
remaining a separate field.

See also: https://bugzilla.redhat.com/1779904

Reported-by: Nir Soffer 
Signed-off-by: Eric Blake 
Message-Id: <20200521192137.1120211-3-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 docs/tools/qemu-img.rst  |  7 +
 qapi/block-core.json | 16 +++
 block/qcow2.h|  2 ++
 block/crypto.c   |  2 +-
 block/qcow2-bitmap.c | 36 
 block/qcow2.c| 14 --
 block/raw-format.c   |  2 +-
 qemu-img.c   |  3 ++
 tests/qemu-iotests/178.out.qcow2 | 16 +++
 tests/qemu-iotests/190   | 47 ++--
 tests/qemu-iotests/190.out   | 27 +-
 11 files changed, 159 insertions(+), 13 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 38d464ea3f23..320cb52b9f61 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -616,6 +616,7 @@ Command description:

 required size: 524288
 fully allocated size: 1074069504
+bitmaps size: 0

   The ``required size`` is the file size of the new image.  It may be smaller
   than the virtual disk size if the image format supports compact 
representation.
@@ -625,6 +626,12 @@ Command description:
   occupy with the exception of internal snapshots, dirty bitmaps, vmstate data,
   and other advanced image format features.

+  The ``bitmaps size`` is the additional size required in order to
+  copy bitmaps from a source image in addition to the guest-visible
+  data; the line is omitted if either source or destination lacks
+  bitmap support, or 0 if bitmaps are supported but there is nothing
+  to copy.
+
 .. option:: snapshot [--object OBJECTDEF] [--image-opts] [-U] [-q] [-l | -a 
SNAPSHOT | -c SNAPSHOT | -d SNAPSHOT] FILENAME

   List, apply, create or delete snapshots in image *FILENAME*.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6fbacddab2cc..0e1c6a59f228 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -636,18 +636,24 @@
 # efficiently so file size may be smaller than virtual disk size.
 #
 # The values are upper bounds that are guaranteed to fit the new image file.
-# Subsequent modification, such as internal snapshot or bitmap creation, may
-# require additional space and is not covered here.
+# Subsequent modification, such as internal snapshot or further bitmap
+# creation, may require additional space and is not covered here.
 #
-# @required: Size required for a new image file, in bytes.
+# @required: Size required for a new image file, in bytes, when copying just
+#allocated guest-visible contents.
 #
 # @fully-allocated: Image file size, in bytes, once data has been written
-#   to all sectors.
+#   to all sectors, when copying just guest-visible contents.
+#
+# @bitmaps: Additional size required if all the top-level bitmap metadata
+#   in the source image were to be copied to the destination,
+#   present only when source and destination both support
+#   persistent bitmaps. (since 5.1)
 #
 # Since: 2.10
 ##
 { 'struct': 'BlockMeasureInfo',
-  'data': {'required': 'int', 'fully-alloc

Re: [PATCH v6 4/5] block: make size-related BlockConf properties accept size suffixes

2020-05-27 Thread Eric Blake

On 5/27/20 3:53 PM, Roman Kagan wrote:


---
v5 -> v6:
- add prop_size32 instead of going with 64bit


Would it be worth adding prop_size32 as its own patch, before using it here?


I've no strong opinion on this.  Should I better split it out when
respinning?


Patch splitting is an art-form.  But in general, a long series of 
smaller patches each easy to review is going to get accepted into the 
tree faster than a single patch that merges multiple changes into one 
big blob, even if the net diff is identical.  It's rare that someone 
will ask you to merge patches because you split too far, so the real 
tradeoff is whether it will cost you more time to split than what you 
will save the next reviewer (including the maintainer that will merge 
your patches, depending on whether the maintainer also reviews it or 
just trusts my review), if you decide to go with a v7.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Fully operable build post

2020-05-27 Thread Chris Hoy
Hello all,

I am proud to report that my bare-bones QEMU setup is finally working.
After starting earlier this year, I have slowly made progress toward fully
integrating my hardware into a GPU-passthrough VM. I went through many
settings and templates from various sources and finally have something that
works for me as a starting point, and I would be really interested in
writing up my experiences. In my observation, the documentation on the
official page is somewhat unevenly distributed as far as templates go, so
if I could help, I would like to submit my template through the official
forum. Any help or advice with this would be greatly appreciated.

Best Regards


Re: [PATCH v6 4/5] block: make size-related BlockConf properties accept size suffixes

2020-05-27 Thread Roman Kagan
On Wed, May 27, 2020 at 09:50:39AM -0500, Eric Blake wrote:
> On 5/27/20 7:45 AM, Roman Kagan wrote:
> > Several BlockConf properties represent respective sizes in bytes so it
> > makes sense to accept size suffixes for them.
> > 
> > Turn them all into uint32_t and use size-suffix-capable setters/getters
> > on them; introduce DEFINE_PROP_SIZE32 and adjust DEFINE_PROP_BLOCKSIZE
> > for that. (Making them uint64_t and reusing DEFINE_PROP_SIZE isn't
> > justified because guests expect at most 32bit values.)
> > 
> > Also, since min_io_size is exposed to the guest by scsi and virtio-blk
> > devices as an uint16_t in units of logical blocks, introduce an
> > additional check in blkconf_blocksizes to prevent its silent truncation.
> > 
> > Signed-off-by: Roman Kagan 
> > ---
> > v5 -> v6:
> > - add prop_size32 instead of going with 64bit
> 
> Would it be worth adding prop_size32 as its own patch, before using it here?

I've no strong opinion on this.  Should I better split it out when
respinning?

> But I'll review this as-is.
> 
> > +++ b/hw/block/block.c
> > @@ -96,6 +96,17 @@ bool blkconf_blocksizes(BlockConf *conf, Error **errp)
> >   return false;
> >   }
> > +/*
> > + * all devices which support min_io_size (scsi and virtio-blk) expose 
> > it to
> > + * the guest as a uint16_t in units of logical blocks
> > + */
> > +if (conf->min_io_size > conf->logical_block_size * UINT16_MAX) {
> 
> This risks overflow.  Better would be:
> 
> if (conf->min_io_size / conf->logical_block-size > UINT16_MAX)

D'oh.  I kept it in mind and did it right initially in v4, but then
mixed it up.  Thanks for catching!

> 
> > +error_setg(errp,
> > +   "min_io_size must not exceed " stringify(UINT16_MAX)
> > +   " logical blocks");
> > +return false;
> > +}
> > +
> >   if (!QEMU_IS_ALIGNED(conf->opt_io_size, conf->logical_block_size)) {
> >   error_setg(errp,
> >  "opt_io_size must be a multiple of 
> > logical_block_size");
> 
> > +++ b/tests/qemu-iotests/172.out
> > @@ -24,11 +24,11 @@ Testing:
> > dev: floppy, id ""
> >   unit = 0 (0x0)
> >   drive = "floppy0"
> > -logical_block_size = 512 (0x200)
> > -physical_block_size = 512 (0x200)
> > -min_io_size = 0 (0x0)
> > -opt_io_size = 0 (0x0)
> > -discard_granularity = 4294967295 (0xffffffff)
> > +logical_block_size = 512 (512 B)
> > +physical_block_size = 512 (512 B)
> > +min_io_size = 0 (0 B)
> > +opt_io_size = 0 (0 B)
> > +discard_granularity = 4294967295 (4 GiB)
> 
> Although 4 GiB is not quite the same as 4294967295, the exact byte value
> next to the approximate size is not too bad.  The mechanical fallout from
> the change from int to size is fine to me.

Thanks,
Roman.



[PATCH v5 7/7] block/io: refactor save/load vmstate

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
Like for read/write in a previous commit, drop the extra indirection layer
and generate bdrv_readv_vmstate() and bdrv_writev_vmstate() directly.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h| 10 +++
 include/block/block.h |  6 ++--
 block/io.c| 67 ++-
 3 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index 56c0be6f8f..7e14fdeccf 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -58,11 +58,9 @@ bdrv_common_block_status_above(BlockDriverState *bs,
int64_t *map,
BlockDriverState **file);
 
-int coroutine_fn
-bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-   bool is_read);
-int generated_co_wrapper
-bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-bool is_read);
+int coroutine_fn bdrv_co_readv_vmstate(BlockDriverState *bs,
+   QEMUIOVector *qiov, int64_t pos);
+int coroutine_fn bdrv_co_writev_vmstate(BlockDriverState *bs,
+QEMUIOVector *qiov, int64_t pos);
 
 #endif /* BLOCK_COROUTINES_INT_H */
diff --git a/include/block/block.h b/include/block/block.h
index dadb6e796f..cc2edbeb65 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -568,8 +568,10 @@ int path_has_protocol(const char *path);
 int path_is_absolute(const char *path);
 char *path_combine(const char *base_path, const char *filename);
 
-int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
-int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+int generated_co_wrapper
+bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
+int generated_co_wrapper
+bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
 int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
   int64_t pos, int size);
 
diff --git a/block/io.c b/block/io.c
index 305ee7219a..8c1da3c335 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2491,66 +2491,67 @@ int bdrv_is_allocated_above(BlockDriverState *top,
 }
 
 int coroutine_fn
-bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
-   bool is_read)
+bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 {
 BlockDriver *drv = bs->drv;
 int ret = -ENOTSUP;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 bdrv_inc_in_flight(bs);
 
-if (!drv) {
-ret = -ENOMEDIUM;
-} else if (drv->bdrv_load_vmstate) {
-if (is_read) {
-ret = drv->bdrv_load_vmstate(bs, qiov, pos);
-} else {
-ret = drv->bdrv_save_vmstate(bs, qiov, pos);
-}
+if (drv->bdrv_load_vmstate) {
+ret = drv->bdrv_load_vmstate(bs, qiov, pos);
 } else if (bs->file) {
-ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
+ret = bdrv_co_readv_vmstate(bs->file->bs, qiov, pos);
 }
 
 bdrv_dec_in_flight(bs);
+
 return ret;
 }
 
-int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
-  int64_t pos, int size)
+int coroutine_fn
+bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
 {
-QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
-int ret;
+BlockDriver *drv = bs->drv;
+int ret = -ENOTSUP;
 
-ret = bdrv_writev_vmstate(bs, &qiov, pos);
-if (ret < 0) {
-return ret;
+if (!drv) {
+return -ENOMEDIUM;
 }
 
-return size;
-}
+bdrv_inc_in_flight(bs);
 
-int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
-{
-return bdrv_rw_vmstate(bs, qiov, pos, false);
+if (drv->bdrv_load_vmstate) {
+ret = drv->bdrv_save_vmstate(bs, qiov, pos);
+} else if (bs->file) {
+ret = bdrv_co_writev_vmstate(bs->file->bs, qiov, pos);
+}
+
+bdrv_dec_in_flight(bs);
+
+return ret;
 }
 
-int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
+int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
   int64_t pos, int size)
 {
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
-int ret;
-
-ret = bdrv_readv_vmstate(bs, &qiov, pos);
-if (ret < 0) {
-return ret;
-}
+int ret = bdrv_writev_vmstate(bs, &qiov, pos);
 
-return size;
+return ret < 0 ? ret : size;
 }
 
-int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
+int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
+  int64_t pos, int size)
 {
-return bdrv_rw_vmstate(bs, qiov, pos, true);
+QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, size);
+int ret = bdrv_readv_vmstate(bs, &qiov, pos);
+
+return ret < 0 ? ret : size;
 }
 
 /

[PATCH v5 3/7] block: declare some coroutine functions in block/coroutines.h

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
We are going to keep the coroutine-wrapper code (structure-packing of
parameters, BDRV_POLL wrapper functions) in separate auto-generated
files. So, we'll need a header with declarations of the original _co_
functions, for those which are currently static, as well as
declarations for the wrapper functions. Add these declarations now, as
a preparation step.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h | 67 ++
 block.c|  8 +++---
 block/io.c | 34 +++
 3 files changed, 88 insertions(+), 21 deletions(-)
 create mode 100644 block/coroutines.h

diff --git a/block/coroutines.h b/block/coroutines.h
new file mode 100644
index 00..9ce1730a09
--- /dev/null
+++ b/block/coroutines.h
@@ -0,0 +1,67 @@
+/*
+ * Block layer I/O functions
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_COROUTINES_INT_H
+#define BLOCK_COROUTINES_INT_H
+
+#include "block/block_int.h"
+
+int coroutine_fn bdrv_co_check(BlockDriverState *bs,
+   BdrvCheckResult *res, BdrvCheckMode fix);
+int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
+
+int coroutine_fn
+bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
+ bool is_write, BdrvRequestFlags flags);
+int
+bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
+  bool is_write, BdrvRequestFlags flags);
+
+int coroutine_fn
+bdrv_co_common_block_status_above(BlockDriverState *bs,
+  BlockDriverState *base,
+  bool want_zero,
+  int64_t offset,
+  int64_t bytes,
+  int64_t *pnum,
+  int64_t *map,
+  BlockDriverState **file);
+int
+bdrv_common_block_status_above(BlockDriverState *bs,
+   BlockDriverState *base,
+   bool want_zero,
+   int64_t offset,
+   int64_t bytes,
+   int64_t *pnum,
+   int64_t *map,
+   BlockDriverState **file);
+
+int coroutine_fn
+bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
+   bool is_read);
+int
+bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
+bool is_read);
+
+#endif /* BLOCK_COROUTINES_INT_H */
diff --git a/block.c b/block.c
index b01551f21c..2ca9267729 100644
--- a/block.c
+++ b/block.c
@@ -48,6 +48,7 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
 #include 
@@ -4625,8 +4626,8 @@ static void bdrv_delete(BlockDriverState *bs)
  * free of errors) or -errno when an internal error occurred. The results of 
the
  * check are stored in res.
  */
-static int coroutine_fn bdrv_co_check(BlockDriverState *bs,
-  BdrvCheckResult *res, BdrvCheckMode fix)
+int coroutine_fn bdrv_co_check(BlockDriverState *bs,
+   BdrvCheckResult *res, BdrvCheckMode fix)
 {
 if (bs->drv == NULL) {
 return -ENOMEDIUM;
@@ -5643,8 +5644,7 @@ void bdrv_init_with_whitelist(void)
 bdrv_init();
 }
 
-static int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
- Error **errp)
+int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
 {
 BdrvChild *child, *parent;
 uint64_t perm, shared_perm;
diff --git a/block/io.c b/block/io.c
index bd00a70b47..f5b6ce3bf6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -29,6 +29,7 @@
 #include "block/blockjob.h"
 #include "b

[PATCH v5 6/7] block: drop bdrv_prwv

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
Now that we are not maintaining boilerplate code for coroutine
wrappers, there is no longer any sense in keeping the extra indirection
layer of bdrv_prwv().  Let's drop it and instead generate pure
bdrv_preadv() and bdrv_pwritev().

Currently, bdrv_pwritev() and bdrv_preadv() return the number of bytes
on success; the auto-generated functions will instead return zero, like
their _co_ counterparts. Still, it's simple to make the conversion
safe: the only external user of bdrv_pwritev() is test-bdrv-drain, and
it is comfortable enough with bdrv_co_pwritev() instead. So the
prototypes are moved to the local block/coroutines.h. Next, the only
internal users are bdrv_pread() and bdrv_pwrite(), which are modified
to return bytes on success.

Of course, it would be great to convert bdrv_pread() and bdrv_pwrite()
to return 0 on success. But this requires an audit (and probably
conversion) of all their users, so let's leave it for a later
refactoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/coroutines.h  | 10 -
 include/block/block.h   |  2 --
 block/io.c  | 49 -
 tests/test-bdrv-drain.c |  2 +-
 4 files changed, 15 insertions(+), 48 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index 145a2d2645..56c0be6f8f 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -32,12 +32,12 @@ int coroutine_fn bdrv_co_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix);
 int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp);
 
-int coroutine_fn
-bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
- bool is_write, BdrvRequestFlags flags);
 int generated_co_wrapper
-bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
-  bool is_write, BdrvRequestFlags flags);
+bdrv_preadv(BdrvChild *child, int64_t offset, unsigned int bytes,
+QEMUIOVector *qiov, BdrvRequestFlags flags);
+int generated_co_wrapper
+bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags);
 
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index 46c8b7816e..dadb6e796f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -379,9 +379,7 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int bytes, BdrvRequestFlags flags);
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
 int bdrv_pread(BdrvChild *child, int64_t offset, void *buf, int bytes);
-int bdrv_preadv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
 int bdrv_pwrite(BdrvChild *child, int64_t offset, const void *buf, int bytes);
-int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
 int bdrv_pwrite_sync(BdrvChild *child, int64_t offset,
  const void *buf, int count);
 /*
diff --git a/block/io.c b/block/io.c
index f9700cc897..305ee7219a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -892,23 +892,11 @@ static int bdrv_check_byte_request(BlockDriverState *bs, 
int64_t offset,
 return 0;
 }
 
-int coroutine_fn bdrv_co_prwv(BdrvChild *child, int64_t offset,
-  QEMUIOVector *qiov, bool is_write,
-  BdrvRequestFlags flags)
-{
-if (is_write) {
-return bdrv_co_pwritev(child, offset, qiov->size, qiov, flags);
-} else {
-return bdrv_co_preadv(child, offset, qiov->size, qiov, flags);
-}
-}
-
 int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int bytes, BdrvRequestFlags flags)
 {
-QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, bytes);
-
-return bdrv_prwv(child, offset, &qiov, true, BDRV_REQ_ZERO_WRITE | flags);
+return bdrv_pwritev(child, offset, bytes, NULL,
+BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /*
@@ -952,41 +940,19 @@ int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags 
flags)
 }
 }
 
-/* return < 0 if error. See bdrv_pwrite() for the return codes */
-int bdrv_preadv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
-{
-int ret;
-
-ret = bdrv_prwv(child, offset, qiov, false, 0);
-if (ret < 0) {
-return ret;
-}
-
-return qiov->size;
-}
-
 /* See bdrv_pwrite() for the return codes */
 int bdrv_pread(BdrvChild *child, int64_t offset, void *buf, int bytes)
 {
+int ret;
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, buf, bytes);
 
 if (bytes < 0) {
 return -EINVAL;
 }
 
-return bdrv_preadv(child, offset, &qiov);
-}
-
-int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov)
-{
-int ret;
+ret = bdrv_preadv(child, offset, bytes, &qiov,  0);
 
-ret = bdrv_prwv(child, offset, qiov, true, 0);
-if (ret < 0) {
-return ret;
-}
-
-return qiov->size;
+return ret < 0 ? ret : bytes;
 }
 
 /* Return

[PATCH v5 2/7] block/io: refactor coroutine wrappers

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
Most of our coroutine wrappers already follow this convention:

We have 'coroutine_fn bdrv_co_NAME()' as
the core function, and a wrapper 'bdrv_NAME()' which does a polling loop.

The only outsiders are the bdrv_prwv_co and
bdrv_common_block_status_above wrappers. Let's refactor them to behave
like the others; this simplifies further conversion of the coroutine
wrappers.

This patch adds an indirection layer, but that will be compensated for
by a further commit, which will drop bdrv_co_prwv together with the
is_write logic, keeping the read and write paths separate.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/io.c | 61 +-
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/block/io.c b/block/io.c
index 121ce17a49..bd00a70b47 100644
--- a/block/io.c
+++ b/block/io.c
@@ -900,28 +900,32 @@ typedef struct RwCo {
 BdrvRequestFlags flags;
 } RwCo;
 
+static int coroutine_fn bdrv_co_prwv(BdrvChild *child, int64_t offset,
+ QEMUIOVector *qiov, bool is_write,
+ BdrvRequestFlags flags)
+{
+if (is_write) {
+return bdrv_co_pwritev(child, offset, qiov->size, qiov, flags);
+} else {
+return bdrv_co_preadv(child, offset, qiov->size, qiov, flags);
+}
+}
+
 static void coroutine_fn bdrv_rw_co_entry(void *opaque)
 {
 RwCo *rwco = opaque;
 
-if (!rwco->is_write) {
-rwco->ret = bdrv_co_preadv(rwco->child, rwco->offset,
-   rwco->qiov->size, rwco->qiov,
-   rwco->flags);
-} else {
-rwco->ret = bdrv_co_pwritev(rwco->child, rwco->offset,
-rwco->qiov->size, rwco->qiov,
-rwco->flags);
-}
+rwco->ret = bdrv_co_prwv(rwco->child, rwco->offset, rwco->qiov,
+ rwco->is_write, rwco->flags);
 aio_wait_kick();
 }
 
 /*
  * Process a vectored synchronous request using coroutines
  */
-static int bdrv_prwv_co(BdrvChild *child, int64_t offset,
-QEMUIOVector *qiov, bool is_write,
-BdrvRequestFlags flags)
+static int bdrv_prwv(BdrvChild *child, int64_t offset,
+ QEMUIOVector *qiov, bool is_write,
+ BdrvRequestFlags flags)
 {
 Coroutine *co;
 RwCo rwco = {
@@ -949,8 +953,7 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
 {
 QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, bytes);
 
-return bdrv_prwv_co(child, offset, &qiov, true,
-BDRV_REQ_ZERO_WRITE | flags);
+return bdrv_prwv(child, offset, &qiov, true, BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /*
@@ -999,7 +1002,7 @@ int bdrv_preadv(BdrvChild *child, int64_t offset, 
QEMUIOVector *qiov)
 {
 int ret;
 
-ret = bdrv_prwv_co(child, offset, qiov, false, 0);
+ret = bdrv_prwv(child, offset, qiov, false, 0);
 if (ret < 0) {
 return ret;
 }
@@ -1023,7 +1026,7 @@ int bdrv_pwritev(BdrvChild *child, int64_t offset, 
QEMUIOVector *qiov)
 {
 int ret;
 
-ret = bdrv_prwv_co(child, offset, qiov, true, 0);
+ret = bdrv_prwv(child, offset, qiov, true, 0);
 if (ret < 0) {
 return ret;
 }
@@ -2443,14 +2446,15 @@ early_out:
 return ret;
 }
 
-static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
-   BlockDriverState *base,
-   bool want_zero,
-   int64_t offset,
-   int64_t bytes,
-   int64_t *pnum,
-   int64_t *map,
-   BlockDriverState **file)
+static int coroutine_fn
+bdrv_co_common_block_status_above(BlockDriverState *bs,
+  BlockDriverState *base,
+  bool want_zero,
+  int64_t offset,
+  int64_t bytes,
+  int64_t *pnum,
+  int64_t *map,
+  BlockDriverState **file)
 {
 BlockDriverState *p;
 int ret = 0;
@@ -2488,10 +2492,11 @@ static void coroutine_fn 
bdrv_block_status_above_co_entry(void *opaque)
 {
 BdrvCoBlockStatusData *data = opaque;
 
-data->ret = bdrv_co_block_status_above(data->bs, data->base,
-   data->want_zero,
-   data->offset, data->bytes,
-   data->pnum, data->map, data->file);
+data->ret = bdrv_co_common_block_status_above(data->bs, data->base,
+  data->want_zero,
+ 

[PATCH v5 4/7] scripts: add coroutine-wrapper.py

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
We have a very frequent pattern of creating a coroutine from a function
with several arguments:

  - create a structure to pack the parameters
  - create an _entry function to call the original function, taking the
    parameters from the struct
  - do different magic to handle completion: set ret to NOT_DONE or
    EINPROGRESS, or use a separate bool for void functions
  - fill the struct and create a coroutine from the _entry function with
    this struct as a parameter
  - do coroutine enter and a BDRV_POLL_WHILE loop

Let's reduce code duplication by generating coroutine wrappers.

This patch adds scripts/coroutine-wrapper.py together with some
friends, which will generate the functions whose declared prototypes
are marked with the 'generated_co_wrapper' specifier.
The usage of new code generation is as follows:

1. define somewhere

int coroutine_fn bdrv_co_NAME(...) {...}

   function

2. declare in some header file

int generated_co_wrapper bdrv_NAME(...);

   function with same list of parameters. (you'll need to include
   "block/generated-co-wrapper.h" to get the specifier)

3. both declarations should be available through block/coroutines.h
   header.

4. add header with generated_co_wrapper declaration into
   COROUTINE_HEADERS list in Makefile

No function is marked yet; that work is done in the following
commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 Makefile |   8 ++
 block/block-gen.h|  55 +
 include/block/generated-co-wrapper.h |  35 ++
 block/Makefile.objs  |   1 +
 scripts/coroutine-wrapper.py | 173 +++
 5 files changed, 272 insertions(+)
 create mode 100644 block/block-gen.h
 create mode 100644 include/block/generated-co-wrapper.h
 create mode 100755 scripts/coroutine-wrapper.py

diff --git a/Makefile b/Makefile
index 40e4f7677b..09e9b1875c 100644
--- a/Makefile
+++ b/Makefile
@@ -175,6 +175,14 @@ generated-files-y += $(TRACE_SOURCES)
 generated-files-y += $(BUILD_DIR)/trace-events-all
 generated-files-y += .git-submodule-status
 
+generated-files-y += block/block-gen.c
+
+COROUTINE_HEADERS = include/block/block.h block/coroutines.h
+block/block-gen.c: $(COROUTINE_HEADERS) scripts/coroutine-wrapper.py
+   $(call quiet-command, \
+   cat $(addprefix $(SRC_PATH)/,$(COROUTINE_HEADERS)) | \
+   $(SRC_PATH)/scripts/coroutine-wrapper.py > 
$@,"GEN","$(TARGET_DIR)$@")
+
 trace-group-name = $(shell dirname $1 | sed -e 's/[^a-zA-Z0-9]/_/g')
 
 tracetool-y = $(SRC_PATH)/scripts/tracetool.py
diff --git a/block/block-gen.h b/block/block-gen.h
new file mode 100644
index 00..482d06737d
--- /dev/null
+++ b/block/block-gen.h
@@ -0,0 +1,55 @@
+/*
+ * Block coroutine wrapping core, used by auto-generated block/block-gen.c
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2020 Virtuozzo International GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_BLOCK_GEN_H
+#define BLOCK_BLOCK_GEN_H
+
+#include "block/block_int.h"
+
+/* This function is called at the end of generated coroutine entries. */
+static inline void bdrv_poll_co__on_exit(void)
+{
+aio_wait_kick();
+}
+
+/* Base structure for argument packing structures */
+typedef struct BdrvPollCo {
+BlockDriverState *bs;
+bool in_progress;
+int ret;
+Coroutine *co; /* Keep pointer here for debugging */
+} BdrvPollCo;
+
+static inline int bdrv_poll_co(BdrvPollCo *s)
+{
+assert(!qemu_in_coroutine());
+
+bdrv_coroutine_enter(s->bs, s->co);
+BDRV_POLL_WHILE(s->bs, s->in_progress);
+
+return s->ret;
+}
+
+#endif /* BLOCK_BLOCK_GEN_H */
diff --git a/include/block/generated-co-wrapper.h 
b/include/block/generated-co-wrapper.h
new file mode 100644
index 00..62c6e053ba
--- /dev/null
+++ b/include/block/generated-co-wrapper.h
@@ -0,0 +1,35 @@
+/*
+ * Block layer I/O functions
+ *
+ * Copyright (c) 2020 Virt

[PATCH v5 0/7] coroutines: generate wrapper code

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
Hi all!

The aim of the series is to reduce code duplication and avoid writing
parameter structure-packing by hand around coroutine function wrappers.

It's an alternative to "[PATCH v3] block: Factor out bdrv_run_co()"
patch.

Benefits:
 - no code duplication
 - less indirection

v5: mostly by Eric's suggestions:
01: new
02: tweak commit message
03: - fix typo in commit message
- rebase on 01
- keep Eric's r-b
04: - conversion split out into 05
- script mostly rewritten
  - use f-strings for templating
  - add copyright to generated file header
  - wrap long lines if clang-format available
- fix makefiles
05: split out from 04: the mechanical conversion
06: tweak commit message, add Eric's r-b
07: add Eric's r-b

For convenience I attach generated block/block-gen.c file below.

Vladimir Sementsov-Ogievskiy (7):
  block: return error-code from bdrv_invalidate_cache
  block/io: refactor coroutine wrappers
  block: declare some coroutine functions in block/coroutines.h
  scripts: add coroutine-wrapper.py
  block: generate coroutine-wrapper code
  block: drop bdrv_prwv
  block/io: refactor save/load vmstate

 Makefile |   8 +
 block/block-gen.h|  55 
 block/coroutines.h   |  66 +
 include/block/block.h|  25 +-
 include/block/generated-co-wrapper.h |  35 +++
 block.c  |  97 +--
 block/io.c   | 383 ---
 tests/test-bdrv-drain.c  |   2 +-
 block/Makefile.objs  |   1 +
 scripts/coroutine-wrapper.py | 173 
 10 files changed, 417 insertions(+), 428 deletions(-)
 create mode 100644 block/block-gen.h
 create mode 100644 block/coroutines.h
 create mode 100644 include/block/generated-co-wrapper.h
 create mode 100755 scripts/coroutine-wrapper.py


====== Generated block/block-gen.c ======
/*
 * File is generated by scripts/coroutine-wrapper.py
 *
 * Copyright (c) 2020 Virtuozzo International GmbH.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see .
 */

#include "qemu/osdep.h"
#include "block/coroutines.h"
#include "block/block-gen.h"


/*
 * Wrappers for bdrv_co_truncate
 */

typedef struct BdrvCoTruncate {
BdrvPollCo poll_state;
BdrvChild *child;
int64_t offset;
bool exact;
PreallocMode prealloc;
BdrvRequestFlags flags;
Error **errp;
} BdrvCoTruncate;

static void coroutine_fn bdrv_co_truncate_entry(void *opaque)
{
BdrvCoTruncate *s = opaque;

s->poll_state.ret = bdrv_co_truncate(s->child, s->offset, s->exact,
 s->prealloc, s->flags, s->errp);
s->poll_state.in_progress = false;

bdrv_poll_co__on_exit();
}

int bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
  PreallocMode prealloc, BdrvRequestFlags flags, Error **errp)
{
if (qemu_in_coroutine()) {
return bdrv_co_truncate(child, offset, exact, prealloc, flags, errp);
} else {
BdrvCoTruncate s = {
.poll_state.bs = child->bs,
.poll_state.in_progress = true,

.child = child,
.offset = offset,
.exact = exact,
.prealloc = prealloc,
.flags = flags,
.errp = errp,
};

s.poll_state.co = qemu_coroutine_create(bdrv_co_truncate_entry, &s);

return bdrv_poll_co(&s.poll_state);
}
}


/*
 * Wrappers for bdrv_co_check
 */

typedef struct BdrvCoCheck {
BdrvPollCo poll_state;
BlockDriverState *bs;
BdrvCheckResult *res;
BdrvCheckMode fix;
} BdrvCoCheck;

static void coroutine_fn bdrv_co_check_entry(void *opaque)
{
BdrvCoCheck *s = opaque;

s->poll_state.ret = bdrv_co_check(s->bs, s->res, s->fix);
s->poll_state.in_progress = false;

bdrv_poll_co__on_exit();
}

int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix)
{
if (qemu_in_coroutine()) {
return bdrv_co_check(bs, res, fix);
} else {
BdrvCoCheck s = {
.poll_state.bs = bs,
.poll_state.in_progress = true,

.bs = bs,
.res = res,
.fix = fix,
};

s.poll_state.co = qemu_coroutine_create(bdrv_co_check_entry, &s);

return bdrv_poll_co(&s.poll_state);
}
}


/*
 * Wrappers for bdrv_co_invalidate_cache
 */

typedef str

[PATCH v5 5/7] block: generate coroutine-wrapper code

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
Use the code generation implemented in the previous commit to generate
coroutine wrappers in block.c and block/io.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/coroutines.h|   7 +-
 include/block/block.h |  17 ++-
 block.c   |  73 
 block/io.c| 260 --
 4 files changed, 15 insertions(+), 342 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index 9ce1730a09..145a2d2645 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -26,6 +26,7 @@
 #define BLOCK_COROUTINES_INT_H
 
 #include "block/block_int.h"
+#include "block/generated-co-wrapper.h"
 
 int coroutine_fn bdrv_co_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix);
@@ -34,7 +35,7 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState 
*bs, Error **errp);
 int coroutine_fn
 bdrv_co_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
  bool is_write, BdrvRequestFlags flags);
-int
+int generated_co_wrapper
 bdrv_prwv(BdrvChild *child, int64_t offset, QEMUIOVector *qiov,
   bool is_write, BdrvRequestFlags flags);
 
@@ -47,7 +48,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
   int64_t *pnum,
   int64_t *map,
   BlockDriverState **file);
-int
+int generated_co_wrapper
 bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool want_zero,
@@ -60,7 +61,7 @@ bdrv_common_block_status_above(BlockDriverState *bs,
 int coroutine_fn
 bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
bool is_read);
-int
+int generated_co_wrapper
 bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
 bool is_read);
 
diff --git a/include/block/block.h b/include/block/block.h
index 46965a7780..46c8b7816e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -9,6 +9,7 @@
 #include "block/dirty-bitmap.h"
 #include "block/blockjob.h"
 #include "qemu/hbitmap.h"
+#include "block/generated-co-wrapper.h"
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
@@ -398,8 +399,9 @@ void bdrv_refresh_filename(BlockDriverState *bs);
 int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
   PreallocMode prealloc, BdrvRequestFlags 
flags,
   Error **errp);
-int bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
-  PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
+int generated_co_wrapper
+bdrv_truncate(BdrvChild *child, int64_t offset, bool exact,
+  PreallocMode prealloc, BdrvRequestFlags flags, Error **errp);
 
 int64_t bdrv_nb_sectors(BlockDriverState *bs);
 int64_t bdrv_getlength(BlockDriverState *bs);
@@ -441,7 +443,8 @@ typedef enum {
 BDRV_FIX_ERRORS   = 2,
 } BdrvCheckMode;
 
-int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
+int generated_co_wrapper bdrv_check(BlockDriverState *bs, BdrvCheckResult *res,
+BdrvCheckMode fix);
 
 /* The units of offset and total_work_size may be chosen arbitrarily by the
  * block driver; total_work_size may change during the course of the amendment
@@ -464,12 +467,13 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
 
 /* Invalidate any cached metadata used by image formats */
-int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
+int generated_co_wrapper bdrv_invalidate_cache(BlockDriverState *bs,
+   Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
 /* Ensure contents are flushed to disk.  */
-int bdrv_flush(BlockDriverState *bs);
+int generated_co_wrapper bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 int bdrv_flush_all(void);
 void bdrv_close_all(void);
@@ -484,7 +488,8 @@ void bdrv_drain_all(void);
 AIO_WAIT_WHILE(bdrv_get_aio_context(bs_),  \
cond); })
 
-int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
+int generated_co_wrapper bdrv_pdiscard(BdrvChild *child, int64_t offset,
+   int64_t bytes);
 int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
diff --git a/block.c b/block.c
index 2ca9267729..3046696f30 100644
--- a/block.c
+++ b/block.c
@@ -4640,43 +4640,6 @@ int coroutine_fn bdrv_co_check(BlockDriverState *bs,
 return bs->drv->bdrv_co_check(bs, res, fix);
 }
 
-typedef struct CheckCo {
-BlockDriverState *bs;
-BdrvCheckResult *res;
-BdrvCheckMode fix;
- 

[PATCH v5 1/7] block: return error-code from bdrv_invalidate_cache

2020-05-27 Thread Vladimir Sementsov-Ogievskiy
This is the only coroutine wrapper from block.c and block/io.c which
doesn't return a value, so let's convert it to the common behavior, to
simplify moving to generated coroutine wrappers in a further commit.

Also, bdrv_invalidate_cache is a void function, returning errors only
through the **errp parameter, which is considered bad practice, as it
forces callers to define and propagate a local_err variable, so the
conversion is good anyway.

This patch leaves conversion of the .bdrv_co_invalidate_cache() driver
callbacks and bdrv_invalidate_cache_all() for another-day refactoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block.h |  2 +-
 block.c   | 32 ++--
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 25e299605e..46965a7780 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -464,7 +464,7 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
 
 /* Invalidate any cached metadata used by image formats */
-void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
+int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
diff --git a/block.c b/block.c
index 8416376c9b..b01551f21c 100644
--- a/block.c
+++ b/block.c
@@ -5643,8 +5643,8 @@ void bdrv_init_with_whitelist(void)
 bdrv_init();
 }
 
-static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
-  Error **errp)
+static int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
+ Error **errp)
 {
 BdrvChild *child, *parent;
 uint64_t perm, shared_perm;
@@ -5653,14 +5653,14 @@ static void coroutine_fn 
bdrv_co_invalidate_cache(BlockDriverState *bs,
 BdrvDirtyBitmap *bm;
 
 if (!bs->drv)  {
-return;
+return -ENOMEDIUM;
 }
 
 QLIST_FOREACH(child, &bs->children, next) {
 bdrv_co_invalidate_cache(child->bs, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 
@@ -5684,7 +5684,7 @@ static void coroutine_fn 
bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (ret < 0) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return ret;
 }
 bdrv_set_perm(bs, perm, shared_perm);
 
@@ -5693,7 +5693,7 @@ static void coroutine_fn 
bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (local_err) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 
@@ -5705,7 +5705,7 @@ static void coroutine_fn 
bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (ret < 0) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_setg_errno(errp, -ret, "Could not refresh total sector 
count");
-return;
+return ret;
 }
 }
 
@@ -5715,27 +5715,30 @@ static void coroutine_fn 
bdrv_co_invalidate_cache(BlockDriverState *bs,
 if (local_err) {
 bs->open_flags |= BDRV_O_INACTIVE;
 error_propagate(errp, local_err);
-return;
+return -EINVAL;
 }
 }
 }
+
+return 0;
 }
 
 typedef struct InvalidateCacheCo {
 BlockDriverState *bs;
 Error **errp;
 bool done;
+int ret;
 } InvalidateCacheCo;
 
 static void coroutine_fn bdrv_invalidate_cache_co_entry(void *opaque)
 {
 InvalidateCacheCo *ico = opaque;
-bdrv_co_invalidate_cache(ico->bs, ico->errp);
+ico->ret = bdrv_co_invalidate_cache(ico->bs, ico->errp);
 ico->done = true;
 aio_wait_kick();
 }
 
-void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
+int bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
 {
 Coroutine *co;
 InvalidateCacheCo ico = {
@@ -5752,22 +5755,23 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 bdrv_coroutine_enter(bs, co);
 BDRV_POLL_WHILE(bs, !ico.done);
 }
+
+return ico.ret;
 }
 
 void bdrv_invalidate_cache_all(Error **errp)
 {
 BlockDriverState *bs;
-Error *local_err = NULL;
 BdrvNextIterator it;
 
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *aio_context = bdrv_get_aio_context(bs);
+int ret;
 
 aio_context_acquire(aio_context);
-bdrv_invalidate_cache(bs, &local_err);
+ret = bdrv_invalidate_cache(bs, errp);
 aio_context_release(aio_context);
-if (local_err) {
-error_propagate(errp, local_err);
+if (ret < 0) {
 bdrv_next_cleanup(&it);
 return;
 }
--

Re: [PULL 00/11] bitmaps patches for 2020-05-26

2020-05-27 Thread Eric Blake

On 5/27/20 3:07 PM, Peter Maydell wrote:



iotest 190 failed on freebsd:




+++ /home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/190.out.bad
  2020-05-27 15:30:50.377759533 +
@@ -17,7 +17,7 @@
  fully allocated size: 10813440
  required size: 219902322
  fully allocated size: 219902322
-required size: 7012352
+required size: 17170432
  fully allocated size: 17170432


Thanks for the heads up.  That was on:

# No bitmap output, since no bitmaps on raw source
$QEMU_IMG measure -O qcow2 -f raw "$TEST_IMG"

and indicates that on FreeBSD, the qcow2 image is not as sparse as it is 
on other platforms.  Where Linux was able to punch holes in the 
underlying filesystem, FreeBSD did not.  But even though I'm not sure if 
that is due to file system hole granularity, choice of APIs used to 
write all-zero bitmaps, or something else, I am certain that it is less 
important (the qcow2 file is still quite sparse in comparison to the 2T 
guest-visible data it is representing, even if it differs in sparseness 
between the systems).


I'll post a v2 that filters out the required size for just that command, 
as viewing a qcow2 file as if raw is unusual.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




RE: linux-user - time64 question

2020-05-27 Thread Sid Manning
> -Original Message-
> From: Laurent Vivier 
> Sent: Wednesday, May 27, 2020 11:23 AM
> To: Sid Manning ; qemu-devel@nongnu.org
> Subject: [EXT] Re: linux-user - time64 question
>
> Le 05/05/2020 à 23:38, Sid Manning a écrit :
> > I’m looking at a testcase failure when my target uses 64bit time in
> > msg.h (struct msqid_ds).  I’ve been able to get around this but
> > changing target_msqid_ds like so:
> >
> >
> > @@ -3900,18 +3901,9 @@ static inline abi_long do_semop(int semid,
> > abi_long ptr, unsigned nsops)
> >  struct target_msqid_ds
> >  {
> >  struct target_ipc_perm msg_perm;
> > -abi_ulong msg_stime;
> > -#if TARGET_ABI_BITS == 32
> > -abi_ulong __unused1;
> > -#endif
> > -abi_ulong msg_rtime;
> > -#if TARGET_ABI_BITS == 32
> > -abi_ulong __unused2;
> > -#endif
> > -abi_ulong msg_ctime;
> > -#if TARGET_ABI_BITS == 32
> > -abi_ulong __unused3;
> > -#endif
> > +abi_ullong msg_stime;
> > +abi_ullong msg_rtime;
> > +abi_ullong msg_ctime;
> >  abi_ulong __msg_cbytes;
> >  abi_ulong msg_qnum;
> >  abi_ulong msg_qbytes;
> >
> > It seems like either should have worked but I get garbage back in some
> > of the elements below msg_time fields without the change.
> >
> > If time_t is 64bits then it seems like stime/rtime/ctime should be
> > abi_ullong.
> >
> > My target is Hexagon and the TARGET_ABI_BITS is 32.
>
> The structure has been changed in the kernel for y2038 and the change
> has not been reflected in QEMU yet (and it needs to be).

Thanks for the info.  It was also pointed out to me that my C library was
missing a #define: #define IPC_STAT 0x102
This caused the library not to recognize 64-bit time in some instances.


>
> See kernel commit:
>
> c2ab975c30f0 ("y2038: ipc: Report long times to user space")
>
> Thanks,
> Laurent


Re: [PATCH v3 3/3] target/riscv: Drop support for ISA spec version 1.09.1

2020-05-27 Thread Alistair Francis
On Wed, May 27, 2020 at 2:41 AM Bin Meng  wrote:
>
> On Wed, May 27, 2020 at 6:55 AM Alistair Francis
>  wrote:
> >
> > The RISC-V ISA spec version 1.09.1 has been deprecated in QEMU since
> > 4.1. It's not commonly used so let's remove support for it.
> >
> > Signed-off-by: Alistair Francis 
> > ---
> >  target/riscv/cpu.h|   1 -
> >  target/riscv/cpu.c|   2 -
> >  target/riscv/cpu_helper.c |  82 +---
> >  target/riscv/csr.c| 118 +++---
> >  .../riscv/insn_trans/trans_privileged.inc.c   |  18 +--
> >  target/riscv/monitor.c|   5 -
> >  target/riscv/op_helper.c  |  17 +--
> >  7 files changed, 56 insertions(+), 187 deletions(-)
> >
>
> There are 3 more places in csr.c that need to be removed
>
> ./target/riscv/csr.c:651:if (env->priv_ver < PRIV_VERSION_1_10_0) {
> ./target/riscv/csr.c:660:if (env->priv_ver < PRIV_VERSION_1_10_0) {
> ./target/riscv/csr.c:741:} else if (env->priv_ver >= PRIV_VERSION_1_10_0) 
> {

Thanks, fixed.

Alistair

>
> Regards,
> Bin



Re: [PULL 00/11] bitmaps patches for 2020-05-26

2020-05-27 Thread Peter Maydell
On Tue, 26 May 2020 at 17:43, Eric Blake  wrote:
>
> The following changes since commit 8f72c75cfc9b3c84a9b5e7a58ee5e471cb2f19c8:
>
>   Merge remote-tracking branch 
> 'remotes/kraxel/tags/audio-20200526-pull-request' into staging (2020-05-26 
> 10:59:01 +0100)
>
> are available in the Git repository at:
>
>   https://repo.or.cz/qemu/ericb.git tags/pull-bitmaps-2020-05-26
>
> for you to fetch changes up to 9f2bcd08bce234f67239aaf6f4b881ccf0a76d55:
>
>   iotests: Add test 291 to for qemu-img bitmap coverage (2020-05-26 11:28:24 
> -0500)
>
> 
> bitmaps patches for 2020-05-26
>
> - fix non-blockdev migration of bitmaps when mirror job is in use
> - add bitmap sizing to 'qemu-img measure'
> - add 'qemu-img convert --bitmaps'
>
> 


iotest 190 failed on freebsd:

  TESTiotest-qcow2: 190 [fail]
QEMU  --
"/home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/../../aarch64-softmmu/qemu-system-aarch64"
-nodefaults -d
isplay none -machine virt -accel qtest
QEMU_IMG  --
"/home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/../../qemu-img"
QEMU_IO   --
"/home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/../../qemu-io"
--cache writeback --aio threads -f qcow2
QEMU_NBD  --
"/home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/../../qemu-nbd"
IMGFMT-- qcow2 (compat=1.1)
IMGPROTO  -- file
PLATFORM  -- NetBSD/amd64 localhost 9.0
TEST_DIR  -- /home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/scratch
SOCK_DIR  -- /tmp/mktemp.MJWbGXjT
SOCKET_SCM_HELPER --

--- /home/qemu/qemu-test.BE3Bvf/src/tests/qemu-iotests/190.out
2020-05-27 15:13:27.0 +
+++ /home/qemu/qemu-test.BE3Bvf/build/tests/qemu-iotests/190.out.bad
 2020-05-27 15:30:50.377759533 +
@@ -17,7 +17,7 @@
 fully allocated size: 10813440
 required size: 219902322
 fully allocated size: 219902322
-required size: 7012352
+required size: 17170432
 fully allocated size: 17170432
 required size: 335806464
 fully allocated size: 2199359062016

NetBSD failed too but I'm not sure if it was the same thing
or just that it's being very unreliable currently.

thanks
-- PMM



Re: [PATCH v1] pc: Support coldplugging of virtio-pmem-pci devices on all buses

2020-05-27 Thread Pankaj Gupta
Hi David, Vivek,
> >> Hi Vivek,
> >>
> >> you have to declare the maxMemory option. Memory devices like
> >> virtio-pmem-pci reside in RAM like a pc-dimm or a nvdimm. If your
> >> virtio-pmem device will be 4GB, you have to add that to maxMemory.
> >>
> >>   64
> >>   68
> >>   64
> >>
> >> (you might have to add "slots='0'" or "slots='1'" to maxMemory to make
> >> libvirt happy)
> >
> > Ok, tried that.
> >
> > 134217728
> >
> > And now it complains about.
> >
> > error: unsupported configuration: At least one numa node has to be 
> > configured when enabling memory hotplug
> >
> > So ultimately it seems to be wanting me to somehow enable memory hotplug
> > to be able to use virtio-pmem?
>
> That's a libvirt error message. Maybe I am confused how libvirt maps
> these parameters to QEMU ...
>
> NVDIMMs under libvirt seem to be easy:
>
> https://www.redhat.com/archives/libvir-list/2016-August/msg00055.html
>
> Maybe the issue is that virtio-pmem has not been properly integrated
> into libvirt yet:
>
> https://www.redhat.com/archives/libvir-list/2019-August/msg7.html
>
> And you attempts to force virtio-pmem in via qemu args does not work
> properly.
>
> Maybe maxMemory in libvirt does not directly map to the QEMU variant to
> define the maximum physical address space reserved also for any memory
> devices (DIMMs, NVDIMMs, virtio-pmem, ...). Any libvirt experts that can
> help?
>
> @Pankaj, did you ever get it to run with libvirt?

I did not run virtio-pmem with libvirt. That requires work on the libvirt side.
I created the document at [1] for running it from the QEMU command line.

[1] https://github.com/qemu/qemu/blob/master/docs/virtio-pmem.rst



USB pass-through problems

2020-05-27 Thread BALATON Zoltan

Hello,

I've seen a case when QEMU hangs with a passed-through USB device. This is 
with -device usb-ehci and pass-through with usb-host. This works until the 
attached USB device reboots (so likely it disconnects and reconnects), at 
which point QEMU hangs and needs to be SIGKILL-ed to end (that's a bit hard 
to do with mouse and keyboard grabbed). I've got this stack trace:


#0  0x7f23e7bd4949 in poll () at /lib64/libc.so.6
#1  0x7f23e8bfa9a5 in  () at /lib64/libusb-1.0.so.0
#2  0x7f23e8bfbb13 in libusb_handle_events_timeout_completed () at 
/lib64/libusb-1.0.so.0
#3  0x55e09854b7da in usb_host_abort_xfers (s=0x55e09b036dd0) at 
hw/usb/host-libusb.c:963
#4  0x55e09854b87a in usb_host_close (s=0x55e09b036dd0) at 
hw/usb/host-libusb.c:977
#5  0x55e09854b92e in usb_host_nodev_bh (opaque=0x55e09b036dd0) at 
hw/usb/host-libusb.c:998
#6  0x55e098757500 in aio_bh_call (bh=0x55e099ad9cc0) at util/async.c:136
#7  0x55e09875760a in aio_bh_poll (ctx=0x55e0996c2620) at util/async.c:164
#8  0x55e09875cb2a in aio_dispatch (ctx=0x55e0996c2620) at 
util/aio-posix.c:380
#9  0x55e098757a3d in aio_ctx_dispatch (source=0x55e0996c2620, 
callback=0x0, user_data=0x0) at util/async.c:306
#10 0x7f23e8c59665 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#11 0x55e09875b0a9 in glib_pollfds_poll () at util/main-loop.c:219
#12 0x55e09875b123 in os_host_main_loop_wait (timeout=0) at 
util/main-loop.c:242
#13 0x55e09875b228 in main_loop_wait (nonblocking=0) at util/main-loop.c:518
#14 0x55e0982c91f8 in qemu_main_loop () at softmmu/vl.c:1664
#15 0x55e098162e7e in main (argc=, argv=, 
envp=) at softmmu/main.c:49

so the problem may be in libusb but QEMU should not hang completely. The 
host is Linux with libusb 1.0.22.


Another problem I've seen with xhci is here: 
https://bugs.launchpad.net/qemu/+bug/181
but since the title does not clearly say it's a USB issue, maybe you've 
missed it.


Regards,
BALATON Zoltan



[Bug 1881004] [NEW] fpu/softfloat.c: error: bitwise negation of a boolean expression

2020-05-27 Thread Philippe Mathieu-Daudé
Public bug reported:

Last time I built QEMU was on commit d5c75ec500d96f1d93447f990cd5a4ef5ba27fae,
I just pulled to fea8f3ed739536fca027cf56af7f5576f37ef9cd and now get:
 
  CC  lm32-softmmu/fpu/softfloat.o
fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~
!
fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
 ^
 !
fpu/softfloat.c:3483:18: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~(((uint64_t)(absZ1<<1) == 0) & roundNearestEven);
 ^
 !
fpu/softfloat.c:3606:13: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
zSig &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~
!
fpu/softfloat.c:3760:13: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
zSig &= ~ ( ( ( roundBits ^ 0x200 ) == 0 ) & roundNearestEven );
^~~
!
fpu/softfloat.c:3987:21: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
^
!
fpu/softfloat.c:4003:22: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
zSig0 &= ~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
 ^
 !
fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did 
you mean logical negation? [-Werror,-Wbool-operation]
zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven );
 ^~~
 !
8 errors generated.

$ clang -v
clang version 10.0.0-4ubuntu1 
Target: aarch64-unknown-linux-gnu

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 20.04 LTS
Release:20.04
Codename:   focal

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1881004

Title:
  fpu/softfloat.c: error: bitwise negation of a boolean expression

Status in QEMU:
  New


Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Peter Xu
On Wed, May 27, 2020 at 07:06:51PM +0200, Cornelia Huck wrote:
> Personally, I find traces to be quite handy, and it's nice if you can
> just enable more of them if they are in your debugging workflow anyway.
> Probably boils down to a matter of preference :)

Totally agree.  I am actually a heavy user of QEMU tracing system, just like
the rest of the tracing tools all over the world... :)

IMHO the difference between a tracepoint and a manual printf() is mainly the
reusability part - if a debugging printf() is likely to be reused in the
future, then it is a good tracepoint candidate.

Thanks,

-- 
Peter Xu




Re: [PATCH v4 9/9] iotests: rename and move 169 and 199 tests

2020-05-27 Thread Vladimir Sementsov-Ogievskiy

21.05.2020 21:32, John Snow wrote:



On 5/19/20 7:41 AM, Kevin Wolf wrote:

Am 19.05.2020 um 13:32 hat Vladimir Sementsov-Ogievskiy geschrieben:

19.05.2020 12:07, Kevin Wolf wrote:

Am 18.05.2020 um 18:12 hat Thomas Huth geschrieben:

On 15/05/2020 23.15, Vladimir Sementsov-Ogievskiy wrote:

Rename bitmaps migration tests and move them to tests subdirectory to
demonstrate new human-friendly test naming.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   tests/qemu-iotests/{199 => tests/migrate-bitmaps-postcopy-test}   | 0
   .../{199.out => tests/migrate-bitmaps-postcopy-test.out}  | 0
   tests/qemu-iotests/{169 => tests/migrate-bitmaps-test}| 0
   tests/qemu-iotests/{169.out => tests/migrate-bitmaps-test.out}| 0
   4 files changed, 0 insertions(+), 0 deletions(-)
   rename tests/qemu-iotests/{199 => tests/migrate-bitmaps-postcopy-test} (100%)
   rename tests/qemu-iotests/{199.out => 
tests/migrate-bitmaps-postcopy-test.out} (100%)
   rename tests/qemu-iotests/{169 => tests/migrate-bitmaps-test} (100%)
   rename tests/qemu-iotests/{169.out => tests/migrate-bitmaps-test.out} (100%)

diff --git a/tests/qemu-iotests/199 
b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
similarity index 100%
rename from tests/qemu-iotests/199
rename to tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
diff --git a/tests/qemu-iotests/199.out 
b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test.out
similarity index 100%
rename from tests/qemu-iotests/199.out
rename to tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test.out
diff --git a/tests/qemu-iotests/169 
b/tests/qemu-iotests/tests/migrate-bitmaps-test
similarity index 100%
rename from tests/qemu-iotests/169
rename to tests/qemu-iotests/tests/migrate-bitmaps-test
diff --git a/tests/qemu-iotests/169.out 
b/tests/qemu-iotests/tests/migrate-bitmaps-test.out
similarity index 100%
rename from tests/qemu-iotests/169.out
rename to tests/qemu-iotests/tests/migrate-bitmaps-test.out


I like the idea ... but the path name + file names get now quite long.
While you're at it, what about renaming the "qemu-iotests" directory to
just "iotests" or even just "io" now?


Renames are always kind of painful. Do we have a real reason for the
rename except that the paths feel a bit long subjectively?

Of course, if we're renaming all files anyway, changing the directory
name at the same time shouldn't give any additional pain, so it would be
completely reasonable then. We're not renaming the test harness files,
though, and even only two test cases in this patch.

Maybe this final patch should stay RFC until we have the infrastructure
in and then we can have a single series that moves all tests and also
renames the directory? Maybe a not strictly necessary rename of the
tooling would be bearable in the context of a mass rename of tests.


I'm absolutely not hurrying about this thing. And the actual aim of the
series is different. I even doubt that we will mass-rename the tests:
who knows what they all test? I don't.


Good point.

And conversely, there are a few test cases that I do know (like 026 030
040 041 055) and probably wouldn't recognise for a while after a rename.
:-)


Still we may rename some tests, and we'll create new named tests, which
is good enough. OK, if I resend a new version, I'll add an RFC patch
on renaming the directory; up to the maintainers to take it now or not :)


I guess a final patch to rename the directory as an RFC makes sense.
Then we can continue the discussion there and decide whether or not to
apply it without holding up the rest of the series.

I think I would be inclined to leave the name unchanged as long as we
don't have a real reason, but if people overwhelmingly think otherwise,
we can still rename.



It's a pinch long, but it's not a big ordeal. I wouldn't object to
.tests/io as long as all of the renames happened all at once.

I'll admit to not having read the series, but I will miss the ability to
specify ranges of tests. I'm not sure how that happens under the new
proposal; but I think it's a strict improvement to give the tests human
readable names in the filesystem.



Hmm. Actually, I think it's not a problem to continue supporting ranges of
tests for numbered test names, if you need it.

Note the new parameter --start-from, which is there to re-run a failed
./check run from the middle of the process - is that your use case or not?


--
Best regards,
Vladimir



[PULL 0/1] register-api queue

2020-05-27 Thread Alistair Francis
The following changes since commit 06539ebc76b8625587aa78d646a9d8d5fddf84f3:

  Merge remote-tracking branch 
'remotes/philmd-gitlab/tags/mips-hw-next-20200526' into staging (2020-05-26 
20:25:06 +0100)

are available in the Git repository at:

  g...@github.com:alistair23/qemu.git tags/pull-register-api-20200527

for you to fetch changes up to 5932a46c8a419db4a6402ac8ae42953b4d4fef1e:

  hw/registerfields: Prefix local variables with underscore in macros 
(2020-05-27 11:23:07 -0700)


A single patch to avoid clashes with the register field macros.


Philippe Mathieu-Daudé (1):
  hw/registerfields: Prefix local variables with underscore in macros

 include/hw/registerfields.h | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)



[PULL 1/1] hw/registerfields: Prefix local variables with underscore in macros

2020-05-27 Thread Alistair Francis
From: Philippe Mathieu-Daudé 

One can name a local variable holding a value as 'v', but it
currently clashes with the registerfields macros. To save others
from debugging the same mistake, prefix the macros' local variables
with an underscore.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Message-id: 20200510203457.10546-1-f4...@amsat.org
Message-Id: <20200510203457.10546-1-f4...@amsat.org>
Signed-off-by: Alistair Francis 
---
 include/hw/registerfields.h | 40 ++---
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/hw/registerfields.h b/include/hw/registerfields.h
index 0407edb7ec..93fa4a84c2 100644
--- a/include/hw/registerfields.h
+++ b/include/hw/registerfields.h
@@ -66,35 +66,35 @@
 #define FIELD_DP8(storage, reg, field, val) ({\
 struct {  \
 unsigned int v:R_ ## reg ## _ ## field ## _LENGTH;\
-} v = { .v = val };   \
-uint8_t d;\
-d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,   \
-  R_ ## reg ## _ ## field ## _LENGTH, v.v);   \
-d; })
+} _v = { .v = val };  \
+uint8_t _d;   \
+_d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,  \
+  R_ ## reg ## _ ## field ## _LENGTH, _v.v);  \
+_d; })
 #define FIELD_DP16(storage, reg, field, val) ({   \
 struct {  \
 unsigned int v:R_ ## reg ## _ ## field ## _LENGTH;\
-} v = { .v = val };   \
-uint16_t d;   \
-d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,   \
-  R_ ## reg ## _ ## field ## _LENGTH, v.v);   \
-d; })
+} _v = { .v = val };  \
+uint16_t _d;  \
+_d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,  \
+  R_ ## reg ## _ ## field ## _LENGTH, _v.v);  \
+_d; })
 #define FIELD_DP32(storage, reg, field, val) ({   \
 struct {  \
 unsigned int v:R_ ## reg ## _ ## field ## _LENGTH;\
-} v = { .v = val };   \
-uint32_t d;   \
-d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,   \
-  R_ ## reg ## _ ## field ## _LENGTH, v.v);   \
-d; })
+} _v = { .v = val };  \
+uint32_t _d;  \
+_d = deposit32((storage), R_ ## reg ## _ ## field ## _SHIFT,  \
+  R_ ## reg ## _ ## field ## _LENGTH, _v.v);  \
+_d; })
 #define FIELD_DP64(storage, reg, field, val) ({   \
 struct {  \
 unsigned int v:R_ ## reg ## _ ## field ## _LENGTH;\
-} v = { .v = val };   \
-uint64_t d;   \
-d = deposit64((storage), R_ ## reg ## _ ## field ## _SHIFT,   \
-  R_ ## reg ## _ ## field ## _LENGTH, v.v);   \
-d; })
+} _v = { .v = val };  \
+uint64_t _d;  \
+_d = deposit64((storage), R_ ## reg ## _ ## field ## _SHIFT,  \
+  R_ ## reg ## _ ## field ## _LENGTH, _v.v);  \
+_d; })
 
 /* Deposit a field to array of registers.  */
 #define ARRAY_FIELD_DP32(regs, reg, field, val)   \
-- 
2.26.2




Re: [PATCH v7 32/32] iotests: Add tests for qcow2 images with extended L2 entries

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

Signed-off-by: Alberto Garcia 
---
  tests/qemu-iotests/271 | 705 +
  tests/qemu-iotests/271.out | 603 +++
  tests/qemu-iotests/group   |   1 +
  3 files changed, 1309 insertions(+)
  create mode 100755 tests/qemu-iotests/271
  create mode 100644 tests/qemu-iotests/271.out




+_cleanup()
+{
+   _cleanup_test_img
+rm -f "$TEST_IMG.raw"


TAB damage.



+
+l2_offset=262144 # 0x4


If desired, you could write:

l2_offset=$((0x4))

if that is more legible.
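For what it's worth, the suggestion works because `$(( ))` accepts C-style hex literals and expands to a decimal string; a quick illustration with an unrelated value:

```shell
# $(( )) understands 0x-prefixed hex, so the offset can be written
# in whichever base reads best.
offset=$((0x10))
echo "$offset"   # prints 16
```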


+
+_verify_img()
+{
+$QEMU_IMG compare "$TEST_IMG" "$TEST_IMG.raw" | grep -v 'Images are identical'
+$QEMU_IMG check "$TEST_IMG" | _filter_qemu_img_check | \
+grep -v 'No errors were found on the image'
+}
+
+# Compare the bitmap of an extended L2 entry against an expected value
+_verify_l2_bitmap()
+{
+entry_no="$1"# L2 entry number, starting from 0
+expected_alloc="$alloc"  # Space-separated list of allocated subcluster indexes
+expected_zero="$zero"# Space-separated list of zero subcluster indexes
+
+offset=$(($l2_offset + $entry_no * 16))
+entry=`peek_file_be "$TEST_IMG" $offset 8`


I find $() easier to read than ``.
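One concrete advantage: `$()` nests without the backslash-escaping that backticks require, e.g.:

```shell
# $() nests cleanly; the backtick equivalent needs \` escaping.
nested=$(echo "outer-$(echo inner)")
echo "$nested"   # prints outer-inner
```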


+offset=$(($offset + 8))
+bitmap=`peek_file_be "$TEST_IMG" $offset 8`
+
+expected_bitmap=0
+for bit in $expected_alloc; do
+expected_bitmap=$(($expected_bitmap | (1 << $bit)))
+done
+for bit in $expected_zero; do
+expected_bitmap=$(($expected_bitmap | (1 << (32 + $bit
+done
+expected_bitmap=`printf "%llu" $expected_bitmap`


Dead statement - expected_bitmap is already a 64-bit decimal number 
without reprinting it to itself.
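Indeed, shell arithmetic expansion already yields a decimal string, so the `printf "%llu"` round-trip changes nothing:

```shell
# The arithmetic result is a decimal string as-is; no reformatting
# step is needed before comparing it.
bitmap=$(( (1 << 3) | (1 << 35) ))
echo "$bitmap"   # prints 34359738376
```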



+
+printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
+if [ "$bitmap" != "$expected_bitmap" ]; then
+printf "ERROR: expecting bitmap   0x%016lx\n" "$expected_bitmap"
+fi
+}
+
+_test_write()
+{
+cmd="$1"
+l2_entry_idx="$2"
+[ -n "$l2_entry_idx" ] || l2_entry_idx=0


Could shorten these two lines into the one-liner:

l2_entry_idx=${2:-0}
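The `${N:-default}` form substitutes the default when the positional parameter is unset or empty — a quick sketch:

```shell
# ${1:-0} falls back to 0 when no argument (or an empty one) is given.
pick_index() {
    echo "${1:-0}"
}
pick_index      # prints 0
pick_index 7    # prints 7
```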


+raw_cmd=`echo $cmd | sed s/-c//` # Raw images don't support -c
+echo "$cmd"
+$QEMU_IO -c "$cmd" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "$raw_cmd" -f raw "$TEST_IMG.raw" | _filter_qemu_io
+_verify_img
+_verify_l2_bitmap "$l2_entry_idx"
+}
+
+_reset_img()
+{
+size="$1"
+$QEMU_IMG create -f raw "$TEST_IMG.raw" "$size" | _filter_img_create
+if [ "$use_backing_file" = "yes" ]; then
+$QEMU_IMG create -f raw "$TEST_IMG.base" "$size" | _filter_img_create
+$QEMU_IO -c "write -q -P 0xFF 0 $size" -f raw "$TEST_IMG.base" | _filter_qemu_io
+$QEMU_IO -c "write -q -P 0xFF 0 $size" -f raw "$TEST_IMG.raw" | _filter_qemu_io
+_make_test_img -o extended_l2=on -b "$TEST_IMG.base" "$size"


Semantic conflict with my patches to deprecate omitting -F when creating 
images with -b.  If you don't add '-F $IMGFMT' here, then my series will 
have to do it.



+else
+_make_test_img -o extended_l2=on "$size"
+fi
+}
+
+# Test that writing to an image with subclusters produces the expected
+# results, in images with and without backing files
+for use_backing_file in yes no; do
+echo
+echo "### Standard write tests (backing file: $use_backing_file) ###"
+echo
+_reset_img 1M
+### Write subcluster #0 (beginning of subcluster) ###
+alloc="0"; zero=""
+_test_write 'write -q -P 1 0 1k'
+
+### Write subcluster #1 (middle of subcluster) ###
+alloc="0 1"; zero=""
+_test_write 'write -q -P 2 3k 512'
+
+### Write subcluster #2 (end of subcluster) ###
+alloc="0 1 2"; zero=""
+_test_write 'write -q -P 3 5k 1k'
+
+### Write subcluster #3 (full subcluster) ###
+alloc="0 1 2 3"; zero=""
+_test_write 'write -q -P 4 6k 2k'
+
+### Write subclusters #4-6 (full subclusters) ###
+alloc="`seq 0 6`"; zero=""


More places where `` could be changed to $()


+_test_write 'write -q -P 5 8k 6k'
+
+### Write subclusters #7-9 (partial subclusters) ###
+alloc="`seq 0 9`"; zero=""
+_test_write 'write -q -P 6 15k 4k'
+
+### Write subcluster #16 (partial subcluster) ###
+alloc="`seq 0 9` 16"; zero=""
+_test_write 'write -q -P 7 32k 1k'
+
+### Write subcluster #31-#33 (cluster overlap) ###
+alloc="`seq 0 9` 16 31"; zero=""
+_test_write 'write -q -P 8 63k 4k'
+alloc="0 1" ; zero=""
+_verify_l2_bitmap 1
+
+### Zero subcluster #1
+alloc="0 `seq 2 9` 16 31"; zero="1"
+_test_write 'write -q -z 2k 2k'
+
+### Zero cluster #0
+alloc=""; zero="`seq 0 31`"
+_test_write 'write -q -z 0 64k'
+
+### Fill cluster #0 with data
+alloc="`seq 0 31`"; zero=""
+_test_write 'write -q -P 9 0 64k'
+
+### Zero and unmap half of cluster #0 (this won't unmap it)
+alloc="`seq 16 31`"; zero="`seq 0 15`"
+_test_write 'write -q -z 

Re: [PATCH v7 30/32] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

Now that the implementation of subclusters is complete we can finally
add the necessary options to create and read images with this feature,
which we call "extended L2 entries".

Signed-off-by: Alberto Garcia 
---


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 28/32] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

This works now at the subcluster level and pwrite_zeroes_alignment is
updated accordingly.

qcow2_cluster_zeroize() is turned into qcow2_subcluster_zeroize() with
the following changes:

- The request can now be subcluster-aligned.

- The cluster-aligned body of the request is still zeroized using
  zero_in_l2_slice() as before.

- The subcluster-aligned head and tail of the request are zeroized
  with the new zero_l2_subclusters() function.

There is just one thing to take into account for a possible future
improvement: compressed clusters cannot be partially zeroized so
zero_l2_subclusters() on the head or the tail can return -ENOTSUP.
This makes the caller repeat the *complete* request and write actual
zeroes to disk. This is sub-optimal because

1) if the head area was compressed we would still be able to use
   the fast path for the body and possibly the tail.

2) if the tail area was compressed we are writing zeroes to the
   head and the body areas, which are already zeroized.


Is this true?  The block layer tries hard to break zero requests up so 
that any non-cluster-aligned requests do not cross cluster boundaries. 
In practice, that means that when you have an unaligned request, the 
head and tail cluster will be the same cluster, and there is no body in 
play, so that returning -ENOTSUP is correct because there really is no 
other work to do and repeating the entire request (which is less than a 
cluster in length) is the right approach.




Signed-off-by: Alberto Garcia 
---
  block/qcow2.h |  4 +--
  block/qcow2-cluster.c | 80 +++
  block/qcow2.c | 27 ---
  3 files changed, 90 insertions(+), 21 deletions(-)


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/4] coroutine: Add check for SafeStack in sigalstack

2020-05-27 Thread Daniele Buono

Sorry, missed the question at the end of the email.
Will change the commit and error message to explain better in v2.

Similar to the ucontext case, sigaltstack does not work out of the box
because it requires a stack to be allocated by the user.

I'll be honest, I didn't check the details of how sigaltstack is used in
coroutine-sigaltstack. It is very possible that this coroutine version
can also be adapted to run properly with SafeStack.
coroutine_trampoline is probably the place where we can set the usp so
that, when coroutine_bootstrap is called, it is done with the new unsafe
stack.

My initial focus was on the ucontext implementation since that is the
default one and mostly used. For now, I am just blocking the user from
using coroutine-sigaltstack with SafeStack.

However, if you are interested, I can try to add support for sigaltstack
in the future.

On 5/21/2020 5:49 AM, Stefan Hajnoczi wrote:

On Wed, Apr 29, 2020 at 03:44:18PM -0400, Daniele Buono wrote:

s/sigalstack/sigaltstack/ in the commit message.


LLVM's SafeStack instrumentation cannot be used inside signal handlers
that make use of sigaltstack().
Since coroutine-sigaltstack relies on sigaltstack(), it is not
compatible with SafeStack. The resulting binary is incorrect, with
different coroutines sharing the same unsafe stack and producing
undefined behavior at runtime.
To avoid this, we add a check in coroutine-sigaltstack that throws a
preprocessor #error and interrupts the compilation if SafeStack is
enabled.

Signed-off-by: Daniele Buono 
---
  util/coroutine-sigaltstack.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index f6fc49a0e5..b7cdc959f8 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -30,6 +30,10 @@
  #include "qemu-common.h"
  #include "qemu/coroutine_int.h"
  
+#ifdef CONFIG_SAFESTACK

+#error "SafeStack does not work with sigaltstack's implementation"
+#endif


Neither the commit description nor the #error message explain why it
doesn't work. Could it work in the future or is there a fundamental
reason why it will never work?

Stefan





Re: [PATCH v3 0/3] RISC-V: Remove deprecated ISA, CPUs and machines

2020-05-27 Thread Alistair Francis
On Wed, May 27, 2020 at 12:17 AM Thomas Huth  wrote:
>
> On 27/05/2020 00.47, Alistair Francis wrote:
> >
> >  include/hw/riscv/spike.h  |   6 +-
> >  target/riscv/cpu.h|   8 -
> >  hw/riscv/spike.c  | 217 --
> >  target/riscv/cpu.c|  30 ---
> >  target/riscv/cpu_helper.c |  82 +++
> >  target/riscv/csr.c| 118 ++
> >  .../riscv/insn_trans/trans_privileged.inc.c   |  18 +-
> >  target/riscv/monitor.c|   5 -
> >  target/riscv/op_helper.c  |  17 +-
> >  tests/qtest/machine-none-test.c   |   4 +-
> >  10 files changed, 60 insertions(+), 445 deletions(-)
>
>  Hi,
>
> looking at the diffstat, I think you should also edit
> ./docs/system/deprecated.rst according to your changes?

I'm not sure what I should edit there. These are already marked as
deprecated, do I need to make a change when they are removed?

Alistair

>
>  Thomas
>



Re: [PATCH 3/7] GitLab CI: avoid calling before_scripts on unintended jobs

2020-05-27 Thread Alex Bennée


Thomas Huth  writes:

> From: Cleber Rosa 
>
> At this point it seems that all jobs depend on those steps, with
> maybe the EDK2 jobs as exceptions.
>
> The jobs that will be added later will not want those scripts to be
> run, so let's move these steps to the appropriate jobs, while
> still trying to avoid repetition.
>
> Signed-off-by: Cleber Rosa 
> [thuth: Rebased to current master branch, use separate template]
> Signed-off-by: Thomas Huth 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée



Re: [PATCH 4/7] gitlab-ci: Move edk2 and opensbi YAML files to .gitlab-ci.d folder

2020-05-27 Thread Alex Bennée


Thomas Huth  writes:

> We have a dedicated folder for the gitlab-ci - so there is no need
> to clutter the top directory with these .yml files.
>
> Signed-off-by: Thomas Huth 

Reviewed-by: Alex Bennée 

> ---
>  .gitlab-ci-edk2.yml => .gitlab-ci.d/edk2.yml   | 0
>  .gitlab-ci-opensbi.yml => .gitlab-ci.d/opensbi.yml | 0
>  .gitlab-ci.yml | 4 ++--
>  MAINTAINERS| 2 +-
>  4 files changed, 3 insertions(+), 3 deletions(-)
>  rename .gitlab-ci-edk2.yml => .gitlab-ci.d/edk2.yml (100%)
>  rename .gitlab-ci-opensbi.yml => .gitlab-ci.d/opensbi.yml (100%)
>
> diff --git a/.gitlab-ci-edk2.yml b/.gitlab-ci.d/edk2.yml
> similarity index 100%
> rename from .gitlab-ci-edk2.yml
> rename to .gitlab-ci.d/edk2.yml
> diff --git a/.gitlab-ci-opensbi.yml b/.gitlab-ci.d/opensbi.yml
> similarity index 100%
> rename from .gitlab-ci-opensbi.yml
> rename to .gitlab-ci.d/opensbi.yml
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index bc6aee6aba..5208d93ff8 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -1,6 +1,6 @@
>  include:
> -  - local: '/.gitlab-ci-edk2.yml'
> -  - local: '/.gitlab-ci-opensbi.yml'
> +  - local: '/.gitlab-ci.d/edk2.yml'
> +  - local: '/.gitlab-ci.d/opensbi.yml'
>  
>  .update_apt_template: &before_script_apt
>   before_script:
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bde5fd480f..d43c98115c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2534,7 +2534,7 @@ F: roms/edk2
>  F: roms/edk2-*
>  F: tests/data/uefi-boot-images/
>  F: tests/uefi-test-tools/
> -F: .gitlab-ci-edk2.yml
> +F: .gitlab-ci.d/edk2.yml
>  F: .gitlab-ci.d/edk2/
>  
>  Usermode Emulation


-- 
Alex Bennée



Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Cornelia Huck
On Wed, 27 May 2020 12:53:30 -0400
Peter Xu  wrote:

> On Wed, May 27, 2020 at 06:27:38PM +0200, Philippe Mathieu-Daudé wrote:
> > On 5/27/20 6:16 PM, Peter Xu wrote:  
> > > On Wed, May 27, 2020 at 05:53:16PM +0200, Philippe Mathieu-Daudé wrote:  
> > > +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
> > > +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, 
> > > iommu[i].type)) {
> > > +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name); 
> > >  
> >  Just wondering why you want to trace the type as you now have the name
> >  string.  
> > >>>
> > >>> You are right :)
> > >>>  
> > > +return iommu[i].type;
> > >  }
> > >  }
> > > +trace_vfio_get_iommu_type(-1, "Not available or not supported"); 
> > >  
> >  nit: from a debugging pov, this may be not needed as
> >  vfio_get_group/vfio_connect_container() fails and this leads to an 
> >  error
> >  output.  
> > >>
> > >> But you can reach this for example using No-IOMMU. If you don't mind, I
> > >> find having this information in the trace log clearer.  
> > > 
> > > I kinda agree with Eric - AFAICT QEMU vfio-pci don't work with no-iommu, 
> > > then
> > > it seems meaningless to trace it...
> > > 
> > > I'm not sure whether this trace is extremely helpful because syscalls 
> > > like this
> > > could be easily traced by things like strace or bpftrace as general tools 
> > > (and
> > > this information should be a one-time thing rather than dynamically 
> > > changing),
> > > no strong opinion though.  Also, if we want to dump something, maybe it's
> > > better to do in vfio_init_container() after vfio_get_iommu_type() 
> > > succeeded, so
> > > we dump which container is enabled with what type of iommu.  
> > 
> > OK. I'm a recent VFIO user so maybe I am not using the right information.
> > 
> > This trace helps me while working on a new device feature, I didn't
> > thought about gathering it in a production because there I'd expect
> > things to work.
> > 
> > Now in my case what I want is to know is if I'm using a v1 or v2 type.
> > Maybe this information is already available in /proc or /sys and we
> > don't need this patch...  
> 
> I don't know such /proc or /sys, so maybe it's still useful. I guess Alex 
> would
> have the best judgement. The strace/bpftrace things are not really reasons I
> found to nak this patch, but just something I thought first that could be
> easier when any of us wants to peak at those information, probably something
> just FYI. :-)

Personally, I find traces to be quite handy, and it's nice if you can
just enable more of them if they are in your debugging workflow anyway.
Probably boils down to a matter of preference :)




[PATCH v4 09/10] riscv/opentitan: Connect the UART device

2020-05-27 Thread Alistair Francis
Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/riscv/opentitan.h | 13 +
 hw/riscv/opentitan.c | 24 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 8d6a09b696..825a3610bc 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -21,6 +21,7 @@
 
 #include "hw/riscv/riscv_hart.h"
 #include "hw/intc/ibex_plic.h"
+#include "hw/char/ibex_uart.h"
 
 #define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
 #define RISCV_IBEX_SOC(obj) \
@@ -33,6 +34,7 @@ typedef struct LowRISCIbexSoCState {
 /*< public >*/
 RISCVHartArrayState cpus;
 IbexPlicState plic;
+IbexUartState uart;
 
 MemoryRegion flash_mem;
 MemoryRegion rom;
@@ -63,4 +65,15 @@ enum {
 IBEX_USBDEV,
 };
 
+enum {
+IBEX_UART_RX_PARITY_ERR_IRQ = 0x28,
+IBEX_UART_RX_TIMEOUT_IRQ = 0x27,
+IBEX_UART_RX_BREAK_ERR_IRQ = 0x26,
+IBEX_UART_RX_FRAME_ERR_IRQ = 0x25,
+IBEX_UART_RX_OVERFLOW_IRQ = 0x24,
+IBEX_UART_TX_EMPTY_IRQ = 0x23,
+IBEX_UART_RX_WATERMARK_IRQ = 0x22,
+IBEX_UART_TX_WATERMARK_IRQ = 0x21
+};
+
 #endif
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index 3926321d8c..a6c0b949ca 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -96,6 +96,9 @@ static void riscv_lowrisc_ibex_soc_init(Object *obj)
 
 sysbus_init_child_obj(obj, "plic", &s->plic,
   sizeof(s->plic), TYPE_IBEX_PLIC);
+
+sysbus_init_child_obj(obj, "uart", &s->uart,
+  sizeof(s->uart), TYPE_IBEX_UART);
 }
 
 static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
@@ -137,8 +140,25 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 busdev = SYS_BUS_DEVICE(dev);
 sysbus_mmio_map(busdev, 0, memmap[IBEX_PLIC].base);
 
-create_unimplemented_device("riscv.lowrisc.ibex.uart",
-memmap[IBEX_UART].base, memmap[IBEX_UART].size);
+/* UART */
+dev = DEVICE(&(s->uart));
+qdev_prop_set_chr(dev, "chardev", serial_hd(0));
+object_property_set_bool(OBJECT(&s->uart), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+busdev = SYS_BUS_DEVICE(dev);
+sysbus_mmio_map(busdev, 0, memmap[IBEX_UART].base);
+sysbus_connect_irq(busdev, 0, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_TX_WATERMARK_IRQ));
+sysbus_connect_irq(busdev, 1, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_RX_WATERMARK_IRQ));
+sysbus_connect_irq(busdev, 2, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_TX_EMPTY_IRQ));
+sysbus_connect_irq(busdev, 3, qdev_get_gpio_in(DEVICE(&s->plic),
+   IBEX_UART_RX_OVERFLOW_IRQ));
+
 create_unimplemented_device("riscv.lowrisc.ibex.gpio",
 memmap[IBEX_GPIO].base, memmap[IBEX_GPIO].size);
 create_unimplemented_device("riscv.lowrisc.ibex.spi",
-- 
2.26.2




[PATCH v4 07/10] hw/intc: Initial commit of lowRISC Ibex PLIC

2020-05-27 Thread Alistair Francis
The Ibex core contains a PLIC that, although similar to the RISC-V spec,
is not RISC-V spec compliant.

This patch implements an Ibex PLIC in a somewhat generic way.

As the current RISC-V PLIC needs tidying up, my hope is that as the Ibex
PLIC moves towards spec compliance, this PLIC implementation can be
updated until it can replace the current PLIC.

Signed-off-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/intc/ibex_plic.h |  63 +
 hw/intc/ibex_plic.c | 261 
 MAINTAINERS |   2 +
 hw/intc/Makefile.objs   |   1 +
 4 files changed, 327 insertions(+)
 create mode 100644 include/hw/intc/ibex_plic.h
 create mode 100644 hw/intc/ibex_plic.c

diff --git a/include/hw/intc/ibex_plic.h b/include/hw/intc/ibex_plic.h
new file mode 100644
index 00..ddc7909903
--- /dev/null
+++ b/include/hw/intc/ibex_plic.h
@@ -0,0 +1,63 @@
+/*
+ * QEMU RISC-V lowRISC Ibex PLIC
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#ifndef HW_IBEX_PLIC_H
+#define HW_IBEX_PLIC_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_IBEX_PLIC "ibex-plic"
+#define IBEX_PLIC(obj) \
+OBJECT_CHECK(IbexPlicState, (obj), TYPE_IBEX_PLIC)
+
+typedef struct IbexPlicState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+MemoryRegion mmio;
+
+uint32_t *pending;
+uint32_t *source;
+uint32_t *priority;
+uint32_t *enable;
+uint32_t threshold;
+uint32_t claim;
+
+/* config */
+uint32_t num_cpus;
+uint32_t num_sources;
+
+uint32_t pending_base;
+uint32_t pending_num;
+
+uint32_t source_base;
+uint32_t source_num;
+
+uint32_t priority_base;
+uint32_t priority_num;
+
+uint32_t enable_base;
+uint32_t enable_num;
+
+uint32_t threshold_base;
+
+uint32_t claim_base;
+} IbexPlicState;
+
+#endif /* HW_IBEX_PLIC_H */
diff --git a/hw/intc/ibex_plic.c b/hw/intc/ibex_plic.c
new file mode 100644
index 00..41079518c6
--- /dev/null
+++ b/hw/intc/ibex_plic.c
@@ -0,0 +1,261 @@
+/*
+ * QEMU RISC-V lowRISC Ibex PLIC
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Documentation available: https://docs.opentitan.org/hw/ip/rv_plic/doc/
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/qdev-properties.h"
+#include "hw/core/cpu.h"
+#include "hw/boards.h"
+#include "hw/pci/msi.h"
+#include "target/riscv/cpu_bits.h"
+#include "target/riscv/cpu.h"
+#include "hw/intc/ibex_plic.h"
+
+static bool addr_between(uint32_t addr, uint32_t base, uint32_t num)
+{
+uint32_t end = base + (num * 0x04);
+
+if (addr >= base && addr < end) {
+return true;
+}
+
+return false;
+}
+
+static void ibex_plic_irqs_set_pending(IbexPlicState *s, int irq, bool level)
+{
+int pending_num = irq / 32;
+
+s->pending[pending_num] |= level << (irq % 32);
+}
+
+static bool ibex_plic_irqs_pending(IbexPlicState *s, uint32_t context)
+{
+int i;
+
+for (i = 0; i < s->pending_num; i++) {
+uint32_t irq_num = ctz64(s->pending[i]) + (i * 32);
+
+if (!(s->pending[i] & s->enable[i])) {
+/* No pending and enabled IRQ */
+continue;
+}
+
+if (s->priority[irq_num] > s->threshold) {
+if (!s->claim) {
+s->claim = irq_num;
+}
+return true;
+}
+}
+
+return false;
+}
+
+static void ibex_plic_update(IbexPlicState *s)
+{
+CPUState *cpu;
+int level, i;
+
+for (i = 0; i < s->num_cpus; i++) {
+cpu = qemu_get_cpu(i);
+
+if (!cpu) {
+continue;
+}
+
+level = ibex_plic_irqs_pending(s, 0);
+
+riscv_cpu_update_mip(RISCV_CPU(cpu), MIP_MEIP, BOOL_TO_MASK(level));
+}
+}
+
+static voi

[PATCH v4 10/10] target/riscv: Use a smaller guess size for no-MMU PMP

2020-05-27 Thread Alistair Francis
Signed-off-by: Alistair Francis 
---
 target/riscv/pmp.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
index 0e6b640fbd..9418660f1b 100644
--- a/target/riscv/pmp.c
+++ b/target/riscv/pmp.c
@@ -233,12 +233,16 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
 return true;
 }
 
-/*
- * if size is unknown (0), assume that all bytes
- * from addr to the end of the page will be accessed.
- */
 if (size == 0) {
-pmp_size = -(addr | TARGET_PAGE_MASK);
+if (riscv_feature(env, RISCV_FEATURE_MMU)) {
+/*
+ * If size is unknown (0), assume that all bytes
+ * from addr to the end of the page will be accessed.
+ */
+pmp_size = -(addr | TARGET_PAGE_MASK);
+} else {
+pmp_size = sizeof(target_ulong);
+}
 } else {
 pmp_size = size;
 }
-- 
2.26.2




[PATCH v4 04/10] target/riscv: Add the lowRISC Ibex CPU

2020-05-27 Thread Alistair Francis
Ibex is a small and efficient, 32-bit, in-order RISC-V core with
a 2-stage pipeline that implements the RV32IMC instruction set
architecture.

For more details on lowRISC see here:
https://github.com/lowRISC/ibex

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: LIU Zhiwei 
---
 target/riscv/cpu.h |  1 +
 target/riscv/cpu.c | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d0e7f5b9c5..8733d7467f 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -35,6 +35,7 @@
 #define TYPE_RISCV_CPU_ANY  RISCV_CPU_TYPE_NAME("any")
 #define TYPE_RISCV_CPU_BASE32   RISCV_CPU_TYPE_NAME("rv32")
 #define TYPE_RISCV_CPU_BASE64   RISCV_CPU_TYPE_NAME("rv64")
+#define TYPE_RISCV_CPU_IBEX RISCV_CPU_TYPE_NAME("lowrisc-ibex")
 #define TYPE_RISCV_CPU_SIFIVE_E31   RISCV_CPU_TYPE_NAME("sifive-e31")
 #define TYPE_RISCV_CPU_SIFIVE_E34   RISCV_CPU_TYPE_NAME("sifive-e34")
 #define TYPE_RISCV_CPU_SIFIVE_E51   RISCV_CPU_TYPE_NAME("sifive-e51")
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8deba3d16d..4761ebad42 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -154,6 +154,16 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
+static void rv32imcu_nommu_cpu_init(Object *obj)
+{
+CPURISCVState *env = &RISCV_CPU(obj)->env;
+set_misa(env, RV32 | RVI | RVM | RVC | RVU);
+set_priv_version(env, PRIV_VERSION_1_10_0);
+set_resetvec(env, 0x8090);
+set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
+}
+
 static void rv32imacu_nommu_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -618,6 +628,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
 DEFINE_CPU(TYPE_RISCV_CPU_ANY,  riscv_any_cpu_init),
 #if defined(TARGET_RISCV32)
 DEFINE_CPU(TYPE_RISCV_CPU_BASE32,   riscv_base32_cpu_init),
+DEFINE_CPU(TYPE_RISCV_CPU_IBEX, rv32imcu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E31,   rv32imacu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E34,   rv32imafcu_nommu_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U34,   rv32gcsu_priv1_10_0_cpu_init),
-- 
2.26.2




[PATCH v4 02/10] target/riscv: Don't overwrite the reset vector

2020-05-27 Thread Alistair Francis
The reset vector is set in the init function, so don't set it again in
realize.

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 target/riscv/cpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 059d71f2c7..5eb3c02735 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -133,6 +133,7 @@ static void riscv_base32_cpu_init(Object *obj)
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 /* We set this in the realise function */
 set_misa(env, 0);
+set_resetvec(env, DEFAULT_RSTVEC);
 }
 
 static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
@@ -180,6 +181,7 @@ static void riscv_base64_cpu_init(Object *obj)
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 /* We set this in the realise function */
 set_misa(env, 0);
+set_resetvec(env, DEFAULT_RSTVEC);
 }
 
 static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
@@ -399,7 +401,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 }
 
 set_priv_version(env, priv_version);
-set_resetvec(env, DEFAULT_RSTVEC);
 
 if (cpu->cfg.mmu) {
 set_feature(env, RISCV_FEATURE_MMU);
-- 
2.26.2




[PATCH v4 08/10] riscv/opentitan: Connect the PLIC device

2020-05-27 Thread Alistair Francis
Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/riscv/opentitan.h |  3 +++
 hw/riscv/opentitan.c | 19 +--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 15a3d87ed0..8d6a09b696 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -20,6 +20,7 @@
 #define HW_OPENTITAN_H
 
 #include "hw/riscv/riscv_hart.h"
+#include "hw/intc/ibex_plic.h"
 
 #define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
 #define RISCV_IBEX_SOC(obj) \
@@ -31,6 +32,8 @@ typedef struct LowRISCIbexSoCState {
 
 /*< public >*/
 RISCVHartArrayState cpus;
+IbexPlicState plic;
+
 MemoryRegion flash_mem;
 MemoryRegion rom;
 } LowRISCIbexSoCState;
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index c00f0720ab..3926321d8c 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -25,6 +25,7 @@
 #include "hw/misc/unimp.h"
 #include "hw/riscv/boot.h"
 #include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
 
 static const struct MemmapEntry {
 hwaddr base;
@@ -92,6 +93,9 @@ static void riscv_lowrisc_ibex_soc_init(Object *obj)
 object_initialize_child(obj, "cpus", &s->cpus,
 sizeof(s->cpus), TYPE_RISCV_HART_ARRAY,
 &error_abort, NULL);
+
+sysbus_init_child_obj(obj, "plic", &s->plic,
+  sizeof(s->plic), TYPE_IBEX_PLIC);
 }
 
 static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
@@ -100,6 +104,9 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 MachineState *ms = MACHINE(qdev_get_machine());
 LowRISCIbexSoCState *s = RISCV_IBEX_SOC(dev_soc);
 MemoryRegion *sys_mem = get_system_memory();
+DeviceState *dev;
+SysBusDevice *busdev;
+Error *err = NULL;
 
 object_property_set_str(OBJECT(&s->cpus), ms->cpu_type, "cpu-type",
 &error_abort);
@@ -120,6 +127,16 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 memory_region_add_subregion(sys_mem, memmap[IBEX_FLASH].base,
 &s->flash_mem);
 
+/* PLIC */
+dev = DEVICE(&s->plic);
+object_property_set_bool(OBJECT(&s->plic), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+busdev = SYS_BUS_DEVICE(dev);
+sysbus_mmio_map(busdev, 0, memmap[IBEX_PLIC].base);
+
 create_unimplemented_device("riscv.lowrisc.ibex.uart",
 memmap[IBEX_UART].base, memmap[IBEX_UART].size);
 create_unimplemented_device("riscv.lowrisc.ibex.gpio",
@@ -134,8 +151,6 @@ static void riscv_lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
 memmap[IBEX_AES].base, memmap[IBEX_AES].size);
 create_unimplemented_device("riscv.lowrisc.ibex.hmac",
 memmap[IBEX_HMAC].base, memmap[IBEX_HMAC].size);
-create_unimplemented_device("riscv.lowrisc.ibex.plic",
-memmap[IBEX_PLIC].base, memmap[IBEX_PLIC].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_PINMUX].base, memmap[IBEX_PINMUX].size);
 create_unimplemented_device("riscv.lowrisc.ibex.alert_handler",
-- 
2.26.2




[PATCH v4 03/10] target/riscv: Disable the MMU correctly

2020-05-27 Thread Alistair Francis
Previously if we didn't enable the MMU it would be enabled in the
realize() function anyway. Let's ensure that if we don't want the MMU we
disable it. We also don't need to enable the MMU as it will be enabled
in realize() by default.

Signed-off-by: Alistair Francis 
---
 target/riscv/cpu.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5eb3c02735..8deba3d16d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -142,7 +142,6 @@ static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -152,7 +151,6 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -163,6 +161,7 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 static void rv32imafcu_nommu_cpu_init(Object *obj)
@@ -172,6 +171,7 @@ static void rv32imafcu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 #elif defined(TARGET_RISCV64)
@@ -190,7 +190,6 @@ static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_09_1);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -200,7 +199,6 @@ static void rv64gcsu_priv1_10_0_cpu_init(Object *obj)
 set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
-set_feature(env, RISCV_FEATURE_MMU);
 set_feature(env, RISCV_FEATURE_PMP);
 }
 
@@ -211,6 +209,7 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
 set_priv_version(env, PRIV_VERSION_1_10_0);
 set_resetvec(env, DEFAULT_RSTVEC);
 set_feature(env, RISCV_FEATURE_PMP);
+qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
 
 #endif
-- 
2.26.2




[PATCH v4 06/10] hw/char: Initial commit of Ibex UART

2020-05-27 Thread Alistair Francis
This is the initial commit of the Ibex UART device. Serial TX is
working, while RX has been implemented but untested.

This is based on the documentation from:
https://docs.opentitan.org/hw/ip/uart/doc/

Signed-off-by: Alistair Francis 
---
 include/hw/char/ibex_uart.h | 110 
 hw/char/ibex_uart.c | 492 
 MAINTAINERS |   2 +
 hw/char/Makefile.objs   |   1 +
 hw/riscv/Kconfig|   4 +
 5 files changed, 609 insertions(+)
 create mode 100644 include/hw/char/ibex_uart.h
 create mode 100644 hw/char/ibex_uart.c

diff --git a/include/hw/char/ibex_uart.h b/include/hw/char/ibex_uart.h
new file mode 100644
index 00..2bec772615
--- /dev/null
+++ b/include/hw/char/ibex_uart.h
@@ -0,0 +1,110 @@
+/*
+ * QEMU lowRISC Ibex UART device
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef HW_IBEX_UART_H
+#define HW_IBEX_UART_H
+
+#include "hw/sysbus.h"
+#include "chardev/char-fe.h"
+#include "qemu/timer.h"
+
+#define IBEX_UART_INTR_STATE   0x00
+#define INTR_STATE_TX_WATERMARK (1 << 0)
+#define INTR_STATE_RX_WATERMARK (1 << 1)
+#define INTR_STATE_TX_EMPTY (1 << 2)
+#define INTR_STATE_RX_OVERFLOW  (1 << 3)
+#define IBEX_UART_INTR_ENABLE  0x04
+#define IBEX_UART_INTR_TEST0x08
+
+#define IBEX_UART_CTRL 0x0c
+#define UART_CTRL_TX_ENABLE (1 << 0)
+#define UART_CTRL_RX_ENABLE (1 << 1)
+#define UART_CTRL_NF(1 << 2)
+#define UART_CTRL_SLPBK (1 << 4)
+#define UART_CTRL_LLPBK (1 << 5)
+#define UART_CTRL_PARITY_EN (1 << 6)
+#define UART_CTRL_PARITY_ODD(1 << 7)
+#define UART_CTRL_RXBLVL(3 << 8)
+#define UART_CTRL_NCO   (0xffff << 16)
+
+#define IBEX_UART_STATUS   0x10
+#define UART_STATUS_TXFULL  (1 << 0)
+#define UART_STATUS_RXFULL  (1 << 1)
+#define UART_STATUS_TXEMPTY (1 << 2)
+#define UART_STATUS_RXIDLE  (1 << 4)
+#define UART_STATUS_RXEMPTY (1 << 5)
+
+#define IBEX_UART_RDATA0x14
+#define IBEX_UART_WDATA0x18
+
+#define IBEX_UART_FIFO_CTRL0x1c
+#define FIFO_CTRL_RXRST  (1 << 0)
+#define FIFO_CTRL_TXRST  (1 << 1)
+#define FIFO_CTRL_RXILVL (7 << 2)
+#define FIFO_CTRL_RXILVL_SHIFT   (2)
+#define FIFO_CTRL_TXILVL (3 << 5)
+#define FIFO_CTRL_TXILVL_SHIFT   (5)
+
+#define IBEX_UART_FIFO_STATUS  0x20
+#define IBEX_UART_OVRD 0x24
+#define IBEX_UART_VAL  0x28
+#define IBEX_UART_TIMEOUT_CTRL 0x2c
+
+#define IBEX_UART_TX_FIFO_SIZE 16
+
+#define TYPE_IBEX_UART "ibex-uart"
+#define IBEX_UART(obj) \
+OBJECT_CHECK(IbexUartState, (obj), TYPE_IBEX_UART)
+
+typedef struct {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+MemoryRegion mmio;
+
+uint8_t tx_fifo[IBEX_UART_TX_FIFO_SIZE];
+uint32_t tx_level;
+
+QEMUTimer *fifo_trigger_handle;
+uint64_t char_tx_time;
+
+uint32_t uart_intr_state;
+uint32_t uart_intr_enable;
+uint32_t uart_ctrl;
+uint32_t uart_status;
+uint32_t uart_rdata;
+uint32_t uart_fifo_ctrl;
+uint32_t uart_fifo_status;
+uint32_t uart_ovrd;
+uint32_t uart_val;
+uint32_t uart_timeout_ctrl;
+
+CharBackend chr;
+qemu_irq tx_watermark;
+qemu_irq rx_watermark;
+qemu_irq tx_empty;
+qemu_irq rx_overflow;
+} IbexUartState;
+#endif /* HW_IBEX_UART_H */
diff --git a/hw/char/ibex_uart.c b/hw/char/ibex_uart.c
new file mode 100644
index 00..c416325d73
--- /dev/null
+++ b/hw/char/ibex_uart.c
@@ -0,0 +1,492 @@
+/*
+ * QEMU lowRISC Ibex UART device
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * For details check the documentation here:
+ *https://docs.opentitan.org/hw/ip/uart/doc/
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), 

[PATCH v4 05/10] riscv: Initial commit of OpenTitan machine

2020-05-27 Thread Alistair Francis
This adds a barebone OpenTitan machine to QEMU.

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 default-configs/riscv32-softmmu.mak |   1 +
 default-configs/riscv64-softmmu.mak |  11 +-
 include/hw/riscv/opentitan.h|  63 +++
 hw/riscv/opentitan.c| 169 
 MAINTAINERS |   9 ++
 hw/riscv/Kconfig|   5 +
 hw/riscv/Makefile.objs  |   1 +
 7 files changed, 258 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/riscv/opentitan.h
 create mode 100644 hw/riscv/opentitan.c

diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
index 1ae077ed87..94a236c9c2 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -10,3 +10,4 @@ CONFIG_SPIKE=y
 CONFIG_SIFIVE_E=y
 CONFIG_SIFIVE_U=y
 CONFIG_RISCV_VIRT=y
+CONFIG_OPENTITAN=y
diff --git a/default-configs/riscv64-softmmu.mak 
b/default-configs/riscv64-softmmu.mak
index 235c6f473f..aaf6d735bb 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -1,3 +1,12 @@
 # Default configuration for riscv64-softmmu
 
-include riscv32-softmmu.mak
+# Uncomment the following lines to disable these optional devices:
+#
+#CONFIG_PCI_DEVICES=n
+
+# Boards:
+#
+CONFIG_SPIKE=y
+CONFIG_SIFIVE_E=y
+CONFIG_SIFIVE_U=y
+CONFIG_RISCV_VIRT=y
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
new file mode 100644
index 00..15a3d87ed0
--- /dev/null
+++ b/include/hw/riscv/opentitan.h
@@ -0,0 +1,63 @@
+/*
+ * QEMU RISC-V Board Compatible with OpenTitan FPGA platform
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_OPENTITAN_H
+#define HW_OPENTITAN_H
+
+#include "hw/riscv/riscv_hart.h"
+
+#define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
+#define RISCV_IBEX_SOC(obj) \
+OBJECT_CHECK(LowRISCIbexSoCState, (obj), TYPE_RISCV_IBEX_SOC)
+
+typedef struct LowRISCIbexSoCState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+RISCVHartArrayState cpus;
+MemoryRegion flash_mem;
+MemoryRegion rom;
+} LowRISCIbexSoCState;
+
+typedef struct OpenTitanState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+LowRISCIbexSoCState soc;
+} OpenTitanState;
+
+enum {
+IBEX_ROM,
+IBEX_RAM,
+IBEX_FLASH,
+IBEX_UART,
+IBEX_GPIO,
+IBEX_SPI,
+IBEX_FLASH_CTRL,
+IBEX_RV_TIMER,
+IBEX_AES,
+IBEX_HMAC,
+IBEX_PLIC,
+IBEX_PINMUX,
+IBEX_ALERT_HANDLER,
+IBEX_USBDEV,
+};
+
+#endif
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
new file mode 100644
index 00..c00f0720ab
--- /dev/null
+++ b/hw/riscv/opentitan.c
@@ -0,0 +1,169 @@
+/*
+ * QEMU RISC-V Board Compatible with OpenTitan FPGA platform
+ *
+ * Copyright (c) 2020 Western Digital
+ *
+ * Provides a board compatible with the OpenTitan FPGA platform:
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/riscv/opentitan.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/misc/unimp.h"
+#include "hw/riscv/boot.h"
+#include "exec/address-spaces.h"
+
+static const struct MemmapEntry {
+hwaddr base;
+hwaddr size;
+} ibex_memmap[] = {
+[IBEX_ROM] ={  0x8000,   0xc000 },
+[IBEX_RAM] ={  0x1000,  0x1 },
+[IBEX_FLASH] =  {  0x2000,  0x8 },
+[IBEX_UART] =   {  0x4000,  0x1 },
+[IBEX_GPIO] =   {  0x4001,  0x1 },
+[IBEX_SPI] ={  0x4002,  0x1 },
+[IBEX_FLASH_CTRL] = {  0x4003,  0x1 },
+[IBEX_PINMUX] = {  0x4007,  0x1 },
+[IBEX_RV_TIMER] =   {  0x4008,  0x1 },
+[IBEX_PLIC] = 

[PATCH v4 00/10] RISC-V Add the OpenTitan Machine

2020-05-27 Thread Alistair Francis
OpenTitan is an open source silicon Root of Trust (RoT) project. This
series adds initial support for the OpenTitan machine to QEMU.

This series adds the Ibex CPU to the QEMU RISC-V target. It then adds the
OpenTitan machine, the Ibex UART and the Ibex PLIC.

The UART has been tested sending and receiving data.

With this series QEMU can boot the OpenTitan ROM, Tock OS and a Tock
userspace app.

The Ibex PLIC is similar to the RISC-V PLIC (and is based on the QEMU
implementation) with some differences. The hope is that the Ibex PLIC
will converge to follow the RISC-V spec. As that happens I want to
update the QEMU Ibex PLIC and hopefully eventually replace the current
PLIC, as the implementation is a little overly complex.

For more details on OpenTitan, see here: https://docs.opentitan.org/

v4:
 - Don't set the reset vector in realize()
 - Fix a bug where the MMU is always enabled
 - Fixup the PMP/MMU size logic
v3:
 - Small fixes pointed out in review
v2:
 - Rebase on master
 - Get uart receive working



Alistair Francis (10):
  riscv/boot: Add a missing header include
  target/riscv: Don't overwrite the reset vector
  target/riscv: Disable the MMU correctly
  target/riscv: Add the lowRISC Ibex CPU
  riscv: Initial commit of OpenTitan machine
  hw/char: Initial commit of Ibex UART
  hw/intc: Initial commit of lowRISC Ibex PLIC
  riscv/opentitan: Connect the PLIC device
  riscv/opentitan: Connect the UART device
  target/riscv: Use a smaller guess size for no-MMU PMP

 default-configs/riscv32-softmmu.mak |   1 +
 default-configs/riscv64-softmmu.mak |  11 +-
 include/hw/char/ibex_uart.h | 110 +++
 include/hw/intc/ibex_plic.h |  63 
 include/hw/riscv/boot.h |   1 +
 include/hw/riscv/opentitan.h|  79 +
 target/riscv/cpu.h  |   1 +
 hw/char/ibex_uart.c | 492 
 hw/intc/ibex_plic.c | 261 +++
 hw/riscv/opentitan.c| 204 
 target/riscv/cpu.c  |  21 +-
 target/riscv/pmp.c  |  14 +-
 MAINTAINERS |  13 +
 hw/char/Makefile.objs   |   1 +
 hw/intc/Makefile.objs   |   1 +
 hw/riscv/Kconfig|   9 +
 hw/riscv/Makefile.objs  |   1 +
 17 files changed, 1272 insertions(+), 11 deletions(-)
 create mode 100644 include/hw/char/ibex_uart.h
 create mode 100644 include/hw/intc/ibex_plic.h
 create mode 100644 include/hw/riscv/opentitan.h
 create mode 100644 hw/char/ibex_uart.c
 create mode 100644 hw/intc/ibex_plic.c
 create mode 100644 hw/riscv/opentitan.c

-- 
2.26.2




[PATCH v4 01/10] riscv/boot: Add a missing header include

2020-05-27 Thread Alistair Francis
As the functions declared in this header use the symbol_fn_t
typedef itself declared in "hw/loader.h", we need to include
it here to make the header file self-contained.

Signed-off-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bin Meng 
---
 include/hw/riscv/boot.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
index 474a940ad5..9daa98da08 100644
--- a/include/hw/riscv/boot.h
+++ b/include/hw/riscv/boot.h
@@ -21,6 +21,7 @@
 #define RISCV_BOOT_H
 
 #include "exec/cpu-defs.h"
+#include "hw/loader.h"
 
 void riscv_find_and_load_firmware(MachineState *machine,
   const char *default_machine_firmware,
-- 
2.26.2




Re: [PATCH v3 3/9] target/riscv: Add the lowRISC Ibex CPU

2020-05-27 Thread Alistair Francis
On Tue, May 26, 2020 at 6:58 PM LIU Zhiwei  wrote:
>
>
>
> On 2020/5/27 1:12, Alistair Francis wrote:
> > On Fri, May 22, 2020 at 12:51 AM LIU Zhiwei  wrote:
> >>
> >>
> >> On 2020/5/20 5:31, Alistair Francis wrote:
> >>> Ibex is a small and efficient, 32-bit, in-order RISC-V core with
> >>> a 2-stage pipeline that implements the RV32IMC instruction set
> >>> architecture.
> >>>
> >>> For more details on lowRISC see here:
> >>> https://github.com/lowRISC/ibex
> >>>
> >>> Signed-off-by: Alistair Francis 
> >>> Reviewed-by: Bin Meng 
> >>> ---
> >>>target/riscv/cpu.h |  1 +
> >>>target/riscv/cpu.c | 10 ++
> >>>2 files changed, 11 insertions(+)
> >>>
> >>> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> >>> index d0e7f5b9c5..8733d7467f 100644
> >>> --- a/target/riscv/cpu.h
> >>> +++ b/target/riscv/cpu.h
> >>> @@ -35,6 +35,7 @@
> >>>#define TYPE_RISCV_CPU_ANY  RISCV_CPU_TYPE_NAME("any")
> >>>#define TYPE_RISCV_CPU_BASE32   RISCV_CPU_TYPE_NAME("rv32")
> >>>#define TYPE_RISCV_CPU_BASE64   RISCV_CPU_TYPE_NAME("rv64")
> >>> +#define TYPE_RISCV_CPU_IBEX 
> >>> RISCV_CPU_TYPE_NAME("lowrisc-ibex")
> >>>#define TYPE_RISCV_CPU_SIFIVE_E31   
> >>> RISCV_CPU_TYPE_NAME("sifive-e31")
> >>>#define TYPE_RISCV_CPU_SIFIVE_E34   
> >>> RISCV_CPU_TYPE_NAME("sifive-e34")
> >>>#define TYPE_RISCV_CPU_SIFIVE_E51   
> >>> RISCV_CPU_TYPE_NAME("sifive-e51")
> >>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >>> index 5eb3c02735..eb2bbc87ae 100644
> >>> --- a/target/riscv/cpu.c
> >>> +++ b/target/riscv/cpu.c
> >>> @@ -156,6 +156,15 @@ static void rv32gcsu_priv1_10_0_cpu_init(Object *obj)
> >>>set_feature(env, RISCV_FEATURE_PMP);
> >>>}
> >>>
> >>> +static void rv32imcu_nommu_cpu_init(Object *obj)
> >>> +{
> >>> +CPURISCVState *env = &RISCV_CPU(obj)->env;
> >>> +set_misa(env, RV32 | RVI | RVM | RVC | RVU);
> >>> +set_priv_version(env, PRIV_VERSION_1_10_0);
> >>> +set_resetvec(env, 0x8090);
> >> Hi Alistair,
> >>
> >> I see all RISC-V CPUs have a reset vector which acts as the first pc
> >> when the machine boots up.
> >> However, the first pc is more like an attribute of a machine, not a cpu.
> > In general it seems to be a CPU property. I assume that some CPUs
> > would allow the reset vector to be selectable though, in which case it
> > becomes a board property.
> >
> >> Another reason is that the cpu names are a combination of ISA.
> >> Then the cpus from different vendors may have same ISA, with different
> >> reset vectors.
> >>
> >> Do you think so?
> > If you are worried about CPUs with different vectors we could always
> > make it a property in the future and have boards override it. I don't
> > think we need that yet (only 1 CPU is different) but it is an easy
> > future change.
> I think you are right. A cpu reset vector property is better. If there
> is a conflict in the future,
> we can add the property there.
>
> Reviewed-by: LIU Zhiwei 

Great! Thanks :)

Alistair

>
> Zhiwei



Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Peter Xu
On Wed, May 27, 2020 at 06:27:38PM +0200, Philippe Mathieu-Daudé wrote:
> On 5/27/20 6:16 PM, Peter Xu wrote:
> > On Wed, May 27, 2020 at 05:53:16PM +0200, Philippe Mathieu-Daudé wrote:
> > +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
> > +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu[i].type)) 
> > {
> > +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name);
>  Just wondering why you want to trace the type as you now have the name
>  string.
> >>>
> >>> You are right :)
> >>>
> > +return iommu[i].type;
> >  }
> >  }
> > +trace_vfio_get_iommu_type(-1, "Not available or not supported");
>  nit: from a debugging pov, this may be not needed as
>  vfio_get_group/vfio_connect_container() fails and this leads to an error
>  output.
> >>
> >> But you can reach this for example using No-IOMMU. If you don't mind, I
> >> find having this information in the trace log clearer.
> > 
> > I kinda agree with Eric - AFAICT QEMU vfio-pci don't work with no-iommu, 
> > then
> > it seems meaningless to trace it...
> > 
> > I'm not sure whether this trace is extremely helpful because syscalls like 
> > this
> > could be easily traced by things like strace or bpftrace as general tools 
> > (and
> > this information should be a one-time thing rather than dynamically 
> > changing),
> > no strong opinion though.  Also, if we want to dump something, maybe it's
> > better to do in vfio_init_container() after vfio_get_iommu_type() 
> > succeeded, so
> > we dump which container is enabled with what type of iommu.
> 
> OK. I'm a recent VFIO user so maybe I am not using the good information.
> 
> This trace helps me while working on a new device feature; I didn't
> think about gathering it in production because there I'd expect
> things to work.
> 
> Now in my case what I want to know is if I'm using a v1 or v2 type.
> Maybe this information is already available in /proc or /sys and we
> don't need this patch...

I don't know of such a /proc or /sys entry, so maybe it's still useful. I guess
Alex would have the best judgement. The strace/bpftrace things are not really
reasons I found to nak this patch, but just something I thought first that could
be easier when any of us wants to peek at that information, probably something
just FYI. :-)

Thanks,

-- 
Peter Xu




Re: [PATCH v7 25/32] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

The L2 bitmap needs to be updated after each write to indicate what
new subclusters are now allocated. This needs to happen even if the
cluster was already allocated and the L2 entry was otherwise valid.

In some cases however a write operation doesn't need to change the L2
bitmap (because all affected subclusters were already allocated). This
is detected in calculate_l2_meta(), and qcow2_alloc_cluster_link_l2()
is never called in those cases.

Signed-off-by: Alberto Garcia 
---
  block/qcow2-cluster.c | 18 ++
  1 file changed, 18 insertions(+)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 23/32] qcow2: Add subcluster support to discard_in_l2_slice()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

Two things need to be taken into account here:

1) With full_discard == true the L2 entry must be cleared completely.
This also includes the L2 bitmap if the image has extended L2
entries.

2) With full_discard == false we have to make the discarded cluster
read back as zeroes. With normal L2 entries this is done with the
QCOW_OFLAG_ZERO bit, whereas with extended L2 entries this is done
with the individual 'all zeroes' bits for each subcluster.

Note however that QCOW_OFLAG_ZERO is not supported in v2 qcow2
images so, if there is a backing file, discard cannot guarantee
that the image will read back as zeroes. If this is important for
the caller it should forbid it as qcow2_co_pdiscard() does (see
80f5c01183 for more details).

Signed-off-by: Alberto Garcia 
---
  block/qcow2-cluster.c | 52 +++
  1 file changed, 23 insertions(+), 29 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/7] gitlab-ci: Remove flex/bison packages

2020-05-27 Thread Alex Bennée


Thomas Huth  writes:

> From: Philippe Mathieu-Daudé 
>
> QEMU does not use flex/bison packages.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> Message-Id: <20200515163029.12917-4-phi...@redhat.com>
> Signed-off-by: Thomas Huth 

Reviewed-by: Alex Bennée 

> ---
>  .gitlab-ci.yml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index b889fb96b6..994774250f 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -4,7 +4,7 @@ include:
>  
>  before_script:
>   - apt-get update -qq
> - - apt-get install -y -qq flex bison libglib2.0-dev libpixman-1-dev 
> genisoimage
> + - apt-get install -y -qq libglib2.0-dev libpixman-1-dev genisoimage
>  
>  build-system1:
>   script:


-- 
Alex Bennée



Re: [PATCH v2 11/11] tests/acceptance: Linux boot test for record/replay

2020-05-27 Thread Alex Bennée


Pavel Dovgalyuk  writes:

> This patch adds a test for record/replay, which boots a Linux
> image from the disk and interacts with the network.
> The idea and code of this test is borrowed from boot_linux.py
> However, currently record/replay works only for x86_64,
> therefore other tests were excluded.
>
> Each test consists of the following phases:
>  - downloading the disk image
>  - recording the execution
>  - replaying the execution
>
> Replay does not validate the output, but waits until QEMU
> finishes the execution. This is reasonable, because
> QEMU usually hangs when replay goes wrong.

Two things:

 - We need to tag these tests as slow so they aren't run by default
 - 1800s is a long timeout to wait before knowing there's a problem

Looking at the log shows my test is still running? Maybe we can check
the output as we go?

>
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  0 files changed
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e9a9ce4f66..97f066a9b2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2498,6 +2498,7 @@ F: include/sysemu/replay.h
>  F: docs/replay.txt
>  F: stubs/replay.c
>  F: tests/acceptance/replay_kernel.py
> +F: tests/acceptance/replay_linux.py
>  
>  IOVA Tree
>  M: Peter Xu 
> diff --git a/tests/acceptance/replay_linux.py 
> b/tests/acceptance/replay_linux.py
> new file mode 100644
> index 00..7c5971f156
> --- /dev/null
> +++ b/tests/acceptance/replay_linux.py
> @@ -0,0 +1,97 @@
> +# Record/replay test that boots a complete Linux system via a cloud image
> +#
> +# Copyright (c) 2020 ISP RAS
> +#
> +# Author:
> +#  Pavel Dovgalyuk 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +import os
> +
> +from avocado.utils import cloudinit
> +from avocado.utils import network
> +from avocado.utils import vmimage
> +from avocado.utils import datadrainer
> +from avocado.utils.path import find_command
> +from boot_linux import BootLinuxBase
> +
> +class ReplayLinux(BootLinuxBase):
> +"""
> +Boots a Linux system, checking for a successful initialization
> +"""
> +
> +timeout = 1800
> +chksum = None
> +hdd = 'ide-hd'
> +cd = 'ide-cd'
> +bus = 'ide'
> +
> +def setUp(self):
> +super(ReplayLinux, self).setUp()
> +self.boot_path = self.download_boot()
> +self.cloudinit_path = self.download_cloudinit()
> +
> +def vm_add_disk(self, vm, path, id, device):
> +bus_string = ''
> +if self.bus:
> +bus_string = ',bus=%s.%d' % (self.bus, id,)
> +vm.add_args('-drive', 'file=%s,snapshot,id=disk%s,if=none' % (path, 
> id))
> +vm.add_args('-drive', 
> 'driver=blkreplay,id=disk%s-rr,if=none,image=disk%s' % (id, id))
> +vm.add_args('-device', '%s,drive=disk%s-rr%s' % (device, id, 
> bus_string))
> +
> +def launch_and_wait(self, record, args, shift):
> +vm = self.get_vm()
> +vm.add_args('-smp', '1')
> +vm.add_args('-m', '1024')
> +vm.add_args('-object', 'filter-replay,id=replay,netdev=hub0port0')
> +if args:
> +vm.add_args(*args)
> +self.vm_add_disk(vm, self.boot_path, 0, self.hdd)
> +self.vm_add_disk(vm, self.cloudinit_path, 1, self.cd)
> +if record:
> +mode = 'record'
> +else:
> +mode = 'replay'
> +vm.add_args('-icount', 'shift=%s,rr=%s,rrfile=%s' %
> +(shift, mode, os.path.join(self.workdir, 'replay.bin')))
> +
> +vm.set_console()
> +vm.launch()
> +console_drainer = datadrainer.LineLogger(vm.console_socket.fileno(),
> + 
> logger=self.log.getChild('console'),
> + stop_check=(lambda : not 
> vm.is_running()))
> +console_drainer.start()
> +if record:
> +self.log.info('VM launched, waiting for boot confirmation from 
> guest')
> +cloudinit.wait_for_phone_home(('0.0.0.0', self.phone_home_port), 
> self.name)
> +vm.shutdown()
> +else:
> +self.log.info('VM launched, waiting the recorded execution to be 
> replayed')
> +vm.wait()
> +
> +def run_rr(self, args=None, shift=7):
> +self.launch_and_wait(True, args, shift)
> +self.launch_and_wait(False, args, shift)
> +
> +class ReplayLinuxX8664(ReplayLinux):
> +"""
> +:avocado: tags=arch:x86_64
> +"""
> +
> +chksum = 
> 'e3c1b309d9203604922d6e255c2c5d098a309c2d46215d8fc026954f3c5c27a0'
> +
> +def test_pc_i440fx(self):
> +"""
> +:avocado: tags=machine:pc
> +:avocado: tags=accel:tcg
> +"""
> +self.run_rr(shift=1)
> +
> +def test_pc_q35(self):
> +"""
> +:avocado: tags=machine:q35
> +:avocado: tags=accel:tcg
> +"""
> +self.run_rr(shift=3)


-- 
Alex Bennée



Re: [PATCH v7 22/32] qcow2: Add subcluster support to zero_in_l2_slice()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

The QCOW_OFLAG_ZERO bit that indicates that a cluster reads as
zeroes is only used in standard L2 entries. Extended L2 entries use
individual 'all zeroes' bits for each subcluster.

This must be taken into account when updating the L2 entry and also
when deciding that an existing entry does not need to be updated.

Signed-off-by: Alberto Garcia 
---
  block/qcow2-cluster.c | 36 +++-
  1 file changed, 19 insertions(+), 17 deletions(-)


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 21/32] qcow2: Add subcluster support to qcow2_get_host_offset()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

The logic of this function remains pretty much the same, except that
it uses count_contiguous_subclusters(), which combines the logic of
count_contiguous_clusters() / count_contiguous_clusters_unallocated()
and checks individual subclusters.

qcow2_cluster_to_subcluster_type() is not necessary as a separate
function anymore so it's inlined into its caller.

Signed-off-by: Alberto Garcia 
---
  block/qcow2.h |  38 ---
  block/qcow2-cluster.c | 150 ++
  2 files changed, 92 insertions(+), 96 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 7/7] linux-user: limit check to HOST_LONG_BITS < TARGET_ABI_BITS

2020-05-27 Thread Alex Bennée


Thomas Huth  writes:

> On 27/05/2020 16.44, Laurent Vivier wrote:
>> On 25/05/2020 at 15:18, Thomas Huth wrote:
>>> From: Alex Bennée 
>>>
>>> Newer clangs rightly spot that you can never exceed the full address
>>> space of 64 bit hosts with:
>>>
>>>   linux-user/elfload.c:2076:41: error: result of comparison 'unsigned
>>>   long' > 18446744073709551615 is always false
>>>   [-Werror,-Wtautological-type-limit-compare]
>>>   4685 if ((guest_hiaddr - guest_base) > ~(uintptr_t)0) {
>>>   4686 ~~~ ^ ~
>>>   4687 1 error generated.
>>>
>>> So let's limit the check to 32-bit hosts only.
>>>
>>> Fixes: ee94743034bf
>>> Reported-by: Thomas Huth 
>>> Signed-off-by: Alex Bennée 
>>> [thuth: Use HOST_LONG_BITS < TARGET_ABI_BITS instead of HOST_LONG_BITS == 
>>> 32]
>>> Signed-off-by: Thomas Huth 
>>> ---
>>>  linux-user/elfload.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>>> index 01a9323a63..ebc663ea0b 100644
>>> --- a/linux-user/elfload.c
>>> +++ b/linux-user/elfload.c
>>> @@ -2073,12 +2073,14 @@ static void pgb_have_guest_base(const char 
>>> *image_name, abi_ulong guest_loaddr,
>>>  exit(EXIT_FAILURE);
>>>  }
>>>  } else {
>>> +#if HOST_LONG_BITS < TARGET_ABI_BITS
>>>  if ((guest_hiaddr - guest_base) > ~(uintptr_t)0) {
>>>  error_report("%s: requires more virtual address space "
>>>   "than the host can provide (0x%" PRIx64 ")",
>>>   image_name, (uint64_t)guest_hiaddr - guest_base);
>>>  exit(EXIT_FAILURE);
>>>  }
>>> +#endif
>>>  }
>>>  
>>>  /*
>>>
>> 
>> Philippe sent the same patch:
>> 
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg699796.html
>
> Indeed, but looking more closely, he's using slightly different
> locations for the #if and #endif ... not sure what's better though...?

Richard was more inclined to suppress the warning:

  Subject: Re: [PATCH v2] linux-user: limit check to HOST_LONG_BITS == 32
  From: Richard Henderson 
  Message-ID: <3069bc1b-115d-f361-8271-c775bf695...@linaro.org>
  Date: Thu, 21 May 2020 20:15:51 -0700

One reason I dropped the f32 patch from my last PR was because this
wasn't the only warning the latest clang picks up.

-- 
Alex Bennée



Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Philippe Mathieu-Daudé
On 5/27/20 6:16 PM, Peter Xu wrote:
> On Wed, May 27, 2020 at 05:53:16PM +0200, Philippe Mathieu-Daudé wrote:
> +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
> +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu[i].type)) {
> +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name);
 Just wondering why you want to trace the type as you now have the name
 string.
>>>
>>> You are right :)
>>>
> +return iommu[i].type;
>  }
>  }
> +trace_vfio_get_iommu_type(-1, "Not available or not supported");
 nit: from a debugging pov, this may be not needed as
 vfio_get_group/vfio_connect_container() fails and this leads to an error
 output.
>>
>> But you can reach this for example using No-IOMMU. If you don't mind, I
>> find having this information in the trace log clearer.
> 
> I kinda agree with Eric - AFAICT QEMU vfio-pci don't work with no-iommu, then
> it seems meaningless to trace it...
> 
> I'm not sure whether this trace is extremely helpful because syscalls like 
> this
> could be easily traced by things like strace or bpftrace as general tools (and
> this information should be a one-time thing rather than dynamically changing),
> no strong opinion though.  Also, if we want to dump something, maybe it's
> better to do in vfio_init_container() after vfio_get_iommu_type() succeeded, 
> so
> we dump which container is enabled with what type of iommu.

OK. I'm a recent VFIO user so maybe I am not using the good information.

This trace helps me while working on a new device feature; I didn't
think about gathering it in production because there I'd expect
things to work.

Now in my case what I want to know is if I'm using a v1 or v2 type.
Maybe this information is already available in /proc or /sys and we
don't need this patch...




Re: linux-user - time64 question

2020-05-27 Thread Laurent Vivier
Le 05/05/2020 à 23:38, Sid Manning a écrit :
> I’m looking at a testcase failure when my target uses 64-bit time in
> msg.h (struct msqid_ds).  I’ve been able to get around this by changing
> target_msqid_ds like so:
> 
> 
> @@ -3900,18 +3901,9 @@ static inline abi_long do_semop(int semid,
> abi_long ptr,
> unsigned nsops)
>  struct target_msqid_ds
>  {
>      struct target_ipc_perm msg_perm;
> -    abi_ulong msg_stime;
> -#if TARGET_ABI_BITS == 32
> -    abi_ulong __unused1;
> -#endif
> -    abi_ulong msg_rtime;
> -#if TARGET_ABI_BITS == 32
> -    abi_ulong __unused2;
> -#endif
> -    abi_ulong msg_ctime;
> -#if TARGET_ABI_BITS == 32
> -    abi_ulong __unused3;
> -#endif
> +    abi_ullong msg_stime;
> +    abi_ullong msg_rtime;
> +    abi_ullong msg_ctime;
>      abi_ulong __msg_cbytes;
>      abi_ulong msg_qnum;
>      abi_ulong msg_qbytes;
> 
> It seems like either should have worked, but without the change I get
> garbage back in some of the elements below the msg_*time fields.
> 
> If time_t is 64bits then it seems like stime/rtime/ctime should be
> abi_ullong.
> 
> My target is Hexagon and the TARGET_ABI_BITS is 32.

The structure was changed in the kernel for y2038, and the change has
not been reflected in QEMU (it needs to be).

See kernel commit:

c2ab975c30f0 ("y2038: ipc: Report long times to user space")

Thanks,
Laurent



Re: [PATCH v2 04/11] tests/acceptance: add kernel record/replay test for x86_64

2020-05-27 Thread Alex Bennée


Alex Bennée  writes:

> Pavel Dovgalyuk  writes:
>
>> This patch adds a test for record/replay of an x86_64 machine execution.
>> Execution scenario includes simple kernel boot, which allows testing
>> basic hardware interaction in RR mode.
>>
>> Signed-off-by: Pavel Dovgalyuk 
>> ---
>>  0 files changed
>>
>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>> index b8b277ad2f..c7526f1aba 100644
>> --- a/tests/acceptance/replay_kernel.py
>> +++ b/tests/acceptance/replay_kernel.py
>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>  True, shift, args)
>>  self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>  False, shift, args)
>> +
>> +def test_x86_64_pc(self):
>> +"""
>> +:avocado: tags=arch:x86_64
>> +:avocado: tags=machine:pc
>> +"""
>> +kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>> +              '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>> +              '/vmlinuz')
>> +kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>> +kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>> +
>> +kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
>
> I note that:
>
>   KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
>
> and given we are looking for repeatability here maybe we should use our
> own command line so we can compare the recorded and replayed boot?

To build on that I think a command line like:

  KERNEL_COMMON_COMMAND_LINE = 'printk.time=1 panic=-1 '

called with --no-reboot and a pattern:

  console_pattern = 'VFS: Cannot open root device'

You will run more of the kernel (importantly with timestamps > 0.000) so
we can make a better comparison between the recorded and replayed runs.

-- 
Alex Bennée



Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Peter Xu
On Wed, May 27, 2020 at 05:53:16PM +0200, Philippe Mathieu-Daudé wrote:
> >>> +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
> >>> +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu[i].type)) {
> >>> +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name);
> >> Just wondering why you want to trace the type as you now have the name
> >> string.
> > 
> > You are right :)
> > 
> >>> +return iommu[i].type;
> >>>  }
> >>>  }
> >>> +trace_vfio_get_iommu_type(-1, "Not available or not supported");
> >> nit: from a debugging pov, this may be not needed as
> >> vfio_get_group/vfio_connect_container() fails and this leads to an error
> >> output.
> 
> But you can reach this for example using No-IOMMU. If you don't mind, I
> find having this information in the trace log clearer.

I kinda agree with Eric - AFAICT QEMU vfio-pci doesn't work with no-iommu, so
it seems meaningless to trace it...

I'm not sure whether this trace is extremely helpful because syscalls like this
could be easily traced by things like strace or bpftrace as general tools (and
this information should be a one-time thing rather than dynamically changing),
no strong opinion though.  Also, if we want to dump something, maybe it's
better to do in vfio_init_container() after vfio_get_iommu_type() succeeded, so
we dump which container is enabled with what type of iommu.

Thanks,

-- 
Peter Xu




Re: [PATCH v1 2/3] linux-user: deal with address wrap for ARM_COMMPAGE on 32 bit

2020-05-27 Thread Alex Bennée


Aleksandar Markovic  writes:

> On Wed, 27 May 2020 at 14:05, Aleksandar Markovic wrote:
>>
>> On Wed, 27 May 2020 at 12:07, Alex Bennée wrote:
>> >
>> > We rely on the pointer to wrap when accessing the high address of the
>> > COMMPAGE so it lands somewhere reasonable. However on 32 bit hosts we
>> > cannot afford just to map the entire 4gb address range. The old mmap
>> > trial and error code handled this by just checking we could map both
>> > the guest_base and the computed COMMPAGE address.
>> >
>> > We can't just manipulate loadaddr to get what we want so we introduce
>> > an offset which pgb_find_hole can apply when looking for a gap for
>> > guest_base that ensures there is space left to map the COMMPAGE
>> > afterwards.
>> >
>> > This is arguably a little inefficient for the one 32 bit
>> > value (kuser_helper_version) we need to keep there given all the
>> > actual code entries are picked up during the translation phase.
>> >
>> > Fixes: ee94743034b
>> > Bug: https://bugs.launchpad.net/qemu/+bug/1880225
>>
>> For the scenario in this bug report, for today's master, before and after
>> applying this patch:
>>
>
> I am not sure how significant this info is, but I executed the test
> without applying patch 1/3, so only 2/3 was applied in the case
> "AFTER".

That is expected. The other fix only affects binaries run inside a
/proc-less chroot and the final patch is a test case for COMMPAGE
support.

-- 
Alex Bennée



Re: [PATCH 09/24] macio: Fix to realize "mos6522-cuda" and "mos6522-pmu" devices

2020-05-27 Thread Markus Armbruster
Peter Maydell  writes:

> On Wed, 27 May 2020 at 15:12, Markus Armbruster  wrote:
>> * PATCH 08: in a realize method.  Can't actually fail, so let's use
>>   &error_abort.
>>
>> * PATCH 09 (this one): likewise.
>
> I disagree with these. We're in a realize function, the API
> says "on errors, report them via the Error* you got passed",
> so we should do that, not blow up. &error_abort only makes
> sense if (a) we have no better way to report errors than
> to abort (which isn't the case here) or (b) if we can guarantee
> that in fact the thing we're doing won't ever fail

I detest impossible (and therefore untestable) error paths.

> (which we can't here without knowing more about the internal
> implementation details of the MOS6522 device than we
> really ought to).

At least the child devices are all defined in the same file.

My second line of defense: my patches are strict improvements.  If you
want further improvements, I'd prefer you propose them as patches on top
of mine.




Re: [PATCH v2] hw/vfio/common: Trace in which mode an IOMMU is opened

2020-05-27 Thread Cornelia Huck
On Wed, 27 May 2020 17:55:55 +0200
Philippe Mathieu-Daudé  wrote:

> One might want to check which IOMMU version the host kernel
> provides. Add a trace event to see in which mode we opened
> our container.
> 
> Reviewed-by: Cornelia Huck 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2: Only display string description (Eric)

Also fine with me.

> 
> Supersedes: <20200526173542.28710-1-phi...@redhat.com>
> ---
>  hw/vfio/common.c | 19 ++-
>  hw/vfio/trace-events |  1 +
>  2 files changed, 15 insertions(+), 5 deletions(-)




Re: [PATCH v7 20/32] qcow2: Add subcluster support to calculate_l2_meta()

2020-05-27 Thread Eric Blake

On 5/25/20 1:08 PM, Alberto Garcia wrote:

If an image has subclusters then there are more copy-on-write
scenarios that we need to consider. Let's say we have a write request
from the middle of subcluster #3 until the end of the cluster:

1) If we are writing to a newly allocated cluster then we need
copy-on-write. The previous contents of subclusters #0 to #3 must
be copied to the new cluster. We can optimize this process by
skipping all leading unallocated or zero subclusters (the status of
those skipped subclusters will be reflected in the new L2 bitmap).

2) If we are overwriting an existing cluster:

2.1) If subcluster #3 is unallocated or has the all-zeroes bit set
 then we need copy-on-write (on subcluster #3 only).

2.2) If subcluster #3 was already allocated then there is no need
 for any copy-on-write. However we still need to update the L2
 bitmap to reflect possible changes in the allocation status of
 subclusters #4 to #31. Because of this, this function checks
 if all the overwritten subclusters are already allocated and
 in this case it returns without creating a new QCowL2Meta
 structure.


Quite the mouthful!  But the description looks correct, and the code 
appears to match it.




After all these changes l2meta_cow_start() and l2meta_cow_end()
are not necessarily cluster-aligned anymore. We need to update the
calculation of old_start and old_end in handle_dependencies() to
guarantee that no two requests try to write on the same cluster.

Signed-off-by: Alberto Garcia 
---
  block/qcow2-cluster.c | 163 +-
  1 file changed, 131 insertions(+), 32 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v2] hw/vfio/common: Trace in which mode an IOMMU is opened

2020-05-27 Thread Philippe Mathieu-Daudé
One might want to check which IOMMU version the host kernel
provides. Add a trace event to see in which mode we opened
our container.

Reviewed-by: Cornelia Huck 
Signed-off-by: Philippe Mathieu-Daudé 
---
v2: Only display string description (Eric)

Supersedes: <20200526173542.28710-1-phi...@redhat.com>
---
 hw/vfio/common.c | 19 ++-
 hw/vfio/trace-events |  1 +
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0b3593b3c0..f24450472e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1157,15 +1157,24 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
 static int vfio_get_iommu_type(VFIOContainer *container,
Error **errp)
 {
-int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
-  VFIO_SPAPR_TCE_v2_IOMMU, VFIO_SPAPR_TCE_IOMMU };
+static const struct {
+int type;
+const char *name;
+} iommu[] = {
+{VFIO_TYPE1v2_IOMMU, "Type1 (v2)"},
+{VFIO_TYPE1_IOMMU, "Type1 (v1)"},
+{VFIO_SPAPR_TCE_v2_IOMMU, "sPAPR TCE (v2)"},
+{VFIO_SPAPR_TCE_IOMMU, "sPAPR TCE (v1)"}
+};
 int i;
 
-for (i = 0; i < ARRAY_SIZE(iommu_types); i++) {
-if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu_types[i])) {
-return iommu_types[i];
+for (i = 0; i < ARRAY_SIZE(iommu); i++) {
+if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu[i].type)) {
+trace_vfio_get_iommu_type(iommu[i].name);
+return iommu[i].type;
 }
 }
+trace_vfio_get_iommu_type("Not available or not supported");
 error_setg(errp, "No available IOMMU models");
 return -EINVAL;
 }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index b1ef55a33f..d3f1e48618 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -115,6 +115,7 @@ vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic
 vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
 vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
 vfio_dma_unmap_overflow_workaround(void) ""
+vfio_get_iommu_type(const char *type) "IOMMU type: %s"
 
 # platform.c
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
-- 
2.21.3




Re: [PATCH] hw/vfio/common: Trace in which mode a IOMMU is opened

2020-05-27 Thread Philippe Mathieu-Daudé
On 5/27/20 9:43 AM, Philippe Mathieu-Daudé wrote:
> On 5/27/20 9:08 AM, Auger Eric wrote:
>> Hi Philippe,
>>
>> On 5/26/20 7:35 PM, Philippe Mathieu-Daudé wrote:
>>> One might want to check which IOMMU version the host kernel
>>> provides. Add a trace event to see in which mode we opened
>>> our container.
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>>  hw/vfio/common.c | 19 ++-
>>>  hw/vfio/trace-events |  1 +
>>>  2 files changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index 0b3593b3c0..6b69a259c1 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1157,15 +1157,24 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
>>>  static int vfio_get_iommu_type(VFIOContainer *container,
>>> Error **errp)
>>>  {
>>> -int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
>>> -  VFIO_SPAPR_TCE_v2_IOMMU, VFIO_SPAPR_TCE_IOMMU };
>>> +static const struct {
>>> +int type;
>>> +const char *name;
>>> +} iommu[] = {
>>> +{VFIO_TYPE1v2_IOMMU, "Type1 (v2)"},
>>> +{VFIO_TYPE1_IOMMU, "Type1 (v1)"},
>>> +{VFIO_SPAPR_TCE_v2_IOMMU, "sPAPR TCE (v2)"},
>>> +{VFIO_SPAPR_TCE_IOMMU, "sPAPR TCE (v1)"}
>>> +};
>>>  int i;
>>>  
>>> -for (i = 0; i < ARRAY_SIZE(iommu_types); i++) {
>>> -if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu_types[i])) {
>>> -return iommu_types[i];
>>> +for (i = 0; i < ARRAY_SIZE(iommu); i++) {
>>> +if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu[i].type)) {
>>> +trace_vfio_get_iommu_type(iommu[i].type, iommu[i].name);
>> Just wondering why you want to trace the type as you now have the name
>> string.
> 
> You are right :)
> 
>>> +return iommu[i].type;
>>>  }
>>>  }
>>> +trace_vfio_get_iommu_type(-1, "Not available or not supported");
>> nit: from a debugging pov, this may be not needed as
>> vfio_get_group/vfio_connect_container() fails and this leads to an error
>> output.

But you can reach this for example using No-IOMMU. If you don't mind, I
find having this information in the trace log clearer.

> 
> Indeed.
> 
> Thanks for the review!
> 
>>
>> Thanks
>>
>> Eric
>>>  error_setg(errp, "No available IOMMU models");
>>>  return -EINVAL;
>>>  }
>>> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
>>> index b1ef55a33f..8166c4c50d 100644
>>> --- a/hw/vfio/trace-events
>>> +++ b/hw/vfio/trace-events
>>> @@ -115,6 +115,7 @@ vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic
>>>  vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
>>>  vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
>>>  vfio_dma_unmap_overflow_workaround(void) ""
>>> +vfio_get_iommu_type(int iommu_type, const char *iommu_name) "IOMMU type %d (%s)"
>>>  
>>>  # platform.c
>>>  vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
>>>
>>
> 




Re: [PATCH v2 7/7] virtio-scsi: use scsi_device_get

2020-05-27 Thread Stefan Hajnoczi
On Mon, May 11, 2020 at 07:09:51PM +0300, Maxim Levitsky wrote:
> This will help us avoid the scsi device disappearing
> after we take a reference to it.
> 
> It doesn't by itself forbid the case where we try to access
> an unrealized device.
> 
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Maxim Levitsky 
> ---
>  hw/scsi/virtio-scsi.c | 23 +++
>  1 file changed, 15 insertions(+), 8 deletions(-)

I'm not very familiar with the SCSI emulation code, but this looks
correct. My understanding of what this patch does:

This patch keeps SCSIDevice alive between scsi_device_find() and
scsi_req_new(). Previously no SCSIDevice ref was taken so the device
could have been freed before scsi_req_new() had a chance to take a ref.

The TMF case is similar: the SCSIDevice ref must be held during
virtio_scsi_do_tmf(). We don't need to worry about the async cancel
notifiers because the request being canceled already holds a ref.




  1   2   3   >