Re: [PULL 0/5] Python patches

2021-11-17 Thread Richard Henderson

On 11/17/21 1:33 AM, John Snow wrote:

The following changes since commit 2b22e7540d6ab4efe82d442363e3fc900cea6584:

   Merge tag 'm68k-for-6.2-pull-request' of git://github.com/vivier/qemu-m68k 
into staging (2021-11-09 13:16:56 +0100)

are available in the Git repository at:

   https://gitlab.com/jsnow/qemu.git tags/python-pull-request

for you to fetch changes up to c398a241ec4138e0b995a0215dea84ca93b0384f:

   scripts/device-crash-test: hide tracebacks for QMP connect errors 
(2021-11-16 14:26:36 -0500)


Pull request



John Snow (5):
   python/aqmp: Fix disconnect during capabilities negotiation
   python/aqmp: fix ConnectError string method
   scripts/device-crash-test: simplify Exception handling
   scripts/device-crash-test: don't emit AQMP connection errors to stdout
   scripts/device-crash-test: hide tracebacks for QMP connect errors

  python/qemu/aqmp/protocol.py | 24 ++--
  scripts/device-crash-test| 33 +
  2 files changed, 43 insertions(+), 14 deletions(-)


Applied, thanks.

r~




Re: [PULL 0/5] Python patches

2021-11-17 Thread Gerd Hoffmann
  Hi,

>   https://gitlab.com/jsnow/qemu.git tags/python-pull-request

What is the status of the plan to upload this to pypi eventually?

thanks,
  Gerd




Failing QEMU iotests

2021-11-17 Thread Thomas Huth



 Hi!

I think it has been working fine for me a couple of weeks ago,
but when I now run:

 make check SPEED=slow

I'm getting a couple of failing iotests... not sure whether
these are known issues already, so I thought I'd summarize them
here:

*** First one is 045 in raw mode: ***

 TEST   iotest-raw: 045 [fail]
QEMU  -- 
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64" 
-nodefaults -display none -accel qtest
QEMU_IMG  -- "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
QEMU_IO   -- "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" 
--cache writeback --aio threads -f raw
QEMU_NBD  -- "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
IMGFMT-- raw
IMGPROTO  -- file
PLATFORM  -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
TEST_DIR  -- /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR  -- /tmp/tmphlexdrlt
GDB_OPTIONS   --
VALGRIND_QEMU --
PRINT_QEMU_OUTPUT --

--- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
+++ 045.out.bad
@@ -1,5 +1,77 @@
-...
+..EE.EE
+==
+ERROR: test_add_fd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in 
test_add_fd
+self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM
+ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 229, in 
send_fd_scm
+self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, in 
send_fd_scm
+self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 149, in 
_wrapper
+return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 644, in 
send_fd_scm
+sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_closefd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 165, in 
test_closefd
+self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM
+ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 229, in 
send_fd_scm
+self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, in 
send_fd_scm
+self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 149, in 
_wrapper
+return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 644, in 
send_fd_scm
+sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_getfd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 153, in test_getfd
+self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM
+ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 229, in 
send_fd_scm
+self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, in 
send_fd_scm
+self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 149, in 
_wrapper
+return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 644, in 
send_fd_scm
+sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_getfd_invalid_fdname (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 158, in 
test_getfd_invalid_fdname
+self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM
+ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 229, in 
send_fd_scm
+self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, in 
send_fd_scm
+self._aqmp.send_fd_scm(fd)
+  File "/home/

Re: [PATCH 1/2] block:hdev: support BLKSECDISCARD

2021-11-17 Thread Stefan Hajnoczi
On Tue, Nov 16, 2021 at 09:53:39PM -0800, Christoph Hellwig wrote:
> On Tue, Nov 16, 2021 at 10:58:30AM +, Stefan Hajnoczi wrote:
> > Question for Jens and Christoph:
> > 
> > Is there a way for userspace to detect whether a Linux block device
> > supports SECDISCARD?
> 
> I don't know of one.
> 
> > If not, then maybe a new sysfs attribute can be added:
> 
> This looks correct, but if we start looking into SECDISCARD seriously
> I'd like to split the max_sectors settings for it from discard as that
> is currently a bit of a mess.  If we then expose the secure erase max
> sectors in sysfs that would also be a good indicator.

That is probably better because userspace would the queue limits
information too. Thanks!

> What is the use case for exposing secure erase in qemu?  The whole
> concept for a LBA based secure erase is generally not a very smart
> idea for flash based media..

Yadong, please see Christoph's question above.

Stefan


signature.asc
Description: PGP signature


Re: Failing QEMU iotests

2021-11-17 Thread Hanna Reitz

On 17.11.21 11:07, Thomas Huth wrote:


 Hi!

I think it has been working fine for me a couple of weeks ago,
but when I now run:

 make check SPEED=slow

I'm getting a couple of failing iotests... not sure whether
these are known issues already, so I thought I'd summarize them
here:


Thanks!


*** First one is 045 in raw mode: ***

 TEST   iotest-raw: 045 [fail]
QEMU  -- 
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64" 
-nodefaults -display none -accel qtest
QEMU_IMG  -- 
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
QEMU_IO   -- 
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" --cache 
writeback --aio threads -f raw
QEMU_NBD  -- 
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"

IMGFMT    -- raw
IMGPROTO  -- file
PLATFORM  -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
TEST_DIR  -- /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR  -- /tmp/tmphlexdrlt
GDB_OPTIONS   --
VALGRIND_QEMU --
PRINT_QEMU_OUTPUT --

--- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
+++ 045.out.bad
@@ -1,5 +1,77 @@
-...
+..EE.EE
+==
+ERROR: test_add_fd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in 
test_add_fd

+    self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM

+    ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 
229, in send_fd_scm

+    self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, 
in send_fd_scm

+    self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 
149, in _wrapper

+    return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 
644, in send_fd_scm

+    sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_closefd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 165, in 
test_closefd

+    self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM

+    ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 
229, in send_fd_scm

+    self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, 
in send_fd_scm

+    self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 
149, in _wrapper

+    return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 
644, in send_fd_scm

+    sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_getfd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 153, in 
test_getfd

+    self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM

+    ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 
229, in send_fd_scm

+    self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, 
in send_fd_scm

+    self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 
149, in _wrapper

+    return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 
644, in send_fd_scm

+    sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'
+
+==
+ERROR: test_getfd_invalid_fdname (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 158, in 
test_getfd_invalid_fdname

+    self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in 
_send_fd_by_SCM

+    ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 
229, in send_fd_scm

+    self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py"

Re: [PATCH v2 08/15] hw/nvme: Implement the Function Level Reset

2021-11-17 Thread Łukasz Gieryk
On Tue, Nov 16, 2021 at 01:28:19PM -0800, Keith Busch wrote:
> On Tue, Nov 16, 2021 at 04:34:39PM +0100, Łukasz Gieryk wrote:
> >  if (!pci_is_vf(&n->parent_obj) && n->params.sriov_max_vfs) {
> > -pcie_sriov_pf_disable_vfs(&n->parent_obj);
> > +if (rst != NVME_RESET_CONTROLLER) {
> > +pcie_sriov_pf_disable_vfs(&n->parent_obj);
> 
> Shouldn't this be 'if (rst == NVME_RESET_FUNCTION)'?

The NVMe Spec lists five possible reset types (triggers). According
to my understanding, only the Controller Reset doesn’t affect the VFs'
state, hence the '!='.




Re: [PATCH v4 20/25] block_int-common.h: assertion in the callers of BlockDriver function pointers

2021-11-17 Thread Emanuele Giuseppe Esposito




On 15/11/2021 13:48, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  block.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 94bff5c757..40c4729b8d 100644
--- a/block.c
+++ b/block.c


[...]

@@ -2148,6 +2152,7 @@ static void bdrv_child_perm(BlockDriverState 
*bs, BlockDriverState *child_bs,

  uint64_t *nperm, uint64_t *nshared)
  {
  assert(bs->drv && bs->drv->bdrv_child_perm);
+    assert(qemu_in_main_thread());
  bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
   parent_perm, parent_shared,
   nperm, nshared);


(Should’ve noticed earlier, but only did now...)

First, this function is indirectly called by bdrv_refresh_perms(). I 
understand that all perm-related functions are classified as GS.


However, bdrv_co_invalidate_cache() invokes bdrv_refresh_perms. Being 
declared in block/coroutine.h, it’s an I/O function, so it mustn’t call 
such a GS function. BlockDriver.bdrv_co_invalidate_cache(), 
bdrv_invalidate_cache(), and blk_invalidate_cache() are also classified 
as I/O functions. Perhaps all of these functions should be classified as 
GS functions?  I believe their callers and their purpose would allow for 
this.


I think that the *_invalidate_cache functions are I/O.
First of all, test-block-iothread.c calls bdrv_invalidate_cache in 
test_sync_op_invalidate_cache, which is purposefully called in an 
iothread. So that hints that we want it as I/O.
(Small mistake I just noticed: blk_invalidate_cache has the BQL 
assertion even though it is rightly put in block-backend-io.h




Second, it’s called by bdrv_child_refresh_perms(), which is called by 
block_crypto_amend_options_generic_luks().  This function is called by 
block_crypto_co_amend_luks(), which is a BlockDriver.bdrv_co_amend 
implementation, which is classified as an I/O function.


Honestly, I don’t know how to fix that mess.  The best would be if we 
could make the perm functions thread-safe and classify them as I/O, but 
it seems to me like that’s impossible (I sure hope I’m wrong).  On the 
other hand, .bdrv_co_amend very much strikes me like a GS function, but 
it isn’t.  I’m afraid it must work on nodes that are not in the main 
context, and it launches a job, so AFAIU we absolutely cannot run it 
under the BQL.


It almost seems to me like we’d need a thread-safe variant of the perm 
functions that’s allowed to fail when it cannot guarantee thread safety 
or something.  Or perhaps I’m wrong and the perm functions can actually 
be classified as thread-safe and I/O, that’d be great…


I think that since we are currently only splitting and not taking care 
of the actual I/O thread safety, we can move the _perm functions in I/O, 
and add a nice TODO to double check their thread safety.


I mean, if they are not thread-safe after the split it means they are 
not thread safe also right now.


Emanuele




[PATCH-for-6.2 v2 2/2] hw/nvme/ctrl: Prevent buffer overrun in nvme_error_info()

2021-11-17 Thread Philippe Mathieu-Daudé
Both 'buf_len' and 'off' arguments are under guest control.
Since nvme_c2h() doesn't check out of boundary access, the
caller must check for eventual buffer overrun on 'trans_len'.

Cc: qemu-sta...@nongnu.org
Fixes: 94a7897c41d ("add support for the get log page command")
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/nvme/ctrl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 93a24464647..7414f3b4dd1 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4146,7 +4146,8 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, 
uint32_t buf_len,
 uint32_t trans_len;
 NvmeErrorLog errlog;
 
-if (off >= sizeof(errlog)) {
+trans_len = MIN(sizeof(errlog) - off, buf_len);
+if (trans_len >= sizeof(errlog)) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
@@ -4155,7 +4156,6 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, 
uint32_t buf_len,
 }
 
 memset(&errlog, 0x0, sizeof(errlog));
-trans_len = MIN(sizeof(errlog) - off, buf_len);
 
 return nvme_c2h(n, (uint8_t *)&errlog, trans_len, req);
 }
-- 
2.31.1




[PATCH-for-6.2 v2 1/2] hw/nvme/ctrl: Fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)

2021-11-17 Thread Philippe Mathieu-Daudé
Both 'buf_len' and 'off' arguments are under guest control.
Since nvme_c2h() doesn't check out of boundary access, the
caller must check for eventual buffer overrun on 'trans_len'.

Cc: qemu-sta...@nongnu.org
Reported-by: Qiuhao Li 
Fixes: f432fdfa121 ("support changed namespace asynchronous event")
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/nvme/ctrl.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 6a571d18cfa..93a24464647 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4168,8 +4168,11 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t 
rae, uint32_t buf_len,
 int i = 0;
 uint32_t nsid;
 
-memset(nslist, 0x0, sizeof(nslist));
 trans_len = MIN(sizeof(nslist) - off, buf_len);
+if (trans_len >= sizeof(nslist)) {
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+memset(nslist, 0x0, sizeof(nslist));
 
 while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) !=
 NVME_CHANGED_NSID_SIZE) {
-- 
2.31.1




[PATCH-for-6.2 v2 0/2] hw/nvme/ctrl: Fix buffer overrun (CVE-2021-3947)

2021-11-17 Thread Philippe Mathieu-Daudé
Since v1:
- Do not add more buffer overflows in modify nvme_smart_info(),
  nvme_fw_log_info() and nvme_cmd_effects() (Klaus)
- Split nvme_error_info() change in another patch

Philippe Mathieu-Daudé (2):
  hw/nvme/ctrl: Fix buffer overrun in nvme_changed_nslist
(CVE-2021-3947)
  hw/nvme/ctrl: Prevent buffer overrun in nvme_error_info()

 hw/nvme/ctrl.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

-- 
2.31.1




Re: Failing QEMU iotests

2021-11-17 Thread Thomas Huth

On 17/11/2021 11.59, Hanna Reitz wrote:

On 17.11.21 11:07, Thomas Huth wrote:


 Hi!

I think it has been working fine for me a couple of weeks ago,
but when I now run:

 make check SPEED=slow

I'm getting a couple of failing iotests... not sure whether
these are known issues already, so I thought I'd summarize them
here:

...

--- /home/thuth/devel/qemu/tests/qemu-iotests/206.out
+++ 206.out.bad
@@ -99,55 +99,19 @@

 {"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"driver": "qcow2", "encrypt": {"cipher-alg": "twofish-128", 
"cipher-mode": "ctr", "format": "luks", "hash-alg": "sha1", "iter-time": 
10, "ivgen-alg": "plain64", "ivgen-hash-alg": "md5", "key-secret": 
"keysec0"}, "file": {"driver": "file", "filename": 
"TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}

 {"return": {}}
+Job failed: Unsupported cipher algorithm twofish-128 with ctr mode
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}

 image: TEST_IMG
 file format: IMGFMT
 virtual size: 32 MiB (33554432 bytes)
-encrypted: yes
 cluster_size: 65536
 Format specific information:
 compat: 1.1
 compression type: zlib
 lazy refcounts: false
 refcount bits: 16
-    encrypt:
-    ivgen alg: plain64
-    hash alg: sha1
-    cipher alg: twofish-128
-    uuid: ----
-    format: luks
-    cipher mode: ctr
-    slots:
-    [0]:
-    active: true
-    iters: XXX
-    key offset: 4096
-    stripes: 4000
-    [1]:
-    active: false
-    key offset: 69632
-    [2]:
-    active: false
-    key offset: 135168
-    [3]:
-    active: false
-    key offset: 200704
-    [4]:
-    active: false
-    key offset: 266240
-    [5]:
-    active: false
-    key offset: 331776
-    [6]:
-    active: false
-    key offset: 397312
-    [7]:
-    active: false
-    key offset: 462848
-    payload offset: 528384
-    master key iters: XXX
 corrupt: false
 extended l2: false


I doubt this worked a couple of weeks ago, but it’s definitely one that we 
should just get around to fixing. :/


Hm, maybe I've did the successful run on a different system last time ... I 
even slightly remember now having seen this before in the past on my current 
system, so yes, likely not something new.




+++ 297.out.bad
@@ -1,2 +1,21 @@
 === pylint ===
+* Module image-fleecing
+tests/image-fleecing:34:24: C0326: Exactly one space required after comma
+patterns = [('0x5d', '0', '64k'),
+    ^ (bad-whitespace)
+tests/image-fleecing:35:25: C0326: Exactly one space required after comma
+    ('0xd5', '1M',    '64k'),
+ ^ (bad-whitespace)
+tests/image-fleecing:36:26: C0326: Exactly one space required after comma
+    ('0xdc', '32M',   '64k'),
+  ^ (bad-whitespace)
+tests/image-fleecing:39:25: C0326: Exactly one space required after comma
+overwrite = [('0xab', '0', '64k'), # Full overwrite
+ ^ (bad-whitespace)
+tests/image-fleecing:48:32: C0326: Exactly one space required after comma
+remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left [1]
+    ^ (bad-whitespace)
+tests/image-fleecing:49:27: C0326: Exactly one space required after comma
+ ('0xdc', '32M',   '32k'), # Left-end of partial-right [2]
+   ^ (bad-whitespace)


This could be because your pylint is too old.  At least for the python/ 
tests we at least require 2.8.0 
(https://lists.nongnu.org/archive/html/qemu-block/2021-10/msg00768.html) and 
bad-whitespace was removed in 2.6.


Thanks, updating pylint fixed this problem, indeed!

But maybe the iotests should check the pylint version before using it?

 Thomas




Re: [PATCH v4 20/25] block_int-common.h: assertion in the callers of BlockDriver function pointers

2021-11-17 Thread Hanna Reitz

On 17.11.21 12:33, Emanuele Giuseppe Esposito wrote:



On 15/11/2021 13:48, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  block.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 94bff5c757..40c4729b8d 100644
--- a/block.c
+++ b/block.c


[...]

@@ -2148,6 +2152,7 @@ static void bdrv_child_perm(BlockDriverState 
*bs, BlockDriverState *child_bs,

  uint64_t *nperm, uint64_t *nshared)
  {
  assert(bs->drv && bs->drv->bdrv_child_perm);
+    assert(qemu_in_main_thread());
  bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
   parent_perm, parent_shared,
   nperm, nshared);


(Should’ve noticed earlier, but only did now...)

First, this function is indirectly called by bdrv_refresh_perms(). I 
understand that all perm-related functions are classified as GS.


However, bdrv_co_invalidate_cache() invokes bdrv_refresh_perms. Being 
declared in block/coroutine.h, it’s an I/O function, so it mustn’t 
call such a GS function. BlockDriver.bdrv_co_invalidate_cache(), 
bdrv_invalidate_cache(), and blk_invalidate_cache() are also 
classified as I/O functions. Perhaps all of these functions should be 
classified as GS functions?  I believe their callers and their 
purpose would allow for this.


I think that the *_invalidate_cache functions are I/O.
First of all, test-block-iothread.c calls bdrv_invalidate_cache in 
test_sync_op_invalidate_cache, which is purposefully called in an 
iothread. So that hints that we want it as I/O.


Hm, OK, but bdrv_co_invalidate_cache() calls bdrv_refresh_perms(), which 
is a GS function, so that shouldn’t work, right?


(Small mistake I just noticed: blk_invalidate_cache has the BQL 
assertion even though it is rightly put in block-backend-io.h




Second, it’s called by bdrv_child_refresh_perms(), which is called by 
block_crypto_amend_options_generic_luks().  This function is called 
by block_crypto_co_amend_luks(), which is a BlockDriver.bdrv_co_amend 
implementation, which is classified as an I/O function.


Honestly, I don’t know how to fix that mess.  The best would be if we 
could make the perm functions thread-safe and classify them as I/O, 
but it seems to me like that’s impossible (I sure hope I’m wrong).  
On the other hand, .bdrv_co_amend very much strikes me like a GS 
function, but it isn’t.  I’m afraid it must work on nodes that are 
not in the main context, and it launches a job, so AFAIU we 
absolutely cannot run it under the BQL.


It almost seems to me like we’d need a thread-safe variant of the 
perm functions that’s allowed to fail when it cannot guarantee thread 
safety or something.  Or perhaps I’m wrong and the perm functions can 
actually be classified as thread-safe and I/O, that’d be great…


I think that since we are currently only splitting and not taking care 
of the actual I/O thread safety, we can move the _perm functions in 
I/O, and add a nice TODO to double check their thread safety.


:/

I would really, really like to avoid that unless it’s clear that we can 
make them thread-safe, or that there’s a way to take the BQL in I/O 
functions to call GS functions.  But the latter still wouldn’t make the 
perm functions I/O functions.  At most, I’d sort them under common 
functions.


I mean, if they are not thread-safe after the split it means they are 
not thread safe also right now.


Yes, sorry I wasn’t clear, I think there’s a pre-existing problem that 
your series only unveils.  I don’t know whether it has implications in 
practice yet.


Hanna




Re: [PATCH-for-6.2 v2 1/2] hw/nvme/ctrl: Fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)

2021-11-17 Thread Klaus Jensen
On Nov 17 13:35, Philippe Mathieu-Daudé wrote:
> Both 'buf_len' and 'off' arguments are under guest control.
> Since nvme_c2h() doesn't check out of boundary access, the
> caller must check for eventual buffer overrun on 'trans_len'.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Qiuhao Li 
> Fixes: f432fdfa121 ("support changed namespace asynchronous event")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/nvme/ctrl.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 6a571d18cfa..93a24464647 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -4168,8 +4168,11 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, 
> uint8_t rae, uint32_t buf_len,
>  int i = 0;
>  uint32_t nsid;
>  
> -memset(nslist, 0x0, sizeof(nslist));
>  trans_len = MIN(sizeof(nslist) - off, buf_len);
> +if (trans_len >= sizeof(nslist)) {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +memset(nslist, 0x0, sizeof(nslist));
>  
>  while ((nsid = find_first_bit(n->changed_nsids, NVME_CHANGED_NSID_SIZE)) 
> !=
>  NVME_CHANGED_NSID_SIZE) {

The issue I mentioned with off > sizeof(nslist). I'll send a fix that
just deals with it like all the other log pages.

There is probably a better way to do these checks, but I'm not sure how
right now.


signature.asc
Description: PGP signature


Re: [PATCH-for-6.2 v2 2/2] hw/nvme/ctrl: Prevent buffer overrun in nvme_error_info()

2021-11-17 Thread Klaus Jensen
On Nov 17 13:35, Philippe Mathieu-Daudé wrote:
> Both 'buf_len' and 'off' arguments are under guest control.
> Since nvme_c2h() doesn't check out of boundary access, the
> caller must check for eventual buffer overrun on 'trans_len'.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 94a7897c41d ("add support for the get log page command")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/nvme/ctrl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 93a24464647..7414f3b4dd1 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -4146,7 +4146,8 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t 
> rae, uint32_t buf_len,
>  uint32_t trans_len;
>  NvmeErrorLog errlog;
>  
> -if (off >= sizeof(errlog)) {
> +trans_len = MIN(sizeof(errlog) - off, buf_len);
> +if (trans_len >= sizeof(errlog)) {
>  return NVME_INVALID_FIELD | NVME_DNR;
>  }
>  
> @@ -4155,7 +4156,6 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t 
> rae, uint32_t buf_len,
>  }
>  
>  memset(&errlog, 0x0, sizeof(errlog));
> -trans_len = MIN(sizeof(errlog) - off, buf_len);
>  
>  return nvme_c2h(n, (uint8_t *)&errlog, trans_len, req);
>  }
> -- 
> 2.31.1
> 

I don't see any buffer overrun issue prior to this patch. However, there
is a functional bug since the offset is not added in the c2h.


signature.asc
Description: PGP signature


Re: [PATCH v4 20/25] block_int-common.h: assertion in the callers of BlockDriver function pointers

2021-11-17 Thread Emanuele Giuseppe Esposito




On 17/11/2021 13:51, Hanna Reitz wrote:

On 17.11.21 12:33, Emanuele Giuseppe Esposito wrote:



On 15/11/2021 13:48, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  block.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 94bff5c757..40c4729b8d 100644
--- a/block.c
+++ b/block.c


[...]

@@ -2148,6 +2152,7 @@ static void bdrv_child_perm(BlockDriverState 
*bs, BlockDriverState *child_bs,

  uint64_t *nperm, uint64_t *nshared)
  {
  assert(bs->drv && bs->drv->bdrv_child_perm);
+    assert(qemu_in_main_thread());
  bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
   parent_perm, parent_shared,
   nperm, nshared);


(Should’ve noticed earlier, but only did now...)

First, this function is indirectly called by bdrv_refresh_perms(). I 
understand that all perm-related functions are classified as GS.


However, bdrv_co_invalidate_cache() invokes bdrv_refresh_perms. Being 
declared in block/coroutine.h, it’s an I/O function, so it mustn’t 
call such a GS function. BlockDriver.bdrv_co_invalidate_cache(), 
bdrv_invalidate_cache(), and blk_invalidate_cache() are also 
classified as I/O functions. Perhaps all of these functions should be 
classified as GS functions?  I believe their callers and their 
purpose would allow for this.


I think that the *_invalidate_cache functions are I/O.
First of all, test-block-iothread.c calls bdrv_invalidate_cache in 
test_sync_op_invalidate_cache, which is purposefully called in an 
iothread. So that hints that we want it as I/O.


Hm, OK, but bdrv_co_invalidate_cache() calls bdrv_refresh_perms(), which 
is a GS function, so that shouldn’t work, right?


Ok let's take a step back for one moment: can you tell me why the perm 
functions should be GS?


On one side I see they are also used by I/O, as we can see above. On the 
other side, I kinda see that permission should only be modified under 
BQL. But I don't have any valid point to sustain that.
So I wonder if you have any specific and more valid reason to put them 
as GS.


Maybe clarifying this will help finding a clean solution to this problem.



(Small mistake I just noticed: blk_invalidate_cache has the BQL 
assertion even though it is rightly put in block-backend-io.h




Second, it’s called by bdrv_child_refresh_perms(), which is called by 
block_crypto_amend_options_generic_luks().  This function is called 
by block_crypto_co_amend_luks(), which is a BlockDriver.bdrv_co_amend 
implementation, which is classified as an I/O function.


Honestly, I don’t know how to fix that mess.  The best would be if we 
could make the perm functions thread-safe and classify them as I/O, 
but it seems to me like that’s impossible (I sure hope I’m wrong). On 
the other hand, .bdrv_co_amend very much strikes me like a GS 
function, but it isn’t.  I’m afraid it must work on nodes that are 
not in the main context, and it launches a job, so AFAIU we 
absolutely cannot run it under the BQL.


It almost seems to me like we’d need a thread-safe variant of the 
perm functions that’s allowed to fail when it cannot guarantee thread 
safety or something.  Or perhaps I’m wrong and the perm functions can 
actually be classified as thread-safe and I/O, that’d be great…


I think that since we are currently only splitting and not taking care 
of the actual I/O thread safety, we can move the _perm functions in 
I/O, and add a nice TODO to double check their thread safety.


:/

I would really, really like to avoid that unless it’s clear that we can 
make them thread-safe, or that there’s a way to take the BQL in I/O 
functions to call GS functions.  But the latter still wouldn’t make the 
perm functions I/O functions.  At most, I’d sort them under common 
functions.


I mean, if they are not thread-safe after the split it means they are 
not thread safe also right now.


Yes, sorry I wasn’t clear, I think there’s a pre-existing problem that 
your series only unveils.  I don’t know whether it has implications in 
practice yet.


Hanna






Re: [PATCH v4 20/25] block_int-common.h: assertion in the callers of BlockDriver function pointers

2021-11-17 Thread Hanna Reitz

On 17.11.21 14:09, Emanuele Giuseppe Esposito wrote:



On 17/11/2021 13:51, Hanna Reitz wrote:

On 17.11.21 12:33, Emanuele Giuseppe Esposito wrote:



On 15/11/2021 13:48, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  block.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 94bff5c757..40c4729b8d 100644
--- a/block.c
+++ b/block.c


[...]

@@ -2148,6 +2152,7 @@ static void bdrv_child_perm(BlockDriverState 
*bs, BlockDriverState *child_bs,

  uint64_t *nperm, uint64_t *nshared)
  {
  assert(bs->drv && bs->drv->bdrv_child_perm);
+    assert(qemu_in_main_thread());
  bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
   parent_perm, parent_shared,
   nperm, nshared);


(Should’ve noticed earlier, but only did now...)

First, this function is indirectly called by bdrv_refresh_perms(). 
I understand that all perm-related functions are classified as GS.


However, bdrv_co_invalidate_cache() invokes bdrv_refresh_perms. 
Being declared in block/coroutine.h, it’s an I/O function, so it 
mustn’t call such a GS function. 
BlockDriver.bdrv_co_invalidate_cache(), bdrv_invalidate_cache(), 
and blk_invalidate_cache() are also classified as I/O functions. 
Perhaps all of these functions should be classified as GS 
functions?  I believe their callers and their purpose would allow 
for this.


I think that the *_invalidate_cache functions are I/O.
First of all, test-block-iothread.c calls bdrv_invalidate_cache in 
test_sync_op_invalidate_cache, which is purposefully called in an 
iothread. So that hints that we want it as I/O.


Hm, OK, but bdrv_co_invalidate_cache() calls bdrv_refresh_perms(), 
which is a GS function, so that shouldn’t work, right?


Ok let's take a step back for one moment: can you tell me why the perm 
functions should be GS?


On one side I see they are also used by I/O, as we can see above. On 
the other side, I kinda see that permission should only be modified 
under BQL. But I don't have any valid point to sustain that.
So I wonder if you have any specific and more valid reason to put them 
as GS.


First I believe permissions to be part of the block graph state, and so 
global state.  But, well, that could be declared just a hunch.


Second permissions have transaction mechanisms – you try to update them 
on every node, if one fails, all are aborted, else all are committed.  
So this is by no means an atomic operation but quite drawn out.


The problem with this is that I/O operations rely on permissions, e.g. 
you’ll get assertion failures when trying to write but don’t have the 
WRITE permission.  So it definitely doesn’t seem like something to me 
that can be thread-safe in the sense of cooperating nicely with other 
I/O threads.


Perhaps it’d be fine to do permission updates while the relevant 
subgraph is drained (i.e. blocking all other I/O threads), but I kind of 
feel like the same could be said for all (other) GS operations.  Like, 
you could probably do all kinds of graph changes while all involved 
subgraphs are drained.


Hanna




[PATCH for-6.2] hw/nvme: fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)

2021-11-17 Thread Klaus Jensen
From: Klaus Jensen 

Fix missing offset verification.

Cc: qemu-sta...@nongnu.org
Cc: Philippe Mathieu-Daudé 
Reported-by: Qiuhao Li 
Fixes: f432fdfa121 ("support changed namespace asynchronous event")
Signed-off-by: Klaus Jensen 
---

Note: Since its so easy to mess this fix up, the log pages code could
probably use a refactor - there is a lot of duplicated code as well and
it's easy to miss a check like this. However, defer that for 7.0.

 hw/nvme/ctrl.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 6a571d18cfae..5f573c417b3d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4168,6 +4168,11 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t 
rae, uint32_t buf_len,
 int i = 0;
 uint32_t nsid;
 
+if (off >= sizeof(nslist)) {
+trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(nslist));
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 memset(nslist, 0x0, sizeof(nslist));
 trans_len = MIN(sizeof(nslist) - off, buf_len);
 
-- 
2.34.0




Re: [PATCH v4 24/25] job.h: split function pointers in JobDriver

2021-11-17 Thread Emanuele Giuseppe Esposito




On 15/11/2021 16:11, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

The job API will be handled separately in another serie.

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  include/qemu/job.h | 16 
  1 file changed, 16 insertions(+)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 6e67b6977f..7e9e59f4b8 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -169,12 +169,21 @@ typedef struct Job {
   * Callbacks and other information about a Job driver.
   */
  struct JobDriver {
+
+    /* Fields initialized in struct definition and never changed. */


Like in patch 19, I’d prefer a slightly more verbose comment that I’d 
find more easily readable.



+
  /** Derived Job struct size */
  size_t instance_size;
  /** Enum describing the operation */
  JobType job_type;
+    /*
+ * Functions run without regard to the BQL and may run in any


s/and/that/?


+ * arbitrary thread. These functions do not need to be thread-safe
+ * because the caller ensures that are invoked from one thread at 
time.


s/that/they/ (or “that they”)

I believe .run() must be run in the job’s context, though.  Not sure if 
that’s necessary to note, but it isn’t really an arbitrary thread, and 
block jobs certainly require this (because they run in the block 
device’s context).  Or is that something that’s going to change with I/O 
threading?




What about moving .run() before the comment and add "Must be run in the 
job's context" to its comment description?


maybe also add the following assertion in job_co_entry (that calls 
job->run())?


assert(job->aio_context == qemu_get_current_aio_context());

Emanuele




Re: [PATCH v4 24/25] job.h: split function pointers in JobDriver

2021-11-17 Thread Hanna Reitz

On 17.11.21 14:43, Emanuele Giuseppe Esposito wrote:



On 15/11/2021 16:11, Hanna Reitz wrote:

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:

The job API will be handled separately in another serie.

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
---
  include/qemu/job.h | 16 
  1 file changed, 16 insertions(+)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 6e67b6977f..7e9e59f4b8 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -169,12 +169,21 @@ typedef struct Job {
   * Callbacks and other information about a Job driver.
   */
  struct JobDriver {
+
+    /* Fields initialized in struct definition and never changed. */


Like in patch 19, I’d prefer a slightly more verbose comment that I’d 
find more easily readable.



+
  /** Derived Job struct size */
  size_t instance_size;
  /** Enum describing the operation */
  JobType job_type;
+    /*
+ * Functions run without regard to the BQL and may run in any


s/and/that/?


+ * arbitrary thread. These functions do not need to be thread-safe
+ * because the caller ensures that are invoked from one thread 
at time.


s/that/they/ (or “that they”)

I believe .run() must be run in the job’s context, though.  Not sure 
if that’s necessary to note, but it isn’t really an arbitrary thread, 
and block jobs certainly require this (because they run in the block 
device’s context).  Or is that something that’s going to change with 
I/O threading?




What about moving .run() before the comment and add "Must be run in 
the job's context" to its comment description?


Sure, works for me.

maybe also add the following assertion in job_co_entry (that calls 
job->run())?


assert(job->aio_context == qemu_get_current_aio_context());


Sounds good!

Hanna




Re: Failing QEMU iotests

2021-11-17 Thread Daniel P . Berrangé
On Wed, Nov 17, 2021 at 01:50:12PM +0100, Thomas Huth wrote:
> On 17/11/2021 11.59, Hanna Reitz wrote:
> > On 17.11.21 11:07, Thomas Huth wrote:
> > > 
> > >  Hi!
> > > 
> > > I think it has been working fine for me a couple of weeks ago,
> > > but when I now run:
> > > 
> > >  make check SPEED=slow
> > > 
> > > I'm getting a couple of failing iotests... not sure whether
> > > these are known issues already, so I thought I'd summarize them
> > > here:
> ...
> > > --- /home/thuth/devel/qemu/tests/qemu-iotests/206.out
> > > +++ 206.out.bad
> > > @@ -99,55 +99,19 @@
> > > 
> > >  {"execute": "blockdev-create", "arguments": {"job-id": "job0",
> > > "options": {"driver": "qcow2", "encrypt": {"cipher-alg":
> > > "twofish-128", "cipher-mode": "ctr", "format": "luks", "hash-alg":
> > > "sha1", "iter-time": 10, "ivgen-alg": "plain64", "ivgen-hash-alg":
> > > "md5", "key-secret": "keysec0"}, "file": {"driver": "file",
> > > "filename": "TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}
> > >  {"return": {}}
> > > +Job failed: Unsupported cipher algorithm twofish-128 with ctr mode
> > >  {"execute": "job-dismiss", "arguments": {"id": "job0"}}
> > >  {"return": {}}
> > > 
> > >  image: TEST_IMG
> > >  file format: IMGFMT
> > >  virtual size: 32 MiB (33554432 bytes)
> > > -encrypted: yes
> > >  cluster_size: 65536
> > >  Format specific information:
> > >  compat: 1.1
> > >  compression type: zlib
> > >  lazy refcounts: false
> > >  refcount bits: 16
> > > -    encrypt:
> > > -    ivgen alg: plain64
> > > -    hash alg: sha1
> > > -    cipher alg: twofish-128
> > > -    uuid: ----
> > > -    format: luks
> > > -    cipher mode: ctr
> > > -    slots:
> > > -    [0]:
> > > -    active: true
> > > -    iters: XXX
> > > -    key offset: 4096
> > > -    stripes: 4000
> > > -    [1]:
> > > -    active: false
> > > -    key offset: 69632
> > > -    [2]:
> > > -    active: false
> > > -    key offset: 135168
> > > -    [3]:
> > > -    active: false
> > > -    key offset: 200704
> > > -    [4]:
> > > -    active: false
> > > -    key offset: 266240
> > > -    [5]:
> > > -    active: false
> > > -    key offset: 331776
> > > -    [6]:
> > > -    active: false
> > > -    key offset: 397312
> > > -    [7]:
> > > -    active: false
> > > -    key offset: 462848
> > > -    payload offset: 528384
> > > -    master key iters: XXX
> > >  corrupt: false
> > >  extended l2: false
> > 
> > I doubt this worked a couple of weeks ago, but it’s definitely one that
> > we should just get around to fixing. :/
> 
> Hm, maybe I've did the successful run on a different system last time ... I
> even slightly remember now having seen this before in the past on my current
> system, so yes, likely not something new.

Triggered by me switching QEMU to prefer GNUTLS for crypto by default
in 6.1, as it doesn't bother to support obscure crypto algs that no
one uses in practice for LUKS.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH 0/2] iotests: Fix crypto algorithm failures

2021-11-17 Thread Hanna Reitz
Hi,

iotests 149, 206, and 210 fail when qemu uses the gnutls crypto backend
(which is the default as of 8bd0931f6) because they try to use
algorithms that this backend does not support.

Have 206 and 210 use different algorithms instead (patch 1), and let 149
be skipped when it encounters an unsupported algorithm (patch 2).


Hanna Reitz (2):
  iotests: Use aes-128-cbc
  iotests/149: Skip on unsupported ciphers

 tests/qemu-iotests/149 | 23 ++-
 tests/qemu-iotests/206 |  4 ++--
 tests/qemu-iotests/206.out |  6 +++---
 tests/qemu-iotests/210 |  4 ++--
 tests/qemu-iotests/210.out |  6 +++---
 5 files changed, 28 insertions(+), 15 deletions(-)

-- 
2.33.1




Re: [PATCH 2/2] iotests/149: Skip on unsupported ciphers

2021-11-17 Thread Hanna Reitz

On 17.11.21 16:01, Hanna Reitz wrote:

Whenever qemu-img or qemu-io report that some cipher is unsupported,
skip the whole test, because that is probably because qemu has been
configured with the gnutls crypto backend.

We could taylor the algorithm list to what gnutls supports, but this is
a test that is run rather rarely anyway (because it requires
password-less sudo), and so it seems better and easier to skip it.  When
this test is intentionally run to check LUKS compatibility, it seems
better not to limit the algorithms but keep the list extensive.

Signed-off-by: Hanna Reitz 
---
  tests/qemu-iotests/149 | 23 ++-
  1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/149 b/tests/qemu-iotests/149
index 328fd05a4c..adcef86e88 100755
--- a/tests/qemu-iotests/149
+++ b/tests/qemu-iotests/149
@@ -230,6 +230,18 @@ def create_image(config, size_mb):
  fn.truncate(size_mb * 1024 * 1024)
  
  
+def check_cipher_support(output):

+"""Check the output of qemu-img or qemu-io for mention of the respective
+cipher algorithm being unsupported, and if so, skip this test.
+(Returns `output` for convenience.)"""
+
+if 'Unsupported cipher algorithm' in output:
+iotests.notrun('Unsupported cipher algorithm '
+   f'{config.cipher}-{config.keylen}-{config.mode}; '


Oops.  Just when I sent this I realized that during refactoring (putting 
this code into its own function) I forgot to pass `config` as a parameter.


Didn’t notice that because...  It seems to work just fine despite 
`config` not being defined here?  Python will forever remain a black box 
for me...


Hanna


+   'consider configuring qemu with a different crypto '
+   'backend')
+return output
+





[PATCH 2/2] iotests/149: Skip on unsupported ciphers

2021-11-17 Thread Hanna Reitz
Whenever qemu-img or qemu-io report that some cipher is unsupported,
skip the whole test, because that is probably because qemu has been
configured with the gnutls crypto backend.

We could taylor the algorithm list to what gnutls supports, but this is
a test that is run rather rarely anyway (because it requires
password-less sudo), and so it seems better and easier to skip it.  When
this test is intentionally run to check LUKS compatibility, it seems
better not to limit the algorithms but keep the list extensive.

Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/149 | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/149 b/tests/qemu-iotests/149
index 328fd05a4c..adcef86e88 100755
--- a/tests/qemu-iotests/149
+++ b/tests/qemu-iotests/149
@@ -230,6 +230,18 @@ def create_image(config, size_mb):
 fn.truncate(size_mb * 1024 * 1024)
 
 
+def check_cipher_support(output):
+"""Check the output of qemu-img or qemu-io for mention of the respective
+cipher algorithm being unsupported, and if so, skip this test.
+(Returns `output` for convenience.)"""
+
+if 'Unsupported cipher algorithm' in output:
+iotests.notrun('Unsupported cipher algorithm '
+   f'{config.cipher}-{config.keylen}-{config.mode}; '
+   'consider configuring qemu with a different crypto '
+   'backend')
+return output
+
 def qemu_img_create(config, size_mb):
 """Create and format a disk image with LUKS using qemu-img"""
 
@@ -253,7 +265,8 @@ def qemu_img_create(config, size_mb):
 "%dM" % size_mb]
 
 iotests.log("qemu-img " + " ".join(args), 
filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_img_pipe(*args), 
filters=[iotests.filter_test_dir])
+iotests.log(check_cipher_support(iotests.qemu_img_pipe(*args)),
+filters=[iotests.filter_test_dir])
 
 def qemu_io_image_args(config, dev=False):
 """Get the args for access an image or device with qemu-io"""
@@ -279,8 +292,8 @@ def qemu_io_write_pattern(config, pattern, offset_mb, 
size_mb, dev=False):
 args = ["-c", "write -P 0x%x %dM %dM" % (pattern, offset_mb, size_mb)]
 args.extend(qemu_io_image_args(config, dev))
 iotests.log("qemu-io " + " ".join(args), filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_io(*args), filters=[iotests.filter_test_dir,
- iotests.filter_qemu_io])
+iotests.log(check_cipher_support(iotests.qemu_io(*args)),
+filters=[iotests.filter_test_dir, iotests.filter_qemu_io])
 
 
 def qemu_io_read_pattern(config, pattern, offset_mb, size_mb, dev=False):
@@ -291,8 +304,8 @@ def qemu_io_read_pattern(config, pattern, offset_mb, 
size_mb, dev=False):
 args = ["-c", "read -P 0x%x %dM %dM" % (pattern, offset_mb, size_mb)]
 args.extend(qemu_io_image_args(config, dev))
 iotests.log("qemu-io " + " ".join(args), filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_io(*args), filters=[iotests.filter_test_dir,
- iotests.filter_qemu_io])
+iotests.log(check_cipher_support(iotests.qemu_io(*args)),
+filters=[iotests.filter_test_dir, iotests.filter_qemu_io])
 
 
 def test_once(config, qemu_img=False):
-- 
2.33.1




[PATCH 1/2] iotests: Use aes-128-cbc

2021-11-17 Thread Hanna Reitz
Our gnutls crypto backend (which is the default as of 8bd0931f6)
supports neither twofish-128 nor the CTR mode.  CBC and aes-128 are
supported by all of our backends (as far as I can tell), so use
aes-128-cbc in our iotests.

(We could also use e.g. aes-256-cbc, but the different key sizes would
lead to different key slot offsets and so change the reference output
more, which is why I went with aes-128.)

Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/206 | 4 ++--
 tests/qemu-iotests/206.out | 6 +++---
 tests/qemu-iotests/210 | 4 ++--
 tests/qemu-iotests/210.out | 6 +++---
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/206 b/tests/qemu-iotests/206
index c3cdad4ce4..10eff343f7 100755
--- a/tests/qemu-iotests/206
+++ b/tests/qemu-iotests/206
@@ -162,8 +162,8 @@ with iotests.FilePath('t.qcow2') as disk_path, \
  'encrypt': {
  'format': 'luks',
  'key-secret': 'keysec0',
- 'cipher-alg': 'twofish-128',
- 'cipher-mode': 'ctr',
+ 'cipher-alg': 'aes-128',
+ 'cipher-mode': 'cbc',
  'ivgen-alg': 'plain64',
  'ivgen-hash-alg': 'md5',
  'hash-alg': 'sha1',
diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index 3593e8e9c2..80cd274223 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -97,7 +97,7 @@ Format specific information:
 
 === Successful image creation (encrypted) ===
 
-{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"driver": "qcow2", "encrypt": {"cipher-alg": "twofish-128", "cipher-mode": 
"ctr", "format": "luks", "hash-alg": "sha1", "iter-time": 10, "ivgen-alg": 
"plain64", "ivgen-hash-alg": "md5", "key-secret": "keysec0"}, "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}
+{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"driver": "qcow2", "encrypt": {"cipher-alg": "aes-128", "cipher-mode": "cbc", 
"format": "luks", "hash-alg": "sha1", "iter-time": 10, "ivgen-alg": "plain64", 
"ivgen-hash-alg": "md5", "key-secret": "keysec0"}, "file": {"driver": "file", 
"filename": "TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}
 {"return": {}}
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
@@ -115,10 +115,10 @@ Format specific information:
 encrypt:
 ivgen alg: plain64
 hash alg: sha1
-cipher alg: twofish-128
+cipher alg: aes-128
 uuid: ----
 format: luks
-cipher mode: ctr
+cipher mode: cbc
 slots:
 [0]:
 active: true
diff --git a/tests/qemu-iotests/210 b/tests/qemu-iotests/210
index 5a62ed4dd1..a4dcc5fe59 100755
--- a/tests/qemu-iotests/210
+++ b/tests/qemu-iotests/210
@@ -83,8 +83,8 @@ with iotests.FilePath('t.luks') as disk_path, \
  },
  'size': size,
  'key-secret': 'keysec0',
- 'cipher-alg': 'twofish-128',
- 'cipher-mode': 'ctr',
+ 'cipher-alg': 'aes-128',
+ 'cipher-mode': 'cbc',
  'ivgen-alg': 'plain64',
  'ivgen-hash-alg': 'md5',
  'hash-alg': 'sha1',
diff --git a/tests/qemu-iotests/210.out b/tests/qemu-iotests/210.out
index 55c0844370..96d9f749dd 100644
--- a/tests/qemu-iotests/210.out
+++ b/tests/qemu-iotests/210.out
@@ -59,7 +59,7 @@ Format specific information:
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
 
-{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"cipher-alg": "twofish-128", "cipher-mode": "ctr", "driver": "luks", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.luks"}, "hash-alg": "sha1", 
"iter-time": 10, "ivgen-alg": "plain64", "ivgen-hash-alg": "md5", "key-secret": 
"keysec0", "size": 67108864}}}
+{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"cipher-alg": "aes-128", "cipher-mode": "cbc", "driver": "luks", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.luks"}, "hash-alg": "sha1", 
"iter-time": 10, "ivgen-alg": "plain64", "ivgen-hash-alg": "md5", "key-secret": 
"keysec0", "size": 67108864}}}
 {"return": {}}
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
@@ -71,9 +71,9 @@ encrypted: yes
 Format specific information:
 ivgen alg: plain64
 hash alg: sha1
-cipher alg: twofish-128
+cipher alg: aes-128
 uuid: ----
-cipher mode: ctr
+cipher mode: cbc
 slots:
 [0]:
 active: true
-- 
2.33.1




[PATCH v2 0/2] iotests: Fix crypto algorithm failures

2021-11-17 Thread Hanna Reitz
Hi,

iotests 149, 206, and 210 fail when qemu uses the gnutls crypto backend
(which is the default as of 8bd0931f6) because they try to use
algorithms that this backend does not support.

Have 206 and 210 use different algorithms instead (patch 1), and let 149
be skipped when it encounters an unsupported algorithm (patch 2).


v2:
- Fixed the `check_cipher_support()` function introduced in patch 2
  (forgot to pass `config`, though it worked even without, apparently
  because `config` is a global-scope variable)

- Also a good opportunity to CC Dan, who I forgot on v1


git-backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/2:[] [--] 'iotests: Use aes-128-cbc'
002/2:[0008] [FC] 'iotests/149: Skip on unsupported ciphers'


Hanna Reitz (2):
  iotests: Use aes-128-cbc
  iotests/149: Skip on unsupported ciphers

 tests/qemu-iotests/149 | 23 ++-
 tests/qemu-iotests/206 |  4 ++--
 tests/qemu-iotests/206.out |  6 +++---
 tests/qemu-iotests/210 |  4 ++--
 tests/qemu-iotests/210.out |  6 +++---
 5 files changed, 28 insertions(+), 15 deletions(-)

-- 
2.33.1




[PATCH v2 2/2] iotests/149: Skip on unsupported ciphers

2021-11-17 Thread Hanna Reitz
Whenever qemu-img or qemu-io report that some cipher is unsupported,
skip the whole test, because that is probably because qemu has been
configured with the gnutls crypto backend.

We could taylor the algorithm list to what gnutls supports, but this is
a test that is run rather rarely anyway (because it requires
password-less sudo), and so it seems better and easier to skip it.  When
this test is intentionally run to check LUKS compatibility, it seems
better not to limit the algorithms but keep the list extensive.

Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/149 | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/149 b/tests/qemu-iotests/149
index 328fd05a4c..d49646ca60 100755
--- a/tests/qemu-iotests/149
+++ b/tests/qemu-iotests/149
@@ -230,6 +230,18 @@ def create_image(config, size_mb):
 fn.truncate(size_mb * 1024 * 1024)
 
 
+def check_cipher_support(config, output):
+"""Check the output of qemu-img or qemu-io for mention of the respective
+cipher algorithm being unsupported, and if so, skip this test.
+(Returns `output` for convenience.)"""
+
+if 'Unsupported cipher algorithm' in output:
+iotests.notrun('Unsupported cipher algorithm '
+   f'{config.cipher}-{config.keylen}-{config.mode}; '
+   'consider configuring qemu with a different crypto '
+   'backend')
+return output
+
 def qemu_img_create(config, size_mb):
 """Create and format a disk image with LUKS using qemu-img"""
 
@@ -253,7 +265,8 @@ def qemu_img_create(config, size_mb):
 "%dM" % size_mb]
 
 iotests.log("qemu-img " + " ".join(args), 
filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_img_pipe(*args), 
filters=[iotests.filter_test_dir])
+iotests.log(check_cipher_support(config, iotests.qemu_img_pipe(*args)),
+filters=[iotests.filter_test_dir])
 
 def qemu_io_image_args(config, dev=False):
 """Get the args for access an image or device with qemu-io"""
@@ -279,8 +292,8 @@ def qemu_io_write_pattern(config, pattern, offset_mb, 
size_mb, dev=False):
 args = ["-c", "write -P 0x%x %dM %dM" % (pattern, offset_mb, size_mb)]
 args.extend(qemu_io_image_args(config, dev))
 iotests.log("qemu-io " + " ".join(args), filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_io(*args), filters=[iotests.filter_test_dir,
- iotests.filter_qemu_io])
+iotests.log(check_cipher_support(config, iotests.qemu_io(*args)),
+filters=[iotests.filter_test_dir, iotests.filter_qemu_io])
 
 
 def qemu_io_read_pattern(config, pattern, offset_mb, size_mb, dev=False):
@@ -291,8 +304,8 @@ def qemu_io_read_pattern(config, pattern, offset_mb, 
size_mb, dev=False):
 args = ["-c", "read -P 0x%x %dM %dM" % (pattern, offset_mb, size_mb)]
 args.extend(qemu_io_image_args(config, dev))
 iotests.log("qemu-io " + " ".join(args), filters=[iotests.filter_test_dir])
-iotests.log(iotests.qemu_io(*args), filters=[iotests.filter_test_dir,
- iotests.filter_qemu_io])
+iotests.log(check_cipher_support(config, iotests.qemu_io(*args)),
+filters=[iotests.filter_test_dir, iotests.filter_qemu_io])
 
 
 def test_once(config, qemu_img=False):
-- 
2.33.1




[PATCH v2 1/2] iotests: Use aes-128-cbc

2021-11-17 Thread Hanna Reitz
Our gnutls crypto backend (which is the default as of 8bd0931f6)
supports neither twofish-128 nor the CTR mode.  CBC and aes-128 are
supported by all of our backends (as far as I can tell), so use
aes-128-cbc in our iotests.

(We could also use e.g. aes-256-cbc, but the different key sizes would
lead to different key slot offsets and so change the reference output
more, which is why I went with aes-128.)

Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/206 | 4 ++--
 tests/qemu-iotests/206.out | 6 +++---
 tests/qemu-iotests/210 | 4 ++--
 tests/qemu-iotests/210.out | 6 +++---
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/206 b/tests/qemu-iotests/206
index c3cdad4ce4..10eff343f7 100755
--- a/tests/qemu-iotests/206
+++ b/tests/qemu-iotests/206
@@ -162,8 +162,8 @@ with iotests.FilePath('t.qcow2') as disk_path, \
  'encrypt': {
  'format': 'luks',
  'key-secret': 'keysec0',
- 'cipher-alg': 'twofish-128',
- 'cipher-mode': 'ctr',
+ 'cipher-alg': 'aes-128',
+ 'cipher-mode': 'cbc',
  'ivgen-alg': 'plain64',
  'ivgen-hash-alg': 'md5',
  'hash-alg': 'sha1',
diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index 3593e8e9c2..80cd274223 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -97,7 +97,7 @@ Format specific information:
 
 === Successful image creation (encrypted) ===
 
-{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"driver": "qcow2", "encrypt": {"cipher-alg": "twofish-128", "cipher-mode": 
"ctr", "format": "luks", "hash-alg": "sha1", "iter-time": 10, "ivgen-alg": 
"plain64", "ivgen-hash-alg": "md5", "key-secret": "keysec0"}, "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}
+{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"driver": "qcow2", "encrypt": {"cipher-alg": "aes-128", "cipher-mode": "cbc", 
"format": "luks", "hash-alg": "sha1", "iter-time": 10, "ivgen-alg": "plain64", 
"ivgen-hash-alg": "md5", "key-secret": "keysec0"}, "file": {"driver": "file", 
"filename": "TEST_DIR/PID-t.qcow2"}, "size": 33554432}}}
 {"return": {}}
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
@@ -115,10 +115,10 @@ Format specific information:
 encrypt:
 ivgen alg: plain64
 hash alg: sha1
-cipher alg: twofish-128
+cipher alg: aes-128
 uuid: ----
 format: luks
-cipher mode: ctr
+cipher mode: cbc
 slots:
 [0]:
 active: true
diff --git a/tests/qemu-iotests/210 b/tests/qemu-iotests/210
index 5a62ed4dd1..a4dcc5fe59 100755
--- a/tests/qemu-iotests/210
+++ b/tests/qemu-iotests/210
@@ -83,8 +83,8 @@ with iotests.FilePath('t.luks') as disk_path, \
  },
  'size': size,
  'key-secret': 'keysec0',
- 'cipher-alg': 'twofish-128',
- 'cipher-mode': 'ctr',
+ 'cipher-alg': 'aes-128',
+ 'cipher-mode': 'cbc',
  'ivgen-alg': 'plain64',
  'ivgen-hash-alg': 'md5',
  'hash-alg': 'sha1',
diff --git a/tests/qemu-iotests/210.out b/tests/qemu-iotests/210.out
index 55c0844370..96d9f749dd 100644
--- a/tests/qemu-iotests/210.out
+++ b/tests/qemu-iotests/210.out
@@ -59,7 +59,7 @@ Format specific information:
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
 
-{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"cipher-alg": "twofish-128", "cipher-mode": "ctr", "driver": "luks", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.luks"}, "hash-alg": "sha1", 
"iter-time": 10, "ivgen-alg": "plain64", "ivgen-hash-alg": "md5", "key-secret": 
"keysec0", "size": 67108864}}}
+{"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": 
{"cipher-alg": "aes-128", "cipher-mode": "cbc", "driver": "luks", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-t.luks"}, "hash-alg": "sha1", 
"iter-time": 10, "ivgen-alg": "plain64", "ivgen-hash-alg": "md5", "key-secret": 
"keysec0", "size": 67108864}}}
 {"return": {}}
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
@@ -71,9 +71,9 @@ encrypted: yes
 Format specific information:
 ivgen alg: plain64
 hash alg: sha1
-cipher alg: twofish-128
+cipher alg: aes-128
 uuid: ----
-cipher mode: ctr
+cipher mode: cbc
 slots:
 [0]:
 active: true
-- 
2.33.1




Re: [PATCH v2 2/2] iotests/149: Skip on unsupported ciphers

2021-11-17 Thread Daniel P . Berrangé
On Wed, Nov 17, 2021 at 04:17:07PM +0100, Hanna Reitz wrote:
> Whenever qemu-img or qemu-io report that some cipher is unsupported,
> skip the whole test, because that is probably because qemu has been
> configured with the gnutls crypto backend.
> 
> We could taylor the algorithm list to what gnutls supports, but this is
> a test that is run rather rarely anyway (because it requires
> password-less sudo), and so it seems better and easier to skip it.  When
> this test is intentionally run to check LUKS compatibility, it seems
> better not to limit the algorithms but keep the list extensive.

I'd really like to figure out a way to be able to partially run
this test. When I have hit problems in the past, I needed to
run specific tests, but then the expected output always contains
everything.  I've thought of a few options

 - Split it into many stanadlone tests - eg
 tests/qemu-iotests/tests/luks-host-$ALG

 - Split only the expected output eg
 
 149-$SUBTEST

  and have a way to indicate which of expected output files
  we need to concatenate for the set of subtests that we
  run.

 - Introduce some template syntax in expected output
   tha can be used to munge the output.

 - Stop comparing expected output entirely and just
   then this into a normal python unit test.

 - Insert your idea here ?

> 
> Signed-off-by: Hanna Reitz 
> ---
>  tests/qemu-iotests/149 | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 1/2] iotests: Use aes-128-cbc

2021-11-17 Thread Daniel P . Berrangé
On Wed, Nov 17, 2021 at 04:17:06PM +0100, Hanna Reitz wrote:
> Our gnutls crypto backend (which is the default as of 8bd0931f6)
> supports neither twofish-128 nor the CTR mode.  CBC and aes-128 are
> supported by all of our backends (as far as I can tell), so use
> aes-128-cbc in our iotests.

Yes, AES is guarnateed by all backends, as is ECB,CBC & XTS modes.

> 
> (We could also use e.g. aes-256-cbc, but the different key sizes would
> lead to different key slot offsets and so change the reference output
> more, which is why I went with aes-128.)
> 
> Signed-off-by: Hanna Reitz 
> ---
>  tests/qemu-iotests/206 | 4 ++--
>  tests/qemu-iotests/206.out | 6 +++---
>  tests/qemu-iotests/210 | 4 ++--
>  tests/qemu-iotests/210.out | 6 +++---
>  4 files changed, 10 insertions(+), 10 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2] nbd/server: Suppress Broken pipe errors on abrupt disconnection

2021-11-17 Thread Eric Blake
Revisiting an older thread

On Mon, Sep 13, 2021 at 04:19:36PM +0100, Richard W.M. Jones wrote:
> $ rm -f /tmp/sock /tmp/pid
> $ qemu-img create -f qcow2 /tmp/disk.qcow2 1M
> $ qemu-nbd -t --format=qcow2 --socket=/tmp/sock --pid-file=/tmp/pid 
> /tmp/disk.qcow2 &
> $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()'
> qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to 
> socket: Broken pipe
> $ killall qemu-nbd
> 
> nbdsh is abruptly dropping the NBD connection here which is a valid
> way to close the connection.  It seems unnecessary to print an error
> in this case so this commit suppresses it.

I've investigated this a bit further, and found that we have a
regression in 6.0 when the abrupt disconnect occurs over TCP instead
of Unix sockets.  Prior to that point, if a client does a hard
disconnect with no pending message, a 5.2 server reports:

$ qemu-nbd -f raw file &
[1] 1059670
$ nbdsh -u nbd://localhost:10809 -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected 
end-of-file before all bytes were read
[1]+  Doneqemu-nbd -f raw file

but in 6.0, we started reporting:

qemu-nbd: Disconnect client, due to: Request handling failed in intermediate 
state

and looking further, I discovered that if you also use --trace=nbd_\*,
qemu-nbd ends up tracing uninitialized memory; it is now failing
because it sees an unexpected magic number from the uninitialized
memory, rather than its earlier detection of early EOF.

It's interesting that a Unix socket sees EPIPE when a TCP socket does
not, but it also means that your patch in isolation won't solve the
TCP issue, so I'm trying to come up with something today.

> 
> Note that if you call the nbdsh h.shutdown() method then the message
> was not printed:
> 
> $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()' -c 'h.shutdown()'
> 
> Signed-off-by: Richard W.M. Jones 
> ---
>  nbd/server.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 3927f7789d..e0a43802b2 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -2669,7 +2669,12 @@ static coroutine_fn void nbd_trip(void *opaque)
>  ret = nbd_handle_request(client, &request, req->data, &local_err);
>  }
>  if (ret < 0) {
> -error_prepend(&local_err, "Failed to send reply: ");
> +if (errno != EPIPE) {
> +error_prepend(&local_err, "Failed to send reply: ");
> +} else {
> +error_free(local_err);
> +local_err = NULL;
> +}
>  goto disconnect;
>  }
>  
> -- 
> 2.32.0
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




[PATCH v2 00/13] Eliminate drive_get_next()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If
the order changes, or new calls appear "in the middle", unit numbers
change.  ABI break.  Hard to spot in review.  Replace its uses by
drive_get(), then delete it.

Markus Armbruster (13):
  hw/sd/ssi-sd: Do not create SD card within controller's realize
  hw: Replace trivial drive_get_next() by drive_get()
  hw/arm/npcm7xx_boards: Replace drive_get_next() by drive_get()
  hw/arm/versatilepb hw/arm/vexpress: Replace drive_get_next() by
drive_get()
  hw/arm/imx25_pdk: Replace drive_get_next() by drive_get()
  hw/arm/mcimx6ul-evk: Replace drive_get_next() by drive_get()
  hw/arm/mcimx7d-sabre: Replace drive_get_next() by drive_get()
  hw/arm/xlnx-versal-virt: Replace drive_get_next() by drive_get()
  hw/microblaze: Replace drive_get_next() by drive_get()
  hw/arm/xlnx-zcu102: Replace drive_get_next() by drive_get()
  hw/arm/xilinx_zynq: Replace drive_get_next() by drive_get()
  hw/arm/aspeed: Replace drive_get_next() by drive_get()
  blockdev: Drop unused drive_get_next()

 include/sysemu/blockdev.h   |  1 -
 blockdev.c  | 10 --
 hw/arm/aspeed.c | 21 +
 hw/arm/cubieboard.c |  2 +-
 hw/arm/imx25_pdk.c  |  2 +-
 hw/arm/integratorcp.c   |  2 +-
 hw/arm/mcimx6ul-evk.c   |  2 +-
 hw/arm/mcimx7d-sabre.c  |  2 +-
 hw/arm/msf2-som.c   |  2 +-
 hw/arm/npcm7xx_boards.c |  6 +++---
 hw/arm/orangepi.c   |  2 +-
 hw/arm/raspi.c  |  2 +-
 hw/arm/realview.c   |  2 +-
 hw/arm/sabrelite.c  |  2 +-
 hw/arm/stellaris.c  | 15 ++-
 hw/arm/versatilepb.c|  4 ++--
 hw/arm/vexpress.c   |  6 +++---
 hw/arm/xilinx_zynq.c| 16 +---
 hw/arm/xlnx-versal-virt.c   |  3 ++-
 hw/arm/xlnx-zcu102.c|  6 +++---
 hw/microblaze/petalogix_ml605_mmu.c |  2 +-
 hw/misc/sifive_u_otp.c  |  2 +-
 hw/riscv/microchip_pfsoc.c  |  2 +-
 hw/riscv/sifive_u.c | 15 +--
 hw/sd/ssi-sd.c  | 29 +
 hw/sparc64/niagara.c|  2 +-
 26 files changed, 77 insertions(+), 83 deletions(-)

-- 
2.31.1




[PATCH v2 05/13] hw/arm/imx25_pdk: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "imx25-pdk" connects backends with drive_get_next() in a
counting loop.  Change it to use drive_get() directly.  This makes the
unit numbers explicit in the code.

Cc: Peter Maydell 
Cc: Jean-Christophe Dubois 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/imx25_pdk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/imx25_pdk.c b/hw/arm/imx25_pdk.c
index bd16acd4d9..6dff000163 100644
--- a/hw/arm/imx25_pdk.c
+++ b/hw/arm/imx25_pdk.c
@@ -123,7 +123,7 @@ static void imx25_pdk_init(MachineState *machine)
 DriveInfo *di;
 BlockBackend *blk;
 
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, i);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(&s->soc.esdhc[i]), "sd-bus");
 carddev = qdev_new(TYPE_SD_CARD);
-- 
2.31.1




[PATCH v2 04/13] hw/arm/versatilepb hw/arm/vexpress: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

The versatile and vexpress machines ("versatileab", "versatilepb",
"vexpress-a9", "vexpress-a15") connect just one or two backends of a
type with drive_get_next().  Change them to use drive_get() directly.
This makes the unit numbers explicit in the code.

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/versatilepb.c | 4 ++--
 hw/arm/vexpress.c| 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
index 575399c4fc..ecc1f6cf74 100644
--- a/hw/arm/versatilepb.c
+++ b/hw/arm/versatilepb.c
@@ -310,7 +310,7 @@ static void versatile_init(MachineState *machine, int 
board_id)
 qdev_connect_gpio_out(sysctl, 0, qdev_get_gpio_in(dev, 0));
 
 dev = sysbus_create_varargs("pl181", 0x10005000, sic[22], sic[1], NULL);
-dinfo = drive_get_next(IF_SD);
+dinfo = drive_get(IF_SD, 0, 0);
 if (dinfo) {
 DeviceState *card;
 
@@ -322,7 +322,7 @@ static void versatile_init(MachineState *machine, int 
board_id)
 }
 
 dev = sysbus_create_varargs("pl181", 0x1000b000, sic[23], sic[2], NULL);
-dinfo = drive_get_next(IF_SD);
+dinfo = drive_get(IF_SD, 0, 1);
 if (dinfo) {
 DeviceState *card;
 
diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index 58481c0762..966758cf82 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -625,7 +625,7 @@ static void vexpress_common_init(MachineState *machine)
   qdev_get_gpio_in(sysctl, ARM_SYSCTL_GPIO_MMC_WPROT));
 qdev_connect_gpio_out_named(dev, "card-inserted", 0,
   qdev_get_gpio_in(sysctl, 
ARM_SYSCTL_GPIO_MMC_CARDIN));
-dinfo = drive_get_next(IF_SD);
+dinfo = drive_get(IF_SD, 0, 0);
 if (dinfo) {
 DeviceState *card;
 
@@ -657,7 +657,7 @@ static void vexpress_common_init(MachineState *machine)
 
 sysbus_create_simple("pl111", map[VE_CLCD], pic[14]);
 
-dinfo = drive_get_next(IF_PFLASH);
+dinfo = drive_get(IF_PFLASH, 0, 0);
 pflash0 = ve_pflash_cfi01_register(map[VE_NORFLASH0], "vexpress.flash0",
dinfo);
 if (!pflash0) {
@@ -673,7 +673,7 @@ static void vexpress_common_init(MachineState *machine)
 memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], flashalias);
 }
 
-dinfo = drive_get_next(IF_PFLASH);
+dinfo = drive_get(IF_PFLASH, 0, 1);
 if (!ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1",
   dinfo)) {
 error_report("vexpress: error registering flash 1");
-- 
2.31.1




[PATCH v2 10/13] hw/arm/xlnx-zcu102: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "xlnx-zcu102" connects backends with drive_get_next() in
several counting loops.  Change it to use drive_get() directly.  This
makes the unit numbers explicit in the code.

Cc: Alistair Francis 
Cc: "Edgar E. Iglesias" 
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/xlnx-zcu102.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c
index 3dc2b5e8ca..45eb19ab3b 100644
--- a/hw/arm/xlnx-zcu102.c
+++ b/hw/arm/xlnx-zcu102.c
@@ -169,7 +169,7 @@ static void xlnx_zcu102_init(MachineState *machine)
 /* Create and plug in the SD cards */
 for (i = 0; i < XLNX_ZYNQMP_NUM_SDHCI; i++) {
 BusState *bus;
-DriveInfo *di = drive_get_next(IF_SD);
+DriveInfo *di = drive_get(IF_SD, 0, i);
 BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
 DeviceState *carddev;
 char *bus_name;
@@ -190,7 +190,7 @@ static void xlnx_zcu102_init(MachineState *machine)
 BusState *spi_bus;
 DeviceState *flash_dev;
 qemu_irq cs_line;
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, i);
 gchar *bus_name = g_strdup_printf("spi%d", i);
 
 spi_bus = qdev_get_child_bus(DEVICE(&s->soc), bus_name);
@@ -212,7 +212,7 @@ static void xlnx_zcu102_init(MachineState *machine)
 BusState *spi_bus;
 DeviceState *flash_dev;
 qemu_irq cs_line;
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, XLNX_ZYNQMP_NUM_SPIS + i);
 int bus = i / XLNX_ZYNQMP_NUM_QSPI_BUS_CS;
 gchar *bus_name = g_strdup_printf("qspi%d", bus);
 
-- 
2.31.1




[PATCH v2 01/13] hw/sd/ssi-sd: Do not create SD card within controller's realize

2021-11-17 Thread Markus Armbruster
ssi_sd_realize() creates an "sd-card" device.  This is inappropriate,
and marked FIXME.

Move it to the boards that create these devices.  Prior art: commit
eb4f566bbb for device "generic-sdhci", and commit 26c607b86b for
device "pl181".

The device remains not user-creatable, because its users should (and
do) wire up its GPIO chip-select line.

Cc: Peter Maydell 
Cc: Alistair Francis 
Cc: Bin Meng 
Cc: Palmer Dabbelt 
Cc: "Philippe Mathieu-Daudé" 
Cc: qemu-...@nongnu.org
Cc: qemu-ri...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/stellaris.c  | 15 ++-
 hw/riscv/sifive_u.c | 13 -
 hw/sd/ssi-sd.c  | 29 +
 3 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
index 78827ace6b..b6c8a5d609 100644
--- a/hw/arm/stellaris.c
+++ b/hw/arm/stellaris.c
@@ -10,6 +10,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "hw/sysbus.h"
+#include "hw/sd/sd.h"
 #include "hw/ssi/ssi.h"
 #include "hw/arm/boot.h"
 #include "qemu/timer.h"
@@ -1157,6 +1158,9 @@ static void stellaris_init(MachineState *ms, 
stellaris_board_info *board)
 void *bus;
 DeviceState *sddev;
 DeviceState *ssddev;
+DriveInfo *dinfo;
+DeviceState *carddev;
+BlockBackend *blk;
 
 /*
  * Some boards have both an OLED controller and SD card connected 
to
@@ -1221,8 +1225,17 @@ static void stellaris_init(MachineState *ms, 
stellaris_board_info *board)
  *  - Make the ssd0323 OLED controller chipselect active-low
  */
 bus = qdev_get_child_bus(dev, "ssi");
-
 sddev = ssi_create_peripheral(bus, "ssi-sd");
+
+dinfo = drive_get(IF_SD, 0, 0);
+blk = dinfo ? blk_by_legacy_dinfo(dinfo) : NULL;
+carddev = qdev_new(TYPE_SD_CARD);
+qdev_prop_set_drive_err(carddev, "drive", blk, &error_fatal);
+qdev_prop_set_bit(carddev, "spi", true);
+qdev_realize_and_unref(carddev,
+   qdev_get_child_bus(sddev, "sd-bus"),
+   &error_fatal);
+
 ssddev = ssi_create_peripheral(bus, "ssd0323");
 gpio_out[GPIO_D][0] = qemu_irq_split(
 qdev_get_gpio_in_named(sddev, SSI_GPIO_CS, 0),
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 589ae72a59..a4ecadaf12 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -46,6 +46,7 @@
 #include "hw/char/serial.h"
 #include "hw/cpu/cluster.h"
 #include "hw/misc/unimp.h"
+#include "hw/sd/sd.h"
 #include "hw/ssi/ssi.h"
 #include "target/riscv/cpu.h"
 #include "hw/riscv/riscv_hart.h"
@@ -536,7 +537,8 @@ static void sifive_u_machine_init(MachineState *machine)
 uint32_t fdt_load_addr;
 uint64_t kernel_entry;
 DriveInfo *dinfo;
-DeviceState *flash_dev, *sd_dev;
+BlockBackend *blk;
+DeviceState *flash_dev, *sd_dev, *card_dev;
 qemu_irq flash_cs, sd_cs;
 
 /* Initialize SoC */
@@ -686,6 +688,15 @@ static void sifive_u_machine_init(MachineState *machine)
 
 sd_cs = qdev_get_gpio_in_named(sd_dev, SSI_GPIO_CS, 0);
 sysbus_connect_irq(SYS_BUS_DEVICE(&s->soc.spi2), 1, sd_cs);
+
+dinfo = drive_get(IF_SD, 0, 0);
+blk = dinfo ? blk_by_legacy_dinfo(dinfo) : NULL;
+card_dev = qdev_new(TYPE_SD_CARD);
+qdev_prop_set_drive_err(card_dev, "drive", blk, &error_fatal);
+qdev_prop_set_bit(card_dev, "spi", true);
+qdev_realize_and_unref(card_dev,
+   qdev_get_child_bus(sd_dev, "sd-bus"),
+   &error_fatal);
 }
 
 static bool sifive_u_machine_get_start_in_flash(Object *obj, Error **errp)
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index e60854eeef..167c03b780 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -368,36 +368,9 @@ static const VMStateDescription vmstate_ssi_sd = {
 
 static void ssi_sd_realize(SSIPeripheral *d, Error **errp)
 {
-ERRP_GUARD();
 ssi_sd_state *s = SSI_SD(d);
-DeviceState *carddev;
-DriveInfo *dinfo;
 
 qbus_init(&s->sdbus, sizeof(s->sdbus), TYPE_SD_BUS, DEVICE(d), "sd-bus");
-
-/* Create and plug in the sd card */
-/* FIXME use a qdev drive property instead of drive_get_next() */
-dinfo = drive_get_next(IF_SD);
-carddev = qdev_new(TYPE_SD_CARD);
-if (dinfo) {
-if (!qdev_prop_set_drive_err(carddev, "drive",
- blk_by_legacy_dinfo(dinfo), errp)) {
-goto fail;
-}
-}
-
-if (!object_property_set_bool(OBJECT(carddev), "spi", true, errp)) {
-goto fail;
-}
-
-if (!qdev_realize_and_unref(carddev, BUS(&s->sdbus), errp)) {
-goto fail;
-}
-
-return;
-
-fail:
-error_prepend(errp, "failed to init SD card: ");
 }
 
 static void ssi_sd_reset(DeviceState *dev)
@@ -426,7 +399,7 @@ static void ssi_sd_class_init(ObjectClass *klass, void 
*data)
 k->cs

[PATCH v2 03/13] hw/arm/npcm7xx_boards: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "quanta-gbs-bmc" connects just one backend with
drive_get_next(), but with a helper function.  Change it to use
drive_get() directly.  This makes the unit numbers explicit in the
code.

Cc: Havard Skinnemoen 
Cc: Tyrone Ting 
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/npcm7xx_boards.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
index dec7d16ae5..d8a49e4e85 100644
--- a/hw/arm/npcm7xx_boards.c
+++ b/hw/arm/npcm7xx_boards.c
@@ -84,9 +84,9 @@ static void npcm7xx_connect_dram(NPCM7xxState *soc, 
MemoryRegion *dram)
  &error_abort);
 }
 
-static void sdhci_attach_drive(SDHCIState *sdhci)
+static void sdhci_attach_drive(SDHCIState *sdhci, int unit)
 {
-DriveInfo *di = drive_get_next(IF_SD);
+DriveInfo *di = drive_get(IF_SD, 0, unit);
 BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
 
 BusState *bus = qdev_get_child_bus(DEVICE(sdhci), "sd-bus");
@@ -374,7 +374,7 @@ static void quanta_gbs_init(MachineState *machine)
   drive_get(IF_MTD, 0, 0));
 
 quanta_gbs_i2c_init(soc);
-sdhci_attach_drive(&soc->mmc.sdhci);
+sdhci_attach_drive(&soc->mmc.sdhci, 0);
 npcm7xx_load_kernel(machine, soc);
 }
 
-- 
2.31.1




[PATCH v2 08/13] hw/arm/xlnx-versal-virt: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "xlnx-versal-virt" connects backends with drive_get_next() in
a counting loop.  Change it to use drive_get() directly.  This makes
the unit numbers explicit in the code.

Cc: Alistair Francis 
Cc: "Edgar E. Iglesias" 
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/xlnx-versal-virt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index d2f55e29b6..0c5edc898e 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -669,7 +669,8 @@ static void versal_virt_init(MachineState *machine)
 
 /* Plugin SD cards.  */
 for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) {
-sd_plugin_card(&s->soc.pmc.iou.sd[i], drive_get_next(IF_SD));
+sd_plugin_card(&s->soc.pmc.iou.sd[i],
+   drive_get(IF_SD, 0, i));
 }
 
 s->binfo.ram_size = machine->ram_size;
-- 
2.31.1




[PATCH v2 09/13] hw/microblaze: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "petalogix-ml605" connects backends with drive_get_next() in a
counting loop.  Change it to use drive_get() directly.  This makes the
unit numbers explicit in the code.

Cc: "Edgar E. Iglesias" 
Signed-off-by: Markus Armbruster 
---
 hw/microblaze/petalogix_ml605_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/microblaze/petalogix_ml605_mmu.c 
b/hw/microblaze/petalogix_ml605_mmu.c
index 159db6cbe2..a24fadddca 100644
--- a/hw/microblaze/petalogix_ml605_mmu.c
+++ b/hw/microblaze/petalogix_ml605_mmu.c
@@ -183,7 +183,7 @@ petalogix_ml605_init(MachineState *machine)
 spi = (SSIBus *)qdev_get_child_bus(dev, "spi");
 
 for (i = 0; i < NUM_SPI_FLASHES; i++) {
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, i);
 qemu_irq cs_line;
 
 dev = qdev_new("n25q128");
-- 
2.31.1




[PATCH v2 13/13] blockdev: Drop unused drive_get_next()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

The previous commits eliminated all uses.  Drop the function.

Cc: Kevin Wolf 
Cc: Hanna Reitz 
Signed-off-by: Markus Armbruster 
---
 include/sysemu/blockdev.h |  1 -
 blockdev.c| 10 --
 2 files changed, 11 deletions(-)

diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index 32c2d6023c..a750f99b79 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -50,7 +50,6 @@ void drive_check_orphaned(void);
 DriveInfo *drive_get_by_index(BlockInterfaceType type, int index);
 int drive_get_max_bus(BlockInterfaceType type);
 int drive_get_max_devs(BlockInterfaceType type);
-DriveInfo *drive_get_next(BlockInterfaceType type);
 
 QemuOpts *drive_def(const char *optstr);
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
diff --git a/blockdev.c b/blockdev.c
index b35072644e..0eb2823b1b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -303,16 +303,6 @@ int drive_get_max_bus(BlockInterfaceType type)
 return max_bus;
 }
 
-/* Get a block device.  This should only be used for single-drive devices
-   (e.g. SD/Floppy/MTD).  Multi-disk devices (scsi/ide) should use the
-   appropriate bus.  */
-DriveInfo *drive_get_next(BlockInterfaceType type)
-{
-static int next_block_unit[IF_COUNT];
-
-return drive_get(type, 0, next_block_unit[type]++);
-}
-
 static void bdrv_format_print(void *opaque, const char *name)
 {
 qemu_printf(" %s", name);
-- 
2.31.1




[PATCH v2 11/13] hw/arm/xilinx_zynq: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "xlnx-zcu102" connects backends with drive_get_next() in two
counting loops, one of them in a helper function.  Change it to use
drive_get() directly.  This makes the unit numbers explicit in the
code.

Cc: "Edgar E. Iglesias" 
Cc: Alistair Francis 
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/xilinx_zynq.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 69c333e91b..50e7268396 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -125,9 +125,10 @@ static void gem_init(NICInfo *nd, uint32_t base, qemu_irq 
irq)
 sysbus_connect_irq(s, 0, irq);
 }
 
-static inline void zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq,
- bool is_qspi)
+static inline int zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq,
+bool is_qspi, int unit0)
 {
+int unit = unit0;
 DeviceState *dev;
 SysBusDevice *busdev;
 SSIBus *spi;
@@ -156,7 +157,7 @@ static inline void zynq_init_spi_flashes(uint32_t 
base_addr, qemu_irq irq,
 spi = (SSIBus *)qdev_get_child_bus(dev, bus_name);
 
 for (j = 0; j < num_ss; ++j) {
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, unit++);
 flash_dev = qdev_new("n25q128");
 if (dinfo) {
 qdev_prop_set_drive_err(flash_dev, "drive",
@@ -170,6 +171,7 @@ static inline void zynq_init_spi_flashes(uint32_t 
base_addr, qemu_irq irq,
 }
 }
 
+return unit;
 }
 
 static void zynq_init(MachineState *machine)
@@ -247,9 +249,9 @@ static void zynq_init(MachineState *machine)
 pic[n] = qdev_get_gpio_in(dev, n);
 }
 
-zynq_init_spi_flashes(0xE0006000, pic[58-IRQ_OFFSET], false);
-zynq_init_spi_flashes(0xE0007000, pic[81-IRQ_OFFSET], false);
-zynq_init_spi_flashes(0xE000D000, pic[51-IRQ_OFFSET], true);
+n = zynq_init_spi_flashes(0xE0006000, pic[58 - IRQ_OFFSET], false, 0);
+n = zynq_init_spi_flashes(0xE0007000, pic[81 - IRQ_OFFSET], false, n);
+n = zynq_init_spi_flashes(0xE000D000, pic[51 - IRQ_OFFSET], true, n);
 
 sysbus_create_simple(TYPE_CHIPIDEA, 0xE0002000, pic[53 - IRQ_OFFSET]);
 sysbus_create_simple(TYPE_CHIPIDEA, 0xE0003000, pic[76 - IRQ_OFFSET]);
@@ -298,7 +300,7 @@ static void zynq_init(MachineState *machine)
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, hci_addr);
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[hci_irq - IRQ_OFFSET]);
 
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, n);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 carddev = qdev_new(TYPE_SD_CARD);
 qdev_prop_set_drive_err(carddev, "drive", blk, &error_fatal);
-- 
2.31.1




[PATCH v2 06/13] hw/arm/mcimx6ul-evk: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "mcimx6ul-evk" connects backends with drive_get_next() in a
counting loop.  Change it to use drive_get() directly.  This makes the
unit numbers explicit in the code.

Cc: Peter Maydell 
Cc: Jean-Christophe Dubois 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/mcimx6ul-evk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/mcimx6ul-evk.c b/hw/arm/mcimx6ul-evk.c
index 77fae874b1..28b4886f48 100644
--- a/hw/arm/mcimx6ul-evk.c
+++ b/hw/arm/mcimx6ul-evk.c
@@ -52,7 +52,7 @@ static void mcimx6ul_evk_init(MachineState *machine)
 DriveInfo *di;
 BlockBackend *blk;
 
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, i);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(&s->usdhc[i]), "sd-bus");
 carddev = qdev_new(TYPE_SD_CARD);
-- 
2.31.1




[PATCH v2 07/13] hw/arm/mcimx7d-sabre: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

Machine "mcimx7d-sabre" connects backends with drive_get_next() in a
counting loop.  Change it to use drive_get() directly.  This makes the
unit numbers explicit in the code.

Cc: Peter Maydell 
Cc: Andrey Smirnov 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/mcimx7d-sabre.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c
index 935d4b0f1c..50a5ecde31 100644
--- a/hw/arm/mcimx7d-sabre.c
+++ b/hw/arm/mcimx7d-sabre.c
@@ -52,7 +52,7 @@ static void mcimx7d_sabre_init(MachineState *machine)
 DriveInfo *di;
 BlockBackend *blk;
 
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, i);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(&s->usdhc[i]), "sd-bus");
 carddev = qdev_new(TYPE_SD_CARD);
-- 
2.31.1




[PATCH v2 02/13] hw: Replace trivial drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

A number of machines connect just one backend with drive_get_next().
Change them to use drive_get() directly.  This makes the (zero) unit
number explicit in the code.

Cc: Beniamino Galvani 
Cc: Peter Maydell 
Cc: Subbaraya Sundeep 
Cc: Niek Linnenbank 
Cc: Andrew Baumann 
Cc: "Philippe Mathieu-Daudé" 
Cc: Jean-Christophe Dubois 
Cc: Alistair Francis 
Cc: Bin Meng 
Cc: Palmer Dabbelt 
Cc: Artyom Tarasenko 
Cc: Mark Cave-Ayland 
Cc: Michael Tokarev 
Cc: Laurent Vivier 
Cc: qemu-...@nongnu.org
Cc: qemu-ri...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/cubieboard.c| 2 +-
 hw/arm/integratorcp.c  | 2 +-
 hw/arm/msf2-som.c  | 2 +-
 hw/arm/orangepi.c  | 2 +-
 hw/arm/raspi.c | 2 +-
 hw/arm/realview.c  | 2 +-
 hw/arm/sabrelite.c | 2 +-
 hw/misc/sifive_u_otp.c | 2 +-
 hw/riscv/microchip_pfsoc.c | 2 +-
 hw/riscv/sifive_u.c| 2 +-
 hw/sparc64/niagara.c   | 2 +-
 11 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index 294ba5de6e..5e3372a3c7 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -81,7 +81,7 @@ static void cubieboard_init(MachineState *machine)
 }
 
 /* Retrieve SD bus */
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, 0);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(a10), "sd-bus");
 
diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
index 16e8985953..b109ece3ae 100644
--- a/hw/arm/integratorcp.c
+++ b/hw/arm/integratorcp.c
@@ -649,7 +649,7 @@ static void integratorcp_init(MachineState *machine)
   qdev_get_gpio_in_named(icp, ICP_GPIO_MMC_WPROT, 0));
 qdev_connect_gpio_out_named(dev, "card-inserted", 0,
   qdev_get_gpio_in_named(icp, ICP_GPIO_MMC_CARDIN, 0));
-dinfo = drive_get_next(IF_SD);
+dinfo = drive_get(IF_SD, 0, 0);
 if (dinfo) {
 DeviceState *card;
 
diff --git a/hw/arm/msf2-som.c b/hw/arm/msf2-som.c
index 396e8b9913..d9f881690e 100644
--- a/hw/arm/msf2-som.c
+++ b/hw/arm/msf2-som.c
@@ -45,7 +45,7 @@ static void emcraft_sf2_s2s010_init(MachineState *machine)
 DeviceState *spi_flash;
 MSF2State *soc;
 MachineClass *mc = MACHINE_GET_CLASS(machine);
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, 0);
 qemu_irq cs_line;
 BusState *spi_bus;
 MemoryRegion *sysmem = get_system_memory();
diff --git a/hw/arm/orangepi.c b/hw/arm/orangepi.c
index 0cf9895ce7..e796382236 100644
--- a/hw/arm/orangepi.c
+++ b/hw/arm/orangepi.c
@@ -85,7 +85,7 @@ static void orangepi_init(MachineState *machine)
 qdev_realize(DEVICE(h3), NULL, &error_abort);
 
 /* Retrieve SD bus */
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, 0);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(h3), "sd-bus");
 
diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
index 146d35382b..b4dd6c1e99 100644
--- a/hw/arm/raspi.c
+++ b/hw/arm/raspi.c
@@ -284,7 +284,7 @@ static void raspi_machine_init(MachineState *machine)
 qdev_realize(DEVICE(&s->soc), NULL, &error_fatal);
 
 /* Create and plug in the SD cards */
-di = drive_get_next(IF_SD);
+di = drive_get(IF_SD, 0, 0);
 blk = di ? blk_by_legacy_dinfo(di) : NULL;
 bus = qdev_get_child_bus(DEVICE(&s->soc), "sd-bus");
 if (bus == NULL) {
diff --git a/hw/arm/realview.c b/hw/arm/realview.c
index 1c54316ba3..ddc70b54a5 100644
--- a/hw/arm/realview.c
+++ b/hw/arm/realview.c
@@ -237,7 +237,7 @@ static void realview_init(MachineState *machine,
 qemu_irq_invert(qdev_get_gpio_in(gpio2, 0)));
 qdev_connect_gpio_out_named(dev, "card-read-only", 0, mmc_irq[0]);
 qdev_connect_gpio_out_named(dev, "card-inserted", 0, mmc_irq[1]);
-dinfo = drive_get_next(IF_SD);
+dinfo = drive_get(IF_SD, 0, 0);
 if (dinfo) {
 DeviceState *card;
 
diff --git a/hw/arm/sabrelite.c b/hw/arm/sabrelite.c
index 553608e583..cce49aa25c 100644
--- a/hw/arm/sabrelite.c
+++ b/hw/arm/sabrelite.c
@@ -76,7 +76,7 @@ static void sabrelite_init(MachineState *machine)
 if (spi_bus) {
 DeviceState *flash_dev;
 qemu_irq cs_line;
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, 0);
 
 flash_dev = qdev_new("sst25vf016b");
 if (dinfo) {
diff --git a/hw/misc/sifive_u_otp.c b/hw/misc/sifive_u_otp.c
index 18aa0bd55d..5d5a8c8a90 100644
--- a/hw/misc/sifive_u_otp.c
+++ b/hw/m

[PATCH v2 12/13] hw/arm/aspeed: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
drive_get_next() is basically a bad idea.  It returns the "next" block
backend of a certain interface type.  "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.

This lets you define unit numbers implicitly by execution order.  If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break.  Hard to spot in review.

The aspeed machines connects backends with drive_get_next() in several
counting loops, one of them in a helper function, and a conditional.
Change it to use drive_get() directly.  This makes the unit numbers
explicit in the code.

Cc: "Cédric Le Goater" 
Cc: Peter Maydell 
Cc: Andrew Jeffery 
Cc: Joel Stanley 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/aspeed.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index a77f46b3ad..cf20ae0db5 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -284,12 +284,13 @@ static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, 
size_t rom_size,
 }
 
 static void aspeed_board_init_flashes(AspeedSMCState *s,
-  const char *flashtype)
+  const char *flashtype,
+  int unit0)
 {
 int i ;
 
 for (i = 0; i < s->num_cs; ++i) {
-DriveInfo *dinfo = drive_get_next(IF_MTD);
+DriveInfo *dinfo = drive_get(IF_MTD, 0, unit0 + i);
 qemu_irq cs_line;
 DeviceState *dev;
 
@@ -382,10 +383,12 @@ static void aspeed_machine_init(MachineState *machine)
   "max_ram", max_ram_size  - machine->ram_size);
 memory_region_add_subregion(&bmc->ram_container, machine->ram_size, 
&bmc->max_ram);
 
-aspeed_board_init_flashes(&bmc->soc.fmc, bmc->fmc_model ?
-  bmc->fmc_model : amc->fmc_model);
-aspeed_board_init_flashes(&bmc->soc.spi[0], bmc->spi_model ?
-  bmc->spi_model : amc->spi_model);
+aspeed_board_init_flashes(&bmc->soc.fmc,
+  bmc->fmc_model ? bmc->fmc_model : amc->fmc_model,
+  0);
+aspeed_board_init_flashes(&bmc->soc.spi[0],
+  bmc->spi_model ? bmc->spi_model : amc->spi_model,
+  bmc->soc.fmc.num_cs);
 
 /* Install first FMC flash content as a boot rom. */
 if (drive0) {
@@ -435,11 +438,13 @@ static void aspeed_machine_init(MachineState *machine)
 }
 
 for (i = 0; i < bmc->soc.sdhci.num_slots; i++) {
-sdhci_attach_drive(&bmc->soc.sdhci.slots[i], drive_get_next(IF_SD));
+sdhci_attach_drive(&bmc->soc.sdhci.slots[i],
+   drive_get(IF_SD, 0, i));
 }
 
 if (bmc->soc.emmc.num_slots) {
-sdhci_attach_drive(&bmc->soc.emmc.slots[0], drive_get_next(IF_SD));
+sdhci_attach_drive(&bmc->soc.emmc.slots[0],
+   drive_get(IF_SD, 0, bmc->soc.sdhci.num_slots));
 }
 
 arm_load_kernel(ARM_CPU(first_cpu), machine, &aspeed_board_binfo);
-- 
2.31.1




[PATCH for-6.2 1/2] nbd/server: Don't complain on certain client disconnects

2021-11-17 Thread Eric Blake
When a client disconnects abruptly, but did not have any pending
requests (for example, when using nbdsh without calling h.shutdown),
we used to output the following message:

$ qemu-nbd -f raw file
$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected 
end-of-file before all bytes were read

Then in commit f148ae7, we refactored nbd_receive_request() to use
nbd_read_eof(); when this returns 0, we regressed into tracing
uninitialized memory (if tracing is enabled) and reporting a
less-specific:

qemu-nbd: Disconnect client, due to: Request handling failed in intermediate 
state

Note that with Unix sockets, we have yet another error message,
unchanged by the 6.0 regression:

$ qemu-nbd -k /tmp/sock -f raw file
$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to 
socket: Broken pipe

But in all cases, the error message goes away if the client performs a
soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by
abrupt disconnect:

$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()'

This patch fixes things to avoid uninitialized memory, and in general
avoids warning about a client that does a hard shutdown when not in
the middle of a packet.  A client that aborts mid-request, or which
does not read the full server's reply, can still result in warnings,
but those are indeed much more unusual situations.

CC: qemu-sta...@nongnu.org
Fixes: f148ae7d36 (nbd/server: Quiesce coroutines on context switch)
Signed-off-by: Eric Blake 
---
 nbd/server.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index d9164ee6d0da..85877f630533 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1418,6 +1418,9 @@ static int nbd_receive_request(NBDClient *client, 
NBDRequest *request,
 if (ret < 0) {
 return ret;
 }
+if (ret == 0) {
+return -EIO;
+}

 /* Request
[ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
@@ -2285,7 +2288,7 @@ static int nbd_co_receive_request(NBDRequestData *req, 
NBDRequest *request,
 assert(client->recv_coroutine == qemu_coroutine_self());
 ret = nbd_receive_request(client, request, errp);
 if (ret < 0) {
-return  ret;
+return ret;
 }

 trace_nbd_co_receive_request_decode_type(request->handle, request->type,
@@ -2662,7 +2665,7 @@ static coroutine_fn void nbd_trip(void *opaque)
 }

 if (ret < 0) {
-/* It wans't -EIO, so, according to nbd_co_receive_request()
+/* It wasn't -EIO, so, according to nbd_co_receive_request()
  * semantics, we should return the error to the client. */
 Error *export_err = local_err;

-- 
2.33.1




Re: [PATCH v2 03/13] hw/arm/npcm7xx_boards: Replace drive_get_next() by drive_get()

2021-11-17 Thread Havard Skinnemoen
On Wed, Nov 17, 2021 at 8:34 AM Markus Armbruster  wrote:
>
> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If the
> order changes, or new calls appear "in the middle", unit numbers change.
> ABI break.  Hard to spot in review.
>
> Machine "quanta-gbs-bmc" connects just one backend with
> drive_get_next(), but with a helper function.  Change it to use
> drive_get() directly.  This makes the unit numbers explicit in the
> code.
>
> Cc: Havard Skinnemoen 
> Cc: Tyrone Ting 
> Cc: Peter Maydell 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 
> ---
>  hw/arm/npcm7xx_boards.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
> index dec7d16ae5..d8a49e4e85 100644
> --- a/hw/arm/npcm7xx_boards.c
> +++ b/hw/arm/npcm7xx_boards.c
> @@ -84,9 +84,9 @@ static void npcm7xx_connect_dram(NPCM7xxState *soc, 
> MemoryRegion *dram)
>   &error_abort);
>  }
>
> -static void sdhci_attach_drive(SDHCIState *sdhci)
> +static void sdhci_attach_drive(SDHCIState *sdhci, int unit)
>  {
> -DriveInfo *di = drive_get_next(IF_SD);
> +DriveInfo *di = drive_get(IF_SD, 0, unit);

+Hao Wu IIRC the chip has separate SD and eMMC buses. Would it make
sense to take the bus number as a parameter as well? Is bus 0 the
right one to use in this case?

The existing code always uses bus 0, so this is an improvement either way.

Reviewed-by: Havard Skinnemoen 

>  BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
>
>  BusState *bus = qdev_get_child_bus(DEVICE(sdhci), "sd-bus");
> @@ -374,7 +374,7 @@ static void quanta_gbs_init(MachineState *machine)
>drive_get(IF_MTD, 0, 0));
>
>  quanta_gbs_i2c_init(soc);
> -sdhci_attach_drive(&soc->mmc.sdhci);
> +sdhci_attach_drive(&soc->mmc.sdhci, 0);
>  npcm7xx_load_kernel(machine, soc);
>  }
>
> --
> 2.31.1
>



[PATCH for-6.2? 2/2] nbd/server: Simplify zero and trim

2021-11-17 Thread Eric Blake
Now that the block layer supports 64-bit operations, we no longer have
to self-fragment requests larger than 2G, reverting the workaround
added in 890cbccb08 (nbd: Fix large trim/zero requests).

Signed-off-by: Eric Blake 
---
 nbd/server.c | 23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 85877f630533..1b3945220bd3 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2509,16 +2509,8 @@ static coroutine_fn int nbd_handle_request(NBDClient 
*client,
 if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
 flags |= BDRV_REQ_NO_FALLBACK;
 }
-ret = 0;
-/* FIXME simplify this when blk_pwrite_zeroes switches to 64-bit */
-while (ret >= 0 && request->len) {
-int align = client->check_align ?: 1;
-int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
-align));
-ret = blk_pwrite_zeroes(exp->common.blk, request->from, len, 
flags);
-request->len -= len;
-request->from += len;
-}
+ret = blk_pwrite_zeroes(exp->common.blk, request->from, request->len,
+flags);
 return nbd_send_generic_reply(client, request->handle, ret,
   "writing to file failed", errp);

@@ -2532,16 +2524,7 @@ static coroutine_fn int nbd_handle_request(NBDClient 
*client,
   "flush failed", errp);

 case NBD_CMD_TRIM:
-ret = 0;
-/* FIXME simplify this when blk_co_pdiscard switches to 64-bit */
-while (ret >= 0 && request->len) {
-int align = client->check_align ?: 1;
-int len = MIN(request->len, QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
-align));
-ret = blk_co_pdiscard(exp->common.blk, request->from, len);
-request->len -= len;
-request->from += len;
-}
+ret = blk_co_pdiscard(exp->common.blk, request->from, request->len);
 if (ret >= 0 && request->flags & NBD_CMD_FLAG_FUA) {
 ret = blk_co_flush(exp->common.blk);
 }
-- 
2.33.1




[PATCH for-6.2 0/2] NBD 6.2-rc fixes

2021-11-17 Thread Eric Blake
Back in September, Rich proposed a patch to silence an EPIPE message
from qemu-nbd when used with Unix sockets:
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg03359.html

But investigating that further, I found that we had a different
message with TCP sockets, and that we regressed in qemu 6.0 with
regards to the message we print due to the use of uninitialized
memory.  Fixing the uninitialized memory use happens to also silence
the message that Rich was seeing, but without needing to special-case
EPIPE.

I also noticed that even though commit 2800637a and friends made the
block layer support 64-bit zero/trim, we are still manually splitting
3G requests in the NBD driver.  Patch 2 fixes that, although I'm less
certain whether it counts as 6.2-rc material since it is merely a
minor performance tweak to a feature new to 6.2, rather than a
regression fix.

Eric Blake (2):
  nbd/server: Don't complain on certain client disconnects
  nbd/server: Simplify zero and trim

 nbd/server.c | 30 --
 1 file changed, 8 insertions(+), 22 deletions(-)

-- 
2.33.1




Re: [PATCH v2 03/13] hw/arm/npcm7xx_boards: Replace drive_get_next() by drive_get()

2021-11-17 Thread Hao Wu
Yes, there's SD and MMC buses. It looks like the current code only supports
mmc ("soc->mmc.sdhci") but not the sd ("soc->sd.sdhci").

It's probably good to make the bus number a parameter as well and use them
to distinguish. We might need a separate patch to do that.

On Wed, Nov 17, 2021 at 8:54 AM Havard Skinnemoen 
wrote:

> On Wed, Nov 17, 2021 at 8:34 AM Markus Armbruster 
> wrote:
> >
> > drive_get_next() is basically a bad idea.  It returns the "next" block
> > backend of a certain interface type.  "Next" means bus=0,unit=N, where
> > subsequent calls count N up from zero, per interface type.
> >
> > This lets you define unit numbers implicitly by execution order.  If the
> > order changes, or new calls appear "in the middle", unit numbers change.
> > ABI break.  Hard to spot in review.
> >
> > Machine "quanta-gbs-bmc" connects just one backend with
> > drive_get_next(), but with a helper function.  Change it to use
> > drive_get() directly.  This makes the unit numbers explicit in the
> > code.
> >
> > Cc: Havard Skinnemoen 
> > Cc: Tyrone Ting 
> > Cc: Peter Maydell 
> > Cc: qemu-...@nongnu.org
> > Signed-off-by: Markus Armbruster 
> > ---
> >  hw/arm/npcm7xx_boards.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
> > index dec7d16ae5..d8a49e4e85 100644
> > --- a/hw/arm/npcm7xx_boards.c
> > +++ b/hw/arm/npcm7xx_boards.c
> > @@ -84,9 +84,9 @@ static void npcm7xx_connect_dram(NPCM7xxState *soc,
> MemoryRegion *dram)
> >   &error_abort);
> >  }
> >
> > -static void sdhci_attach_drive(SDHCIState *sdhci)
> > +static void sdhci_attach_drive(SDHCIState *sdhci, int unit)
> >  {
> > -DriveInfo *di = drive_get_next(IF_SD);
> > +DriveInfo *di = drive_get(IF_SD, 0, unit);
>
> +Hao Wu IIRC the chip has separate SD and eMMC buses. Would it make
> sense to take the bus number as a parameter as well? Is bus 0 the
> right one to use in this case?
>
> The existing code always uses bus 0, so this is an improvement either way.
>
> Reviewed-by: Havard Skinnemoen 
>
> >  BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
> >
> >  BusState *bus = qdev_get_child_bus(DEVICE(sdhci), "sd-bus");
> > @@ -374,7 +374,7 @@ static void quanta_gbs_init(MachineState *machine)
> >drive_get(IF_MTD, 0, 0));
> >
> >  quanta_gbs_i2c_init(soc);
> > -sdhci_attach_drive(&soc->mmc.sdhci);
> > +sdhci_attach_drive(&soc->mmc.sdhci, 0);
> >  npcm7xx_load_kernel(machine, soc);
> >  }
> >
> > --
> > 2.31.1
> >
>


Re: [PULL 0/5] Python patches

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 4:42 AM Gerd Hoffmann  wrote:

>   Hi,
>
> >   https://gitlab.com/jsnow/qemu.git tags/python-pull-request
>
> What is the status of the plan to upload this to pypi eventually?
>
>
Thanks for asking!

The honest answer is "I'm not exactly sure", but there are a few things to
work out still. Let me use this as an opportunity to try and give you an
honest answer.
We've got four packages right now: qmp, aqmp, machine and utils.

- I don't intend to *ever* upload utils, I created that one specifically as
an in-tree package for "low quality" code that we just need as glue.
- aqmp is brand new. It was moved as the default provider for the QMP
protocol in the tree (being used by machine.py) only two weeks ago. I am
using this current RC testing phase to find any problems with it.
- qmp is something I want to deprecate, I don't intend to upload it to
PyPI. I intend to rename aqmp -> qmp and have just the one qmp package. I
can't do this until next release, and only after we are confident and happy
that aqmp is stable enough.
- machine has a few problems with it. I am reluctant to upload it in its
current form. I am actively developing a new version of it that uses the
new Async QMP module. However, this might take a bit of time, I fear.

So, I think I have this timeline for myself:

- Fix bugs in AQMP package revealed during RC testing
- Introduce sync wrapper for AQMP that resembles the native AQMP interface
more than it resembles the "legacy QMP" interface.
- Remove all QEMU source tree uses of qemu.qmp and qemu.aqmp.legacy.
- Delete qemu.qmp and rename qemu.aqmp to qemu.qmp.
- Split python/qemu/qmp out into its own repository and begin uploading it
to PyPI, as a test. (Do not delete python/qemu/qmp yet at this phase.)
- Transition any users of the Python packages in the QEMU source tree to
installing the QMP dependency from PyPI instead of grabbing it from the
tree.
- Delete python/qemu/qmp from the QEMU source tree at this moment;
"re-fork" the package if necessary to collect any commits since the "test
split" procedure.


Some questions to work out:
- What tools should be uploaded with qemu.qmp? a version of qmp-shell is
high on the list for me. qom, qom-set, qom-get, qom-list, qom-tree,
qom-fuse etc I am suspecting might be better left behind in qemu.utils
instead, though. I am not sure I want to support those more broadly. They
weren't designed for "external consumption".
- qemu-ga-client should be moved over into utils, or possibly even deleted
-- it hasn't seen a lot of love and I doubt there are any users. I don't
have the bandwidth to refurbish it for no users. Maybe if there's a demand
in the future ...


... This might be being overcautious, though. Perhaps I can upload a
version of "qemu.aqmp" even this week just as a demonstration of how it
would work.

Happy to take suggestions/feedback on this process.

--js


Re: [PATCH for-6.2 1/2] nbd/server: Don't complain on certain client disconnects

2021-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2021 20:02, Eric Blake wrote:

When a client disconnects abruptly, but did not have any pending
requests (for example, when using nbdsh without calling h.shutdown),
we used to output the following message:

$ qemu-nbd -f raw file
$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected 
end-of-file before all bytes were read

Then in commit f148ae7, we refactored nbd_receive_request() to use
nbd_read_eof(); when this returns 0, we regressed into tracing
uninitialized memory (if tracing is enabled) and reporting a
less-specific:

qemu-nbd: Disconnect client, due to: Request handling failed in intermediate 
state

Note that with Unix sockets, we have yet another error message,
unchanged by the 6.0 regression:

$ qemu-nbd -k /tmp/sock -f raw file
$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to 
socket: Broken pipe

But in all cases, the error message goes away if the client performs a
soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by
abrupt disconnect:

$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()'

This patch fixes things to avoid uninitialized memory, and in general
avoids warning about a client that does a hard shutdown when not in
the middle of a packet.  A client that aborts mid-request, or which
does not read the full server's reply, can still result in warnings,
but those are indeed much more unusual situations.

CC: qemu-sta...@nongnu.org
Fixes: f148ae7d36 (nbd/server: Quiesce coroutines on context switch)
Signed-off-by: Eric Blake 
---
  nbd/server.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index d9164ee6d0da..85877f630533 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1418,6 +1418,9 @@ static int nbd_receive_request(NBDClient *client, 
NBDRequest *request,
  if (ret < 0) {
  return ret;
  }
+if (ret == 0) {
+return -EIO;
+}


Reviewed-by: Vladimir Sementsov-Ogievskiy 

I'd prefer not include following hunks to the patch, as they are unrelated.



  /* Request
 [ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
@@ -2285,7 +2288,7 @@ static int nbd_co_receive_request(NBDRequestData *req, 
NBDRequest *request,
  assert(client->recv_coroutine == qemu_coroutine_self());
  ret = nbd_receive_request(client, request, errp);
  if (ret < 0) {
-return  ret;
+return ret;
  }

  trace_nbd_co_receive_request_decode_type(request->handle, request->type,
@@ -2662,7 +2665,7 @@ static coroutine_fn void nbd_trip(void *opaque)
  }

  if (ret < 0) {
-/* It wans't -EIO, so, according to nbd_co_receive_request()
+/* It wasn't -EIO, so, according to nbd_co_receive_request()
   * semantics, we should return the error to the client. */
  Error *export_err = local_err;




--
Best regards,
Vladimir



Re: [PATCH for-6.2? 2/2] nbd/server: Simplify zero and trim

2021-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2021 20:02, Eric Blake wrote:

Now that the block layer supports 64-bit operations, we no longer have
to self-fragment requests larger than 2G, reverting the workaround
added in 890cbccb08 (nbd: Fix large trim/zero requests).

Signed-off-by: Eric Blake


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: Failing QEMU iotests

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 7:50 AM Thomas Huth  wrote:

> On 17/11/2021 11.59, Hanna Reitz wrote:
> > On 17.11.21 11:07, Thomas Huth wrote:
>
> >> +++ 297.out.bad
> >> @@ -1,2 +1,21 @@
> >>  === pylint ===
> >> +* Module image-fleecing
> >> +tests/image-fleecing:34:24: C0326: Exactly one space required after
> comma
> >> +patterns = [('0x5d', '0', '64k'),
> >> +^ (bad-whitespace)
> >> +tests/image-fleecing:35:25: C0326: Exactly one space required after
> comma
> >> +('0xd5', '1M','64k'),
> >> + ^ (bad-whitespace)
> >> +tests/image-fleecing:36:26: C0326: Exactly one space required after
> comma
> >> +('0xdc', '32M',   '64k'),
> >> +  ^ (bad-whitespace)
> >> +tests/image-fleecing:39:25: C0326: Exactly one space required after
> comma
> >> +overwrite = [('0xab', '0', '64k'), # Full overwrite
> >> + ^ (bad-whitespace)
> >> +tests/image-fleecing:48:32: C0326: Exactly one space required after
> comma
> >> +remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left
> [1]
> >> +^ (bad-whitespace)
> >> +tests/image-fleecing:49:27: C0326: Exactly one space required after
> comma
> >> + ('0xdc', '32M',   '32k'), # Left-end of partial-right
> [2]
> >> +   ^ (bad-whitespace)
> >
> > This could be because your pylint is too old.  At least for the python/
> > tests we at least require 2.8.0
> > (https://lists.nongnu.org/archive/html/qemu-block/2021-10/msg00768.html)
> and
> > bad-whitespace was removed in 2.6.
>
> Thanks, updating pylint fixed this problem, indeed!
>
> But maybe the iotests should check the pylint version before using it?
>
>
Ideally, yes ...  sorry, it's been a lot of work to try and get the python
testing for this stuff in order.

FWIW, the GitLab CI jobs for check-python-pipenv and check-python-tox now
basically run "iotest 297", and those jobs will use virtual environments to
force a supportable version of pylint/mypy/etc. These targets are the ones
I put the most effort into, and those are the ones that "just work".

It's on my list to, one way or another, drop 297 and use the python testing
infra to cover this instead, but I have some ground to cover for
usability/convenience before I can pitch it.

(At the risk of sounding like I am task offloading, if you send a patch to
add version checking to 297, I can review it.)

--js


Re: Failing QEMU iotests

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 5:07 AM Thomas Huth  wrote:

>
>   Hi!
>
> I think it has been working fine for me a couple of weeks ago,
> but when I now run:
>
>   make check SPEED=slow
>
> I'm getting a couple of failing iotests... not sure whether
> these are known issues already, so I thought I'd summarize them
> here:
>
> *** First one is 045 in raw mode: ***
>
>   TEST   iotest-raw: 045 [fail]
> QEMU  --
> "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64"
> -nodefaults -display none -accel qtest
> QEMU_IMG  --
> "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
> QEMU_IO   --
> "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" --cache
> writeback --aio threads -f raw
> QEMU_NBD  --
> "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
> IMGFMT-- raw
> IMGPROTO  -- file
> PLATFORM  -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
> TEST_DIR  -- /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
> SOCK_DIR  -- /tmp/tmphlexdrlt
> GDB_OPTIONS   --
> VALGRIND_QEMU --
> PRINT_QEMU_OUTPUT --
>
> --- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
> +++ 045.out.bad
> @@ -1,5 +1,77 @@
> -...
> +..EE.EE
> +==
> +ERROR: test_add_fd (__main__.TestSCMFd)
> +--
> +Traceback (most recent call last):
> +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in
> test_add_fd
> +self._send_fd_by_SCM()
> +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in
> _send_fd_by_SCM
> +ret = self.vm.send_fd_scm(file_path=image0)
> +  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line 229,
> in send_fd_scm
> +self._qmp.send_fd_scm(fd)
> +  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138, in
> send_fd_scm
> +self._aqmp.send_fd_scm(fd)
> +  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 149,
> in _wrapper
> +return func(proto, *args, **kwargs)
> +  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line 644,
> in send_fd_scm
> +sock = sock._sock  # pylint: disable=protected-access
> +AttributeError: 'socket' object has no attribute '_sock'
>

Well, that's not good.

Can you tell me some details about what system produced this failure?
The python version used to run the test would be good, as well as distro
release, kernel version, etc.

If you can reproduce it, I might want to give you a test branch of the
python code to produce some extra debugging information to help me
understand what's gone wrong here. Get in touch on IRC when you have some
spare time if you'd like to interactively debug it.

--js


Re: [PULL 0/5] Python patches

2021-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2021 20:56, John Snow wrote:


On Wed, Nov 17, 2021 at 4:42 AM Gerd Hoffmann mailto:kra...@redhat.com>> wrote:

   Hi,

 > https://gitlab.com/jsnow/qemu.git  
tags/python-pull-request

What is the status of the plan to upload this to pypi eventually?


Thanks for asking!

The honest answer is "I'm not exactly sure", but there are a few things to work 
out still. Let me use this as an opportunity to try and give you an honest answer.
We've got four packages right now: qmp, aqmp, machine and utils.

- I don't intend to *ever* upload utils, I created that one specifically as an in-tree 
package for "low quality" code that we just need as glue.
- aqmp is brand new. It was moved as the default provider for the QMP protocol 
in the tree (being used by machine.py) only two weeks ago. I am using this 
current RC testing phase to find any problems with it.
- qmp is something I want to deprecate, I don't intend to upload it to PyPI. I 
intend to rename aqmp -> qmp and have just the one qmp package. I can't do this 
until next release, and only after we are confident and happy that aqmp is stable 
enough.
- machine has a few problems with it. I am reluctant to upload it in its 
current form. I am actively developing a new version of it that uses the new 
Async QMP module. However, this might take a bit of time, I fear.

So, I think I have this timeline for myself:

- Fix bugs in AQMP package revealed during RC testing
- Introduce sync wrapper for AQMP that resembles the native AQMP interface more than it 
resembles the "legacy QMP" interface.
- Remove all QEMU source tree uses of qemu.qmp and qemu.aqmp.legacy.
- Delete qemu.qmp and rename qemu.aqmp to qemu.qmp.
- Split python/qemu/qmp out into its own repository and begin uploading it to 
PyPI, as a test. (Do not delete python/qemu/qmp yet at this phase.)
- Transition any users of the Python packages in the QEMU source tree to 
installing the QMP dependency from PyPI instead of grabbing it from the tree.
- Delete python/qemu/qmp from the QEMU source tree at this moment; "re-fork" the package 
if necessary to collect any commits since the "test split" procedure.



That all sounds great!



Some questions to work out:
- What tools should be uploaded with qemu.qmp? a version of qmp-shell is high on the list 
for me. qom, qom-set, qom-get, qom-list, qom-tree, qom-fuse etc I am suspecting might be 
better left behind in qemu.utils instead, though. I am not sure I want to support those 
more broadly. They weren't designed for "external consumption".
- qemu-ga-client should be moved over into utils, or possibly even deleted -- 
it hasn't seen a lot of love and I doubt there are any users. I don't have the 
bandwidth to refurbish it for no users. Maybe if there's a demand in the future 
...


... This might be being overcautious, though. Perhaps I can upload a version of 
"qemu.aqmp" even this week just as a demonstration of how it would work.



Why do we need wait for next release for renaming aqmp -> qmp? Or what next 
release do you mean? I think you can rename it as soon as 6.3 development phase is 
open.

I'm not sure that's a good idea to upload qemu.aqmp to public and than rename 
it to qemu.qmp.. Maybe, you can upload it now as qemu.qmp? So, first, create a 
separate repo with aqmp (already renamed to qmp), upload it to PyPI (as qmp) - 
this all as a first step. And then gradually move Qemu to use this new repo 
instead its own qmp/aqmp.

--
Best regards,
Vladimir



Re: [PULL 0/5] Python patches

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 1:20 PM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> 17.11.2021 20:56, John Snow wrote:
> >
> > On Wed, Nov 17, 2021 at 4:42 AM Gerd Hoffmann  > wrote:
> >
> >Hi,
> >
> >  > https://gitlab.com/jsnow/qemu.git <
> https://gitlab.com/jsnow/qemu.git> tags/python-pull-request
> >
> > What is the status of the plan to upload this to pypi eventually?
> >
> >
> > Thanks for asking!
> >
> > The honest answer is "I'm not exactly sure", but there are a few things
> to work out still. Let me use this as an opportunity to try and give you an
> honest answer.
> > We've got four packages right now: qmp, aqmp, machine and utils.
> >
> > - I don't intend to *ever* upload utils, I created that one specifically
> as an in-tree package for "low quality" code that we just need as glue.
> > - aqmp is brand new. It was moved as the default provider for the QMP
> protocol in the tree (being used by machine.py) only two weeks ago. I am
> using this current RC testing phase to find any problems with it.
> > - qmp is something I want to deprecate, I don't intend to upload it to
> PyPI. I intend to rename aqmp -> qmp and have just the one qmp package. I
> can't do this until next release, and only after we are confident and happy
> that aqmp is stable enough.
> > - machine has a few problems with it. I am reluctant to upload it in its
> current form. I am actively developing a new version of it that uses the
> new Async QMP module. However, this might take a bit of time, I fear.
> >
> > So, I think I have this timeline for myself:
> >
> > - Fix bugs in AQMP package revealed during RC testing
> > - Introduce sync wrapper for AQMP that resembles the native AQMP
> interface more than it resembles the "legacy QMP" interface.
> > - Remove all QEMU source tree uses of qemu.qmp and qemu.aqmp.legacy.
> > - Delete qemu.qmp and rename qemu.aqmp to qemu.qmp.
> > - Split python/qemu/qmp out into its own repository and begin uploading
> it to PyPI, as a test. (Do not delete python/qemu/qmp yet at this phase.)
> > - Transition any users of the Python packages in the QEMU source tree to
> installing the QMP dependency from PyPI instead of grabbing it from the
> tree.
> > - Delete python/qemu/qmp from the QEMU source tree at this moment;
> "re-fork" the package if necessary to collect any commits since the "test
> split" procedure.
> >
>
> That all sounds great!
>
> >
> > Some questions to work out:
> > - What tools should be uploaded with qemu.qmp? a version of qmp-shell is
> high on the list for me. qom, qom-set, qom-get, qom-list, qom-tree,
> qom-fuse etc I am suspecting might be better left behind in qemu.utils
> instead, though. I am not sure I want to support those more broadly. They
> weren't designed for "external consumption".
> > - qemu-ga-client should be moved over into utils, or possibly even
> deleted -- it hasn't seen a lot of love and I doubt there are any users. I
> don't have the bandwidth to refurbish it for no users. Maybe if there's a
> demand in the future ...
> >
> >
> > ... This might be being overcautious, though. Perhaps I can upload a
> version of "qemu.aqmp" even this week just as a demonstration of how it
> would work.
> >
>
> Why do we need wait for next release for renaming aqmp -> qmp? Or what
> next release do you mean? I think you can rename it as soon as 6.3
> development phase is open.
>
>
I might be confused in my thinking because there's a ton of little tasks to
do, and I won't pretend I have thought extremely carefully about the
precise order in which they *have* to be done, only the order in which that
it occurs to me to do them. :)

I suppose I could do something like rename "qmp" to "legacy_qmp" in the
tree as an intermediate step and accomplish the "aqmp -> qmp" rename
sooner, but there's a lot of churn inherent to that. Since there's a lot of
churn inherent to moving users off of the old interface anyway, I figured
I'd just tackle it all at once... which I can't do until the tree re-opens
again.

I can certainly work on it in the meantime, though.

I'm not sure that's a good idea to upload qemu.aqmp to public and than
> rename it to qemu.qmp.. Maybe, you can upload it now as qemu.qmp? So,
> first, create a separate repo with aqmp (already renamed to qmp), upload it
> to PyPI (as qmp) - this all as a first step. And then gradually move Qemu
> to use this new repo instead its own qmp/aqmp.
>

I'm afraid of doing this because I don't want to pollute the 'qemu.qmp'
space with two packages that contain radically different things in it. It
feels safer to fully switch over the QEMU source tree first, and THEN
upload to PyPI. If I go out of order there, I worry that we will run into
circumstances where various scripts/tools will use "the wrong qemu.qmp",
and it will be frustrating for people without a lot of Python packaging
experience to diagnose on their own.

The safest thing is just to wait for me to do all the cleanup chur

Re: [PULL 0/5] Python patches

2021-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2021 22:07, John Snow wrote:



On Wed, Nov 17, 2021 at 1:20 PM Vladimir Sementsov-Ogievskiy mailto:vsement...@virtuozzo.com>> wrote:

17.11.2021 20:56, John Snow wrote:
 >
 > On Wed, Nov 17, 2021 at 4:42 AM Gerd Hoffmann mailto:kra...@redhat.com> >> 
wrote:
 >
 >        Hi,
 >
 >      > https://gitlab.com/jsnow/qemu.git  
> 
tags/python-pull-request
 >
 >     What is the status of the plan to upload this to pypi eventually?
 >
 >
 > Thanks for asking!
 >
 > The honest answer is "I'm not exactly sure", but there are a few things 
to work out still. Let me use this as an opportunity to try and give you an honest answer.
 > We've got four packages right now: qmp, aqmp, machine and utils.
 >
 > - I don't intend to *ever* upload utils, I created that one specifically as an 
in-tree package for "low quality" code that we just need as glue.
 > - aqmp is brand new. It was moved as the default provider for the QMP 
protocol in the tree (being used by machine.py) only two weeks ago. I am using 
this current RC testing phase to find any problems with it.
 > - qmp is something I want to deprecate, I don't intend to upload it to 
PyPI. I intend to rename aqmp -> qmp and have just the one qmp package. I can't do 
this until next release, and only after we are confident and happy that aqmp is 
stable enough.
 > - machine has a few problems with it. I am reluctant to upload it in its 
current form. I am actively developing a new version of it that uses the new Async 
QMP module. However, this might take a bit of time, I fear.
 >
 > So, I think I have this timeline for myself:
 >
 > - Fix bugs in AQMP package revealed during RC testing
 > - Introduce sync wrapper for AQMP that resembles the native AQMP interface more 
than it resembles the "legacy QMP" interface.
 > - Remove all QEMU source tree uses of qemu.qmp and qemu.aqmp.legacy.
 > - Delete qemu.qmp and rename qemu.aqmp to qemu.qmp.
 > - Split python/qemu/qmp out into its own repository and begin uploading 
it to PyPI, as a test. (Do not delete python/qemu/qmp yet at this phase.)
 > - Transition any users of the Python packages in the QEMU source tree to 
installing the QMP dependency from PyPI instead of grabbing it from the tree.
 > - Delete python/qemu/qmp from the QEMU source tree at this moment; "re-fork" the 
package if necessary to collect any commits since the "test split" procedure.
 >

That all sounds great!

 >
 > Some questions to work out:
 > - What tools should be uploaded with qemu.qmp? a version of qmp-shell is high on 
the list for me. qom, qom-set, qom-get, qom-list, qom-tree, qom-fuse etc I am suspecting 
might be better left behind in qemu.utils instead, though. I am not sure I want to support 
those more broadly. They weren't designed for "external consumption".
 > - qemu-ga-client should be moved over into utils, or possibly even 
deleted -- it hasn't seen a lot of love and I doubt there are any users. I don't 
have the bandwidth to refurbish it for no users. Maybe if there's a demand in the 
future ...
 >
 >
 > ... This might be being overcautious, though. Perhaps I can upload a version of 
"qemu.aqmp" even this week just as a demonstration of how it would work.
 >

Why do we need wait for next release for renaming aqmp -> qmp? Or what next 
release do you mean? I think you can rename it as soon as 6.3 development phase is 
open.


I might be confused in my thinking because there's a ton of little tasks to do, 
and I won't pretend I have thought extremely carefully about the precise order 
in which they *have* to be done, only the order in which that it occurs to me 
to do them. :)

I suppose I could do something like rename "qmp" to "legacy_qmp" in the tree as an 
intermediate step and accomplish the "aqmp -> qmp" rename sooner, but there's a lot of churn 
inherent to that. Since there's a lot of churn inherent to moving users off of the old interface anyway, I 
figured I'd just tackle it all at once... which I can't do until the tree re-opens again.

I can certainly work on it in the meantime, though.

I'm not sure that's a good idea to upload qemu.aqmp to public and than 
rename it to qemu.qmp.. Maybe, you can upload it now as qemu.qmp? So, first, 
create a separate repo with aqmp (already renamed to qmp), upload it to PyPI 
(as qmp) - this all as a first step. And then gradually move Qemu to use this 
new repo instead its own qmp/aqmp.


I'm afraid of doing this because I don't want to pollute the 'qemu.qmp' space with two 
packages that contain radically different things in it. It feels safer to fully switch 
over the QEMU source tree first, and THEN upload to PyPI. If I go out of order there, I 
worry that we wil

Re: [PATCH v2 01/13] hw/sd/ssi-sd: Do not create SD card within controller's realize

2021-11-17 Thread Philippe Mathieu-Daudé
Hi Markus, Peter,

On 11/17/21 17:33, Markus Armbruster wrote:
> ssi_sd_realize() creates an "sd-card" device.  This is inappropriate,
> and marked FIXME.
> 
> Move it to the boards that create these devices.  Prior art: commit
> eb4f566bbb for device "generic-sdhci", and commit 26c607b86b for
> device "pl181".
> 
> The device remains not user-creatable, because its users should (and
> do) wire up its GPIO chip-select line.
> 
> Cc: Peter Maydell 
> Cc: Alistair Francis 
> Cc: Bin Meng 
> Cc: Palmer Dabbelt 
> Cc: "Philippe Mathieu-Daudé" 
> Cc: qemu-...@nongnu.org
> Cc: qemu-ri...@nongnu.org
> Signed-off-by: Markus Armbruster 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/arm/stellaris.c  | 15 ++-
>  hw/riscv/sifive_u.c | 13 -
>  hw/sd/ssi-sd.c  | 29 +
>  3 files changed, 27 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
> index 78827ace6b..b6c8a5d609 100644
> --- a/hw/arm/stellaris.c
> +++ b/hw/arm/stellaris.c
> @@ -10,6 +10,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "hw/sysbus.h"
> +#include "hw/sd/sd.h"
>  #include "hw/ssi/ssi.h"
>  #include "hw/arm/boot.h"
>  #include "qemu/timer.h"
> @@ -1157,6 +1158,9 @@ static void stellaris_init(MachineState *ms, 
> stellaris_board_info *board)
>  void *bus;
>  DeviceState *sddev;
>  DeviceState *ssddev;
> +DriveInfo *dinfo;
> +DeviceState *carddev;
> +BlockBackend *blk;
>  
>  /*
>   * Some boards have both an OLED controller and SD card 
> connected to
> @@ -1221,8 +1225,17 @@ static void stellaris_init(MachineState *ms, 
> stellaris_board_info *board)
>   *  - Make the ssd0323 OLED controller chipselect active-low
>   */
>  bus = qdev_get_child_bus(dev, "ssi");
> -
>  sddev = ssi_create_peripheral(bus, "ssi-sd");
> +
> +dinfo = drive_get(IF_SD, 0, 0);
> +blk = dinfo ? blk_by_legacy_dinfo(dinfo) : NULL;
> +carddev = qdev_new(TYPE_SD_CARD);
> +qdev_prop_set_drive_err(carddev, "drive", blk, &error_fatal);

I guess a already asked this once but don't remember now... What is the
point of creating a SD card without drive? Is this due to the old design
issue we hotplug the drive to the SD card and not the SD card on the SD
bus?

> +qdev_prop_set_bit(carddev, "spi", true);
> +qdev_realize_and_unref(carddev,
> +   qdev_get_child_bus(sddev, "sd-bus"),
> +   &error_fatal);
> +



Re: Failing QEMU iotests

2021-11-17 Thread Thomas Huth

On 17/11/2021 19.13, John Snow wrote:



On Wed, Nov 17, 2021 at 5:07 AM Thomas Huth > wrote:



   Hi!

I think it has been working fine for me a couple of weeks ago,
but when I now run:

   make check SPEED=slow

I'm getting a couple of failing iotests... not sure whether
these are known issues already, so I thought I'd summarize them
here:

*** First one is 045 in raw mode: ***

   TEST   iotest-raw: 045 [fail]
QEMU          --
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64"
-nodefaults -display none -accel qtest
QEMU_IMG      --
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
QEMU_IO       --
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" --cache
writeback --aio threads -f raw
QEMU_NBD      --
"/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
IMGFMT        -- raw
IMGPROTO      -- file
PLATFORM      -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
TEST_DIR      -- /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR      -- /tmp/tmphlexdrlt
GDB_OPTIONS   --
VALGRIND_QEMU --
PRINT_QEMU_OUTPUT --

--- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
+++ 045.out.bad
@@ -1,5 +1,77 @@
-...
+..EE.EE 
+==
+ERROR: test_add_fd (__main__.TestSCMFd)
+--
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in
test_add_fd
+    self._send_fd_by_SCM()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in
_send_fd_by_SCM
+    ret = self.vm.send_fd_scm(file_path=image0)
+  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line
229, in send_fd_scm
+    self._qmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line 138,
in send_fd_scm
+    self._aqmp.send_fd_scm(fd)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line 149,
in _wrapper
+    return func(proto, *args, **kwargs)
+  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line
644, in send_fd_scm
+    sock = sock._sock  # pylint: disable=protected-access
+AttributeError: 'socket' object has no attribute '_sock'


Well, that's not good.

Can you tell me some details about what system produced this failure?
The python version used to run the test would be good, as well as distro 
release, kernel version, etc.


If you can reproduce it, I might want to give you a test branch of the 
python code to produce some extra debugging information to help me 
understand what's gone wrong here. Get in touch on IRC when you have some 
spare time if you'd like to interactively debug it.


As you likely saw in Hanna's mail a little bit later, the problem was the 
old version of pylint. I did still have version 2.2 installed - after 
upgrading, the problem went away.


 Thomas




Re: Failing QEMU iotests

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 2:45 PM Thomas Huth  wrote:

> On 17/11/2021 19.13, John Snow wrote:
> >
> >
> > On Wed, Nov 17, 2021 at 5:07 AM Thomas Huth  > > wrote:
> >
> >
> >Hi!
> >
> > I think it has been working fine for me a couple of weeks ago,
> > but when I now run:
> >
> >make check SPEED=slow
> >
> > I'm getting a couple of failing iotests... not sure whether
> > these are known issues already, so I thought I'd summarize them
> > here:
> >
> > *** First one is 045 in raw mode: ***
> >
> >TEST   iotest-raw: 045 [fail]
> > QEMU  --
> >
>  "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64"
> > -nodefaults -display none -accel qtest
> > QEMU_IMG  --
> > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
> > QEMU_IO   --
> > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" --cache
> > writeback --aio threads -f raw
> > QEMU_NBD  --
> > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
> > IMGFMT-- raw
> > IMGPROTO  -- file
> > PLATFORM  -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
> > TEST_DIR  --
> /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
> > SOCK_DIR  -- /tmp/tmphlexdrlt
> > GDB_OPTIONS   --
> > VALGRIND_QEMU --
> > PRINT_QEMU_OUTPUT --
> >
> > --- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
> > +++ 045.out.bad
> > @@ -1,5 +1,77 @@
> > -...
> > +..EE.EE 
> >
>  +==
> > +ERROR: test_add_fd (__main__.TestSCMFd)
> >
>  +--
> > +Traceback (most recent call last):
> > +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in
> > test_add_fd
> > +self._send_fd_by_SCM()
> > +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in
> > _send_fd_by_SCM
> > +ret = self.vm.send_fd_scm(file_path=image0)
> > +  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line
> > 229, in send_fd_scm
> > +self._qmp.send_fd_scm(fd)
> > +  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line
> 138,
> > in send_fd_scm
> > +self._aqmp.send_fd_scm(fd)
> > +  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py", line
> 149,
> > in _wrapper
> > +return func(proto, *args, **kwargs)
> > +  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line
> > 644, in send_fd_scm
> > +sock = sock._sock  # pylint: disable=protected-access
> > +AttributeError: 'socket' object has no attribute '_sock'
> >
> >
> > Well, that's not good.
> >
> > Can you tell me some details about what system produced this failure?
> > The python version used to run the test would be good, as well as distro
> > release, kernel version, etc.
> >
> > If you can reproduce it, I might want to give you a test branch of the
> > python code to produce some extra debugging information to help me
> > understand what's gone wrong here. Get in touch on IRC when you have
> some
> > spare time if you'd like to interactively debug it.
>
> As you likely saw in Hanna's mail a little bit later, the problem was the
> old version of pylint. I did still have version 2.2 installed - after
> upgrading, the problem went away.
>
>
upgrading pylint made *this* problem in *045* go away and not just the
failure in *297*, are you positive?


Re: [PATCH for-6.2 1/2] nbd/server: Don't complain on certain client disconnects

2021-11-17 Thread Eric Blake
On Wed, Nov 17, 2021 at 08:57:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 17.11.2021 20:02, Eric Blake wrote:

> > This patch fixes things to avoid uninitialized memory, and in general
> > avoids warning about a client that does a hard shutdown when not in
> > the middle of a packet.  A client that aborts mid-request, or which
> > does not read the full server's reply, can still result in warnings,
> > but those are indeed much more unusual situations.
> > 
> > CC: qemu-sta...@nongnu.org
> > Fixes: f148ae7d36 (nbd/server: Quiesce coroutines on context switch)
> > Signed-off-by: Eric Blake 
> > ---
> >   nbd/server.c | 7 +--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/nbd/server.c b/nbd/server.c
> > index d9164ee6d0da..85877f630533 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -1418,6 +1418,9 @@ static int nbd_receive_request(NBDClient *client, 
> > NBDRequest *request,
> >   if (ret < 0) {
> >   return ret;
> >   }
> > +if (ret == 0) {
> > +return -EIO;
> > +}
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 
> I'd prefer not include following hunks to the patch, as they are unrelated.
> 
> > 
> >   /* Request
> >  [ 0 ..  3]   magic   (NBD_REQUEST_MAGIC)
> > @@ -2285,7 +2288,7 @@ static int nbd_co_receive_request(NBDRequestData 
> > *req, NBDRequest *request,
> >   assert(client->recv_coroutine == qemu_coroutine_self());
> >   ret = nbd_receive_request(client, request, errp);
> >   if (ret < 0) {
> > -return  ret;
> > +return ret;
> >   }
> > 
> >   trace_nbd_co_receive_request_decode_type(request->handle, 
> > request->type,
> > @@ -2662,7 +2665,7 @@ static coroutine_fn void nbd_trip(void *opaque)
> >   }
> > 
> >   if (ret < 0) {
> > -/* It wans't -EIO, so, according to nbd_co_receive_request()
> > +/* It wasn't -EIO, so, according to nbd_co_receive_request()

Yeah, they were typo fixes I noticed while investigating the code; I
should have either called them out in the commit message, or as you
say, split them into a separate trivial patch (in which case they
aren't urgent for 6.2).  I'm happy to do the latter, since I'm working
on more patches for qemu 7.0 to add a new NBD protocol extension
NBD_OPT_EXTENDED_HEADERS to expose your work on 64-bit write-zero
support end-to-end.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH for-6.2? 2/2] nbd/server: Simplify zero and trim

2021-11-17 Thread Eric Blake
On Wed, Nov 17, 2021 at 09:04:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 17.11.2021 20:02, Eric Blake wrote:
> > Now that the block layer supports 64-bit operations, we no longer have
> > to self-fragment requests larger than 2G, reverting the workaround
> > added in 890cbccb08 (nbd: Fix large trim/zero requests).
> > 
> > Signed-off-by: Eric Blake
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 

Any preferences on whether this should be in 6.2, or deferred to 7.0?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH for-6.2] hw/nvme: fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)

2021-11-17 Thread Philippe Mathieu-Daudé
On 11/17/21 14:23, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Fix missing offset verification.
> 
> Cc: qemu-sta...@nongnu.org
> Cc: Philippe Mathieu-Daudé 
> Reported-by: Qiuhao Li 
> Fixes: f432fdfa121 ("support changed namespace asynchronous event")
> Signed-off-by: Klaus Jensen 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Gaurav Kamathe 

> ---
> 
> Note: Since its so easy to mess this fix up, the log pages code could
> probably use a refactor - there is a lot of duplicated code as well and
> it's easy to miss a check like this. However, defer that for 7.0.
> 
>  hw/nvme/ctrl.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 6a571d18cfae..5f573c417b3d 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -4168,6 +4168,11 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, 
> uint8_t rae, uint32_t buf_len,
>  int i = 0;
>  uint32_t nsid;
>  
> +if (off >= sizeof(nslist)) {
> +trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(nslist));
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
>  memset(nslist, 0x0, sizeof(nslist));
>  trans_len = MIN(sizeof(nslist) - off, buf_len);
>  
> 




Re: [PATCH for-6.2? 2/2] nbd/server: Simplify zero and trim

2021-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2021 23:49, Eric Blake wrote:

On Wed, Nov 17, 2021 at 09:04:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:

17.11.2021 20:02, Eric Blake wrote:

Now that the block layer supports 64-bit operations, we no longer have
to self-fragment requests larger than 2G, reverting the workaround
added in 890cbccb08 (nbd: Fix large trim/zero requests).

Signed-off-by: Eric Blake


Reviewed-by: Vladimir Sementsov-Ogievskiy 


Any preferences on whether this should be in 6.2, or deferred to 7.0?



In my opinion it's good for 6.2, if we are not very strict on "only real bug fixes 
in hard freeze".

The commit is obviously safe: if it break something, it means that feature we 
already included into 6.2 is broken anyway :)

--
Best regards,
Vladimir



Re: Failing QEMU iotests

2021-11-17 Thread Thomas Huth

On 17/11/2021 20.59, John Snow wrote:



On Wed, Nov 17, 2021 at 2:45 PM Thomas Huth > wrote:


On 17/11/2021 19.13, John Snow wrote:
 >
 >
 > On Wed, Nov 17, 2021 at 5:07 AM Thomas Huth mailto:th...@redhat.com>
 > >> wrote:
 >
 >
 >        Hi!
 >
 >     I think it has been working fine for me a couple of weeks ago,
 >     but when I now run:
 >
 >        make check SPEED=slow
 >
 >     I'm getting a couple of failing iotests... not sure whether
 >     these are known issues already, so I thought I'd summarize them
 >     here:
 >
 >     *** First one is 045 in raw mode: ***
 >
 >        TEST   iotest-raw: 045 [fail]
 >     QEMU          --
 >   
  "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64"

 >     -nodefaults -display none -accel qtest
 >     QEMU_IMG      --
 >     "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
 >     QEMU_IO       --
 >     "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io" --cache
 >     writeback --aio threads -f raw
 >     QEMU_NBD      --
 >     "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
 >     IMGFMT        -- raw
 >     IMGPROTO      -- file
 >     PLATFORM      -- Linux/x86_64 thuth 4.18.0-305.19.1.el8_4.x86_64
 >     TEST_DIR      --
/home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
 >     SOCK_DIR      -- /tmp/tmphlexdrlt
 >     GDB_OPTIONS   --
 >     VALGRIND_QEMU --
 >     PRINT_QEMU_OUTPUT --
 >
 >     --- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
 >     +++ 045.out.bad
 >     @@ -1,5 +1,77 @@
 >     -...
 >     +..EE.EE  >
 >   
  +==

 >     +ERROR: test_add_fd (__main__.TestSCMFd)
 >   
  +--

 >     +Traceback (most recent call last):
 >     +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 148, in
 >     test_add_fd
 >     +    self._send_fd_by_SCM()
 >     +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line 144, in
 >     _send_fd_by_SCM
 >     +    ret = self.vm.send_fd_scm(file_path=image0)
 >     +  File "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line
 >     229, in send_fd_scm
 >     +    self._qmp.send_fd_scm(fd)
 >     +  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py", line
138,
 >     in send_fd_scm
 >     +    self._aqmp.send_fd_scm(fd)
 >     +  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py",
line 149,
 >     in _wrapper
 >     +    return func(proto, *args, **kwargs)
 >     +  File "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line
 >     644, in send_fd_scm
 >     +    sock = sock._sock  # pylint: disable=protected-access
 >     +AttributeError: 'socket' object has no attribute '_sock'
 >
 >
 > Well, that's not good.
 >
 > Can you tell me some details about what system produced this failure?
 > The python version used to run the test would be good, as well as distro
 > release, kernel version, etc.
 >
 > If you can reproduce it, I might want to give you a test branch of the
 > python code to produce some extra debugging information to help me
 > understand what's gone wrong here. Get in touch on IRC when you have
some
 > spare time if you'd like to interactively debug it.

As you likely saw in Hanna's mail a little bit later, the problem was the
old version of pylint. I did still have version 2.2 installed - after
upgrading, the problem went away.


upgrading pylint made *this* problem in *045* go away and not just the 
failure in *297*, are you positive?


Ah, no, of course not, I just mixed them up :-/

(For the records, as already discussed on IRC: It's Python 2.6.8 from RHEL8 
where the problem occurred)


 Thomas




Re: Failing QEMU iotests

2021-11-17 Thread John Snow
On Wed, Nov 17, 2021 at 4:33 PM Thomas Huth  wrote:

> On 17/11/2021 20.59, John Snow wrote:
> >
> >
> > On Wed, Nov 17, 2021 at 2:45 PM Thomas Huth  > > wrote:
> >
> > On 17/11/2021 19.13, John Snow wrote:
> >  >
> >  >
> >  > On Wed, Nov 17, 2021 at 5:07 AM Thomas Huth  > 
> >  > >> wrote:
> >  >
> >  >
> >  >Hi!
> >  >
> >  > I think it has been working fine for me a couple of weeks ago,
> >  > but when I now run:
> >  >
> >  >make check SPEED=slow
> >  >
> >  > I'm getting a couple of failing iotests... not sure whether
> >  > these are known issues already, so I thought I'd summarize
> them
> >  > here:
> >  >
> >  > *** First one is 045 in raw mode: ***
> >  >
> >  >TEST   iotest-raw: 045 [fail]
> >  > QEMU  --
> >  >
> >
>  "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-system-x86_64"
> >  > -nodefaults -display none -accel qtest
> >  > QEMU_IMG  --
> >  > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-img"
> >  > QEMU_IO   --
> >  > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-io"
> --cache
> >  > writeback --aio threads -f raw
> >  > QEMU_NBD  --
> >  > "/home/thuth/tmp/qemu-build/tests/qemu-iotests/../../qemu-nbd"
> >  > IMGFMT-- raw
> >  > IMGPROTO  -- file
> >  > PLATFORM  -- Linux/x86_64 thuth
> 4.18.0-305.19.1.el8_4.x86_64
> >  > TEST_DIR  --
> > /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
> >  > SOCK_DIR  -- /tmp/tmphlexdrlt
> >  > GDB_OPTIONS   --
> >  > VALGRIND_QEMU --
> >  > PRINT_QEMU_OUTPUT --
> >  >
> >  > --- /home/thuth/devel/qemu/tests/qemu-iotests/045.out
> >  > +++ 045.out.bad
> >  > @@ -1,5 +1,77 @@
> >  > -...
> >  > +..EE.EE  >
> >  >
> >
>  +==
> >  > +ERROR: test_add_fd (__main__.TestSCMFd)
> >  >
> >
>  +--
> >  > +Traceback (most recent call last):
> >  > +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line
> 148, in
> >  > test_add_fd
> >  > +self._send_fd_by_SCM()
> >  > +  File "/home/thuth/devel/qemu/tests/qemu-iotests/045", line
> 144, in
> >  > _send_fd_by_SCM
> >  > +ret = self.vm.send_fd_scm(file_path=image0)
> >  > +  File
> "/home/thuth/devel/qemu/python/qemu/machine/machine.py", line
> >  > 229, in send_fd_scm
> >  > +self._qmp.send_fd_scm(fd)
> >  > +  File "/home/thuth/devel/qemu/python/qemu/aqmp/legacy.py",
> line
> > 138,
> >  > in send_fd_scm
> >  > +self._aqmp.send_fd_scm(fd)
> >  > +  File "/home/thuth/devel/qemu/python/qemu/aqmp/protocol.py",
> > line 149,
> >  > in _wrapper
> >  > +return func(proto, *args, **kwargs)
> >  > +  File
> "/home/thuth/devel/qemu/python/qemu/aqmp/qmp_client.py", line
> >  > 644, in send_fd_scm
> >  > +sock = sock._sock  # pylint: disable=protected-access
> >  > +AttributeError: 'socket' object has no attribute '_sock'
> >  >
> >  >
> >  > Well, that's not good.
> >  >
> >  > Can you tell me some details about what system produced this
> failure?
> >  > The python version used to run the test would be good, as well as
> distro
> >  > release, kernel version, etc.
> >  >
> >  > If you can reproduce it, I might want to give you a test branch
> of the
> >  > python code to produce some extra debugging information to help me
> >  > understand what's gone wrong here. Get in touch on IRC when you
> have
> > some
> >  > spare time if you'd like to interactively debug it.
> >
> > As you likely saw in Hanna's mail a little bit later, the problem
> was the
> > old version of pylint. I did still have version 2.2 installed - after
> > upgrading, the problem went away.
> >
> >
> > upgrading pylint made *this* problem in *045* go away and not just the
> > failure in *297*, are you positive?
>
> Ah, no, of course not, I just mixed them up :-/
>
>
I was able to repro, I have a fix on the way but I am doing additional
testing.
I still have a fix prepared for some device-crash-test behaviors, but I
found ... another bug that's even more annoying, so there is more
development and testing to do there.

(New problem is: device-crash-test does not set a timeout for QMP
connections, so if the binary dies entirely before it dials out to connect
to the QMP library in python at all, we w

[PATCH-for-6.2 v2 0/2] hw/block/fdc: Fix CVE-2021-20196

2021-11-17 Thread Philippe Mathieu-Daudé
I'm not sure what happened to v1 from Prasad, so since we are
at rc2 I took a simpler approach to fix this CVE: create an
empty drive to satisfy the BlockBackend API calls.

Added Alexander's reproducer along.

v1: 
https://lore.kernel.org/qemu-devel/20210123100345.642933-1-ppan...@redhat.com/

Alexander Bulekov (1):
  tests/qtest/fdc-test: Add a regression test for CVE-2021-20196

Philippe Mathieu-Daudé (1):
  hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196

 hw/block/fdc.c | 14 +-
 tests/qtest/fdc-test.c | 21 +
 2 files changed, 34 insertions(+), 1 deletion(-)

-- 
2.31.1





[PATCH-for-6.2 v2 2/2] tests/qtest/fdc-test: Add a regression test for CVE-2021-20196

2021-11-17 Thread Philippe Mathieu-Daudé
From: Alexander Bulekov 

When running 'make check-qtest-i386' with QEMU configured
with '--enable-sanitizers' we get:

  AddressSanitizer:DEADLYSIGNAL
  =
  ==287878==ERROR: AddressSanitizer: SEGV on unknown address 0x0344
  ==287878==The signal is caused by a WRITE memory access.
  ==287878==Hint: address points to the zero page.
  #0 0x564b2e5bac27 in blk_inc_in_flight block/block-backend.c:1346:5
  #1 0x564b2e5bb228 in blk_pwritev_part block/block-backend.c:1317:5
  #2 0x564b2e5bcd57 in blk_pwrite block/block-backend.c:1498:11
  #3 0x564b2ca1cdd3 in fdctrl_write_data hw/block/fdc.c:2221:17
  #4 0x564b2ca1b2f7 in fdctrl_write hw/block/fdc.c:829:9
  #5 0x564b2dc49503 in portio_write softmmu/ioport.c:201:9

Signed-off-by: Alexander Bulekov 
Message-Id: <20210319050906.14875-2-alx...@bu.edu>
[PMD: Rebased, use global test_image]
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/qtest/fdc-test.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/tests/qtest/fdc-test.c b/tests/qtest/fdc-test.c
index 26b69f7c5cd..dfca84c7b96 100644
--- a/tests/qtest/fdc-test.c
+++ b/tests/qtest/fdc-test.c
@@ -546,6 +546,26 @@ static void fuzz_registers(void)
 }
 }
 
+static void test_cve_2021_20196(void)
+{
+QTestState *s;
+
+s = qtest_initf("-nographic -m 32M -nodefaults "
+"-drive file=%s,format=raw,if=floppy", test_image);
+qtest_outw(s, 0x3f2, 0x0004);
+qtest_outw(s, 0x3f4, 0x0200);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f4, 0x);
+qtest_outw(s, 0x3f2, 0x0001);
+qtest_quit(s);
+}
+
 int main(int argc, char **argv)
 {
 int fd;
@@ -576,6 +596,7 @@ int main(int argc, char **argv)
 qtest_add_func("/fdc/read_no_dma_18", test_read_no_dma_18);
 qtest_add_func("/fdc/read_no_dma_19", test_read_no_dma_19);
 qtest_add_func("/fdc/fuzz-registers", fuzz_registers);
+qtest_add_func("/fdc/fuzz/cve_2021_20196", test_cve_2021_20196);
 
 ret = g_test_run();
 
-- 
2.31.1




[PATCH-for-6.2 v2 1/2] hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196

2021-11-17 Thread Philippe Mathieu-Daudé
Guest might select another drive on the bus by setting the
DRIVE_SEL bit of the DIGITAL OUTPUT REGISTER (DOR).
The current controller model doesn't expect a BlockBackend
to be NULL. A simple way to fix CVE-2021-20196 is to create
an empty BlockBackend when it is missing. All further
accesses will be safely handled, and the controller state
machines keep behaving correctly.

Fixes: CVE-2021-20196
Reported-by: Gaoning Pan (Ant Security Light-Year Lab) 
BugLink: https://bugs.launchpad.net/qemu/+bug/1912780
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/338
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index fa933cd3263..eab17e946d6 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -1161,7 +1161,19 @@ static FDrive *get_drv(FDCtrl *fdctrl, int unit)
 
 static FDrive *get_cur_drv(FDCtrl *fdctrl)
 {
-return get_drv(fdctrl, fdctrl->cur_drv);
+FDrive *cur_drv = get_drv(fdctrl, fdctrl->cur_drv);
+
+if (!cur_drv->blk) {
+/*
+ * Kludge: empty drive line selected. Create an anonymous
+ * BlockBackend to avoid NULL deref with various BlockBackend
+ * API calls within this model (CVE-2021-20196).
+ * Due to the controller QOM model limitations, we don't
+ * attach the created to the controller.
+ */
+cur_drv->blk = blk_new(qemu_get_aio_context(), 0, BLK_PERM_ALL);
+}
+return cur_drv;
 }
 
 /* Status A register : 0x00 (read-only) */
-- 
2.31.1




RE: [PATCH 1/2] block:hdev: support BLKSECDISCARD

2021-11-17 Thread Qi, Yadong
> What is the use case for exposing secure erase in qemu?  The whole concept for
> a LBA based secure erase is generally not a very smart idea for flash based
> media..
Hi, Christoph
We got a user requirement: support BLKSECDISCARD in VM. Which is:
ioctl(BLKSECDISCARD) in guest -> qemu backend perform secure discard
on native block device accordingly.
This is a requirement to meet the security level of the project.
So I sent out these patches to see whether qemu could accept such changes.




Re: [PULL 0/5] Python patches

2021-11-17 Thread Gerd Hoffmann
  Hi,

> - Split python/qemu/qmp out into its own repository and begin uploading it
> to PyPI, as a test. (Do not delete python/qemu/qmp yet at this phase.)

I think you can do that as two separate steps.

pip can install from vcs too, i.e. when splitted to a separate repo but
not yet uploaded to pypi you can simply drop something like ...

git+https://gitlab.com/qemu/qemu-python.git@master

... into pip-requirements.txt.  That way you can easily test things
before actually uploading to pypi.

take care,
  Gerd




Re: [PATCH v2 03/13] hw/arm/npcm7xx_boards: Replace drive_get_next() by drive_get()

2021-11-17 Thread Markus Armbruster
Havard Skinnemoen  writes:

> On Wed, Nov 17, 2021 at 8:34 AM Markus Armbruster  wrote:
>>
>> drive_get_next() is basically a bad idea.  It returns the "next" block
>> backend of a certain interface type.  "Next" means bus=0,unit=N, where
>> subsequent calls count N up from zero, per interface type.
>>
>> This lets you define unit numbers implicitly by execution order.  If the
>> order changes, or new calls appear "in the middle", unit numbers change.
>> ABI break.  Hard to spot in review.
>>
>> Machine "quanta-gbs-bmc" connects just one backend with
>> drive_get_next(), but with a helper function.  Change it to use
>> drive_get() directly.  This makes the unit numbers explicit in the
>> code.
>>
>> Cc: Havard Skinnemoen 
>> Cc: Tyrone Ting 
>> Cc: Peter Maydell 
>> Cc: qemu-...@nongnu.org
>> Signed-off-by: Markus Armbruster 
>> ---
>>  hw/arm/npcm7xx_boards.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
>> index dec7d16ae5..d8a49e4e85 100644
>> --- a/hw/arm/npcm7xx_boards.c
>> +++ b/hw/arm/npcm7xx_boards.c
>> @@ -84,9 +84,9 @@ static void npcm7xx_connect_dram(NPCM7xxState *soc, 
>> MemoryRegion *dram)
>>   &error_abort);
>>  }
>>
>> -static void sdhci_attach_drive(SDHCIState *sdhci)
>> +static void sdhci_attach_drive(SDHCIState *sdhci, int unit)
>>  {
>> -DriveInfo *di = drive_get_next(IF_SD);
>> +DriveInfo *di = drive_get(IF_SD, 0, unit);
>
> +Hao Wu IIRC the chip has separate SD and eMMC buses. Would it make
> sense to take the bus number as a parameter as well? Is bus 0 the
> right one to use in this case?
>
> The existing code always uses bus 0, so this is an improvement either way.

Using separate buses for different kinds of devices would be neater, but
it also would be an incompatible change.  I can't judge whether
incompatible change is okay here.

This patch is just refactoring code.  An interface change, if desired,
should be a separate patch.

> Reviewed-by: Havard Skinnemoen 

Thanks!

>
>>  BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
>>
>>  BusState *bus = qdev_get_child_bus(DEVICE(sdhci), "sd-bus");
>> @@ -374,7 +374,7 @@ static void quanta_gbs_init(MachineState *machine)
>>drive_get(IF_MTD, 0, 0));
>>
>>  quanta_gbs_i2c_init(soc);
>> -sdhci_attach_drive(&soc->mmc.sdhci);
>> +sdhci_attach_drive(&soc->mmc.sdhci, 0);
>>  npcm7xx_load_kernel(machine, soc);
>>  }
>>
>> --
>> 2.31.1
>>