Re: [PATCH v3 2/4] block: check for sys/disk.h

2021-03-31 Thread Joelle van Dyne
On Mon, Mar 15, 2021 at 11:03 AM Joelle van Dyne  wrote:
>
> Some BSD platforms do not have this header.
>
> Reviewed-by: Peter Maydell 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Joelle van Dyne 

Please bear with me as I am still new to this, but what happens to the
three patches that are reviewed if the last patch does not get
reviewed? Do the reviewed patches still get to make it into 6.0? I am
willing to drop the unreviewed patch if there are issues. Thanks.

-j



Re: [PATCH] block/export: improve vu_blk_sect_range_ok()

2021-03-31 Thread Eric Blake
On 3/31/21 9:27 AM, Stefan Hajnoczi wrote:
> The checks in vu_blk_sect_range_ok() assume VIRTIO_BLK_SECTOR_SIZE is
> equal to BDRV_SECTOR_SIZE. This is true, but let's add a
> QEMU_BUILD_BUG_ON() to make it explicit.
> 
> We might as well check that the request buffer size is a multiple of
> VIRTIO_BLK_SECTOR_SIZE while we're at it.
> 
> Suggested-by: Max Reitz 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/export/vhost-user-blk-server.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH] iotests: Test mirror-top filter permissions

2021-03-31 Thread Eric Blake
On 3/31/21 7:28 AM, Max Reitz wrote:
> Add a test accompanying commit 53431b9086b2832ca1aeff0c55e186e9ed79bd11
> ("block/mirror: Fix mirror_top's permissions").
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/tests/mirror-top-perms | 121 ++
>  tests/qemu-iotests/tests/mirror-top-perms.out |   5 +
>  2 files changed, 126 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/mirror-top-perms
>  create mode 100644 tests/qemu-iotests/tests/mirror-top-perms.out

Safe for -rc2 if you want to get it in.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [RFC 3/8] virtio: Add API to batch set host notifiers

2021-03-31 Thread Greg Kurz
On Wed, 31 Mar 2021 15:47:45 +0100
Stefan Hajnoczi  wrote:

> On Tue, Mar 30, 2021 at 04:17:32PM +0200, Greg Kurz wrote:
> > On Tue, 30 Mar 2021 14:55:42 +0100
> > Stefan Hajnoczi  wrote:
> > 
> > > On Tue, Mar 30, 2021 at 12:17:40PM +0200, Greg Kurz wrote:
> > > > On Mon, 29 Mar 2021 18:10:57 +0100
> > > > Stefan Hajnoczi  wrote:
> > > > > On Thu, Mar 25, 2021 at 04:07:30PM +0100, Greg Kurz wrote:
> > > > > > @@ -315,6 +338,10 @@ static void 
> > > > > > virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState *bus,
> > > > > >  
> > > > > >  for (i = 0; i < nvqs; i++) {
> > > > > >  virtio_bus_set_host_notifier(bus, i + n_offset, false);
> > > > > > +}
> > > > > > +/* Let address_space_update_ioeventfds() run before closing 
> > > > > > ioeventfds */
> > > > > 
> > > > > assert(memory_region_transaction_depth == 0)?
> > > > > 
> > > > 
> > > > Hmm... apart from the fact that memory_region_transaction_depth is
> > > > a memory internal thing that shouldn't be exposed here, it seems to
> > > > me that memory_region_transaction_depth can be != 0, e.g. when
> > > > batching is used... or am I missing something?
> > > > 
> > > > I was actually thinking of adding some asserts for that in the
> > > > memory_region_*_eventfd_full() functions introduced by patch 1.
> > > > 
> > > > if (!transaction) {
> > > > memory_region_transaction_begin();
> > > > }
> > > > assert(memory_region_transaction_depth != 0);
> > > 
> > > In that case is it safe to call virtio_bus_cleanup_host_notifier()
> > > below? I thought it depends on the transaction committing first.
> > > 
> > 
> > Yes because the transaction ends...
> > 
> > > > 
> > > > > > +virtio_bus_set_host_notifier_commit(bus);
> > ...here ^^
> > 
> > > > > > +for (i = 0; i < nvqs; i++) {
> > > > > >  virtio_bus_cleanup_host_notifier(bus, i + n_offset);
> > > > > >  }
> > > > > >  }
> 
> That contradicts what you said above: "it seems to me that
> memory_region_transaction_depth can be != 0 when, e.g. when batching is
> used".
> 
> If memory_region_transaction_depth can be != 0 when this function is
> entered then memory_region_transaction_commit() will have no effect:
> 
>   void memory_region_transaction_commit(void)
>   {
>   AddressSpace *as;
> 
>   assert(memory_region_transaction_depth);
>   assert(qemu_mutex_iothread_locked());
> 
>   --memory_region_transaction_depth;
>   if (!memory_region_transaction_depth) {

memory_region_transaction_depth should be equal to 1 when
entering the function, not 0... which is the case when
batching.

>   ^--- we won't take this branch!
> 
> So the code after memory_region_transaction_commit() cannot assume that
> anything was actually committed.
> 

Right, and nothing in the current code base actually seems to prevent
memory_region_*_eventfd() from being called within an ongoing
transaction. It looks like we might want to fix that first.

> That's why I asked about adding assert(memory_region_transaction_depth
> == 0) to guarantee that our commit takes effect immediately so that it's
> safe to call virtio_bus_cleanup_host_notifier().
> 

Yes, it was just misplaced and I didn't get the intent at first :)

> Stefan





Re: [RFC 3/8] virtio: Add API to batch set host notifiers

2021-03-31 Thread Stefan Hajnoczi
On Tue, Mar 30, 2021 at 04:17:32PM +0200, Greg Kurz wrote:
> On Tue, 30 Mar 2021 14:55:42 +0100
> Stefan Hajnoczi  wrote:
> 
> > On Tue, Mar 30, 2021 at 12:17:40PM +0200, Greg Kurz wrote:
> > > On Mon, 29 Mar 2021 18:10:57 +0100
> > > Stefan Hajnoczi  wrote:
> > > > On Thu, Mar 25, 2021 at 04:07:30PM +0100, Greg Kurz wrote:
> > > > > @@ -315,6 +338,10 @@ static void 
> > > > > virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState *bus,
> > > > >  
> > > > >  for (i = 0; i < nvqs; i++) {
> > > > >  virtio_bus_set_host_notifier(bus, i + n_offset, false);
> > > > > +}
> > > > > +/* Let address_space_update_ioeventfds() run before closing 
> > > > > ioeventfds */
> > > > 
> > > > assert(memory_region_transaction_depth == 0)?
> > > > 
> > > 
> > > Hmm... apart from the fact that memory_region_transaction_depth is
> > > a memory internal thing that shouldn't be exposed here, it seems to
> > > me that memory_region_transaction_depth can be != 0, e.g. when
> > > batching is used... or am I missing something?
> > > 
> > > I was actually thinking of adding some asserts for that in the
> > > memory_region_*_eventfd_full() functions introduced by patch 1.
> > > 
> > > if (!transaction) {
> > > memory_region_transaction_begin();
> > > }
> > > assert(memory_region_transaction_depth != 0);
> > 
> > In that case is it safe to call virtio_bus_cleanup_host_notifier()
> > below? I thought it depends on the transaction committing first.
> > 
> 
> Yes because the transaction ends...
> 
> > > 
> > > > > +virtio_bus_set_host_notifier_commit(bus);
> ...here ^^
> 
> > > > > +for (i = 0; i < nvqs; i++) {
> > > > >  virtio_bus_cleanup_host_notifier(bus, i + n_offset);
> > > > >  }
> > > > >  }

That contradicts what you said above: "it seems to me that
memory_region_transaction_depth can be != 0 when, e.g. when batching is
used".

If memory_region_transaction_depth can be != 0 when this function is
entered then memory_region_transaction_commit() will have no effect:

  void memory_region_transaction_commit(void)
  {
  AddressSpace *as;

  assert(memory_region_transaction_depth);
  assert(qemu_mutex_iothread_locked());

  --memory_region_transaction_depth;
  if (!memory_region_transaction_depth) {
  ^--- we won't take this branch!

So the code after memory_region_transaction_commit() cannot assume that
anything was actually committed.

That's why I asked about adding assert(memory_region_transaction_depth
== 0) to guarantee that our commit takes effect immediately so that it's
safe to call virtio_bus_cleanup_host_notifier().
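
For reference, a minimal sketch of the begin/commit nesting being discussed
(illustrative only; the two calls are the real memory API):

    memory_region_transaction_begin();    /* depth: 0 -> 1, batch opened */
    ...
    memory_region_transaction_begin();    /* depth: 1 -> 2 */
    memory_region_transaction_commit();   /* depth: 2 -> 1, nothing flushed */
    ...
    memory_region_transaction_commit();   /* depth: 1 -> 0, ioeventfds updated */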

Stefan




[PATCH] block/export: improve vu_blk_sect_range_ok()

2021-03-31 Thread Stefan Hajnoczi
The checks in vu_blk_sect_range_ok() assume VIRTIO_BLK_SECTOR_SIZE is
equal to BDRV_SECTOR_SIZE. This is true, but let's add a
QEMU_BUILD_BUG_ON() to make it explicit.

We might as well check that the request buffer size is a multiple of
VIRTIO_BLK_SECTOR_SIZE while we're at it.

Suggested-by: Max Reitz 
Signed-off-by: Stefan Hajnoczi 
---
 block/export/vhost-user-blk-server.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index fa06996d37..1862563336 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -70,9 +70,16 @@ static void vu_blk_req_complete(VuBlkReq *req)
 static bool vu_blk_sect_range_ok(VuBlkExport *vexp, uint64_t sector,
                                  size_t size)
 {
-    uint64_t nb_sectors = size >> BDRV_SECTOR_BITS;
+    uint64_t nb_sectors;
     uint64_t total_sectors;
 
+    if (size % VIRTIO_BLK_SECTOR_SIZE) {
+        return false;
+    }
+
+    nb_sectors = size >> VIRTIO_BLK_SECTOR_BITS;
+
+    QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != VIRTIO_BLK_SECTOR_SIZE);
     if (nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
         return false;
     }
-- 
2.30.2



[PATCH] iotests: Test mirror-top filter permissions

2021-03-31 Thread Max Reitz
Add a test accompanying commit 53431b9086b2832ca1aeff0c55e186e9ed79bd11
("block/mirror: Fix mirror_top's permissions").

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/tests/mirror-top-perms | 121 ++
 tests/qemu-iotests/tests/mirror-top-perms.out |   5 +
 2 files changed, 126 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/mirror-top-perms
 create mode 100644 tests/qemu-iotests/tests/mirror-top-perms.out

diff --git a/tests/qemu-iotests/tests/mirror-top-perms b/tests/qemu-iotests/tests/mirror-top-perms
new file mode 100755
index 00..451a0666f8
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -0,0 +1,121 @@
+#!/usr/bin/env python3
+# group: rw
+#
+# Test permissions taken by the mirror-top filter
+#
+# Copyright (C) 2021 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import qemu_img
+
+# Import qemu after iotests.py has amended sys.path
+# pylint: disable=wrong-import-order
+import qemu
+
+
+image_size = 1 * 1024 * 1024
+source = os.path.join(iotests.test_dir, 'source.img')
+
+
+class TestMirrorTopPerms(iotests.QMPTestCase):
+    def setUp(self):
+        assert qemu_img('create', '-f', iotests.imgfmt, source,
+                        str(image_size)) == 0
+        self.vm = iotests.VM()
+        self.vm.add_drive(source)
+        self.vm.add_blockdev(f'null-co,node-name=null,size={image_size}')
+        self.vm.launch()
+
+        # Will be created by the test function itself
+        self.vm_b = None
+
+    def tearDown(self):
+        try:
+            self.vm.shutdown()
+        except qemu.machine.AbnormalShutdown:
+            pass
+
+        if self.vm_b is not None:
+            self.vm_b.shutdown()
+
+        os.remove(source)
+
+    def test_cancel(self):
+        """
+        Before commit 53431b9086b28, mirror-top used to not take any
+        permissions but WRITE and share all permissions.  Because it
+        is inserted between the source's original parents and the
+        source, there generally was no parent that would have taken or
+        unshared any permissions on the source, which means that an
+        external process could access the image unhindered by locks.
+        (Unless there was a parent above the protocol node that would
+        take its own locks, e.g. a format driver.)
+        This is bad enough, but if the mirror job is then cancelled,
+        the mirroring VM tries to take back the image, restores the
+        original permissions taken and unshared, and assumes this must
+        just work.  But it will not, and so the VM aborts.
+
+        Commit 53431b9086b28 made mirror keep the original permissions
+        and so no other process can "steal" the image.
+
+        (Note that you cannot really do the same with the target image
+        and then completing the job, because the mirror job always
+        took/unshared the correct permissions on the target.  For
+        example, it does not share READ_CONSISTENT, which makes it
+        difficult to let some other qemu process open the image.)
+        """
+
+        result = self.vm.qmp('blockdev-mirror',
+                             job_id='mirror',
+                             device='drive0',
+                             target='null',
+                             sync='full')
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_READY')
+
+        # We want this to fail because the image cannot be locked.
+        # If it does not fail, continue still and see what happens.
+        self.vm_b = iotests.VM(path_suffix='b')
+        # Must use -blockdev -device so we can use share-rw.
+        # (And we need share-rw=on because mirror-top was always
+        # forced to take the WRITE permission so it can write to the
+        # source image.)
+        self.vm_b.add_blockdev(f'file,node-name=drive0,filename={source}')
+        self.vm_b.add_device('virtio-blk,drive=drive0,share-rw=on')
+        try:
+            self.vm_b.launch()
+            print('ERROR: VM B launched successfully, this should not have '
+                  'happened')
+        except qemu.qmp.QMPConnectError:
+            assert 'Is another process using the image' in self.vm_b.get_log()
+
+        result = self.vm.qmp('block-job-cancel',
+                             device='mirror')
+

[PULL for-6.0 6/6] test-coroutine: Add rwlock downgrade test

2021-03-31 Thread Stefan Hajnoczi
From: David Edmondson 

Test that downgrading an rwlock does not result in a failure to
schedule coroutines queued on the rwlock.

The diagram associated with test_co_rwlock_downgrade() describes the
intended behaviour, but what was observed previously corresponds to:

| c1     | c2         | c3         | c4       |
|--------+------------+------------+----------|
| rdlock |            |            |          |
| yield  |            |            |          |
|        | wrlock     |            |          |
|        | <queued>   |            |          |
|        |            | rdlock     |          |
|        |            | <queued>   |          |
|        |            |            | wrlock   |
|        |            |            | <queued> |
| unlock |            |            |          |
| yield  |            |            |          |
|        | <dequeued> |            |          |
|        | downgrade  |            |          |
|        | ...        |            |          |
|        | unlock     |            |          |
|        |            | <dequeued> |          |
|        |            | <queued>   |          |

This results in a failure...

ERROR:../tests/test-coroutine.c:369:test_co_rwlock_downgrade: assertion failed: (c3_done)
Bail out! ERROR:../tests/test-coroutine.c:369:test_co_rwlock_downgrade: assertion failed: (c3_done)

...as a result of the c3 coroutine failing to run to completion.

Signed-off-by: David Edmondson 
Signed-off-by: Paolo Bonzini 
Message-id: 20210325112941.365238-7-pbonz...@redhat.com
Message-Id: <20210309144015.557477-5-david.edmond...@oracle.com>
Signed-off-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 tests/unit/test-coroutine.c | 99 +
 1 file changed, 99 insertions(+)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index 6e6f51d480..aa77a3bcb3 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -325,6 +325,104 @@ static void test_co_rwlock_upgrade(void)
 g_assert(c2_done);
 }
 
+static void coroutine_fn rwlock_rdlock_yield(void *opaque)
+{
+    qemu_co_rwlock_rdlock(&rwlock);
+    qemu_coroutine_yield();
+
+    qemu_co_rwlock_unlock(&rwlock);
+    qemu_coroutine_yield();
+
+    *(bool *)opaque = true;
+}
+
+static void coroutine_fn rwlock_wrlock_downgrade(void *opaque)
+{
+    qemu_co_rwlock_wrlock(&rwlock);
+
+    qemu_co_rwlock_downgrade(&rwlock);
+    qemu_co_rwlock_unlock(&rwlock);
+    *(bool *)opaque = true;
+}
+
+static void coroutine_fn rwlock_rdlock(void *opaque)
+{
+    qemu_co_rwlock_rdlock(&rwlock);
+
+    qemu_co_rwlock_unlock(&rwlock);
+    *(bool *)opaque = true;
+}
+
+static void coroutine_fn rwlock_wrlock(void *opaque)
+{
+    qemu_co_rwlock_wrlock(&rwlock);
+
+    qemu_co_rwlock_unlock(&rwlock);
+    *(bool *)opaque = true;
+}
+
+/*
+ * Check that downgrading a reader-writer lock does not cause a hang.
+ *
+ * Four coroutines are used to produce a situation where there are
+ * both reader and writer hopefuls waiting to acquire an rwlock that
+ * is held by a reader.
+ *
+ * The correct sequence of operations we aim to provoke can be
+ * represented as:
+ *
+ * | c1     | c2         | c3         | c4         |
+ * |--------+------------+------------+------------|
+ * | rdlock |            |            |            |
+ * | yield  |            |            |            |
+ * |        | wrlock     |            |            |
+ * |        | <queued>   |            |            |
+ * |        |            | rdlock     |            |
+ * |        |            | <queued>   |            |
+ * |        |            |            | wrlock     |
+ * |        |            |            | <queued>   |
+ * | unlock |            |            |            |
+ * | yield  |            |            |            |
+ * |        | <dequeued> |            |            |
+ * |        | downgrade  |            |            |
+ * |        |            | <dequeued> |            |
+ * |        |            | unlock     |            |
+ * |        | ...        |            |            |
+ * |        | unlock     |            |            |
+ * |        |            |            | <dequeued> |
+ * |        |            |            | unlock     |
+ */
+static void test_co_rwlock_downgrade(void)
+{
+    bool c1_done = false;
+    bool c2_done = false;
+    bool c3_done = false;
+    bool c4_done = false;
+    Coroutine *c1, *c2, *c3, *c4;
+
+    qemu_co_rwlock_init(&rwlock);
+
+    c1 = qemu_coroutine_create(rwlock_rdlock_yield, &c1_done);
+    c2 = qemu_coroutine_create(rwlock_wrlock_downgrade, &c2_done);
+    c3 = qemu_coroutine_create(rwlock_rdlock, &c3_done);
+    c4 = qemu_coroutine_create(rwlock_wrlock, &c4_done);
+
+    qemu_coroutine_enter(c1);
+    qemu_coroutine_enter(c2);
+    qemu_coroutine_enter(c3);
+    qemu_coroutine_enter(c4);
+
+    qemu_coroutine_enter(c1);
+
+    g_assert(c2_done);
+    g_assert(c3_done);
+    g_assert(c4_done);
+
+    qemu_coroutine_enter(c1);
+
+    g_assert(c1_done);
+}
+
 /*
  * Check that creation, enter, and return work
  */
@@ -563,6 +661,7 @@ int main(int argc, char **argv)
 g_test_add_func("/locking/co-mutex", 

[PULL for-6.0 5/6] test-coroutine: Add rwlock upgrade test

2021-03-31 Thread Stefan Hajnoczi
From: Paolo Bonzini 

Test that rwlock upgrade is fair, and that readers go back to sleep if
a writer is in line.

Signed-off-by: Paolo Bonzini 
Message-id: 20210325112941.365238-6-pbonz...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 tests/unit/test-coroutine.c | 62 +
 1 file changed, 62 insertions(+)

diff --git a/tests/unit/test-coroutine.c b/tests/unit/test-coroutine.c
index e946d93a65..6e6f51d480 100644
--- a/tests/unit/test-coroutine.c
+++ b/tests/unit/test-coroutine.c
@@ -264,6 +264,67 @@ static void test_co_mutex_lockable(void)
 g_assert(QEMU_MAKE_LOCKABLE(null_pointer) == NULL);
 }
 
+static CoRwlock rwlock;
+
+/* Test that readers are properly sent back to the queue when upgrading,
+ * even if they are the sole readers.  The test scenario is as follows:
+ *
+ *
+ * | c1         | c2         |
+ * |------------+------------|
+ * | rdlock     |            |
+ * | yield      |            |
+ * |            | wrlock     |
+ * |            | <queued>   |
+ * | upgrade    |            |
+ * | <queued>   | <dequeued> |
+ * |            | unlock     |
+ * | <dequeued> |            |
+ * | unlock     |            |
+ */
+
+static void coroutine_fn rwlock_yield_upgrade(void *opaque)
+{
+    qemu_co_rwlock_rdlock(&rwlock);
+    qemu_coroutine_yield();
+
+    qemu_co_rwlock_upgrade(&rwlock);
+    qemu_co_rwlock_unlock(&rwlock);
+
+    *(bool *)opaque = true;
+}
+
+static void coroutine_fn rwlock_wrlock_yield(void *opaque)
+{
+    qemu_co_rwlock_wrlock(&rwlock);
+    qemu_coroutine_yield();
+
+    qemu_co_rwlock_unlock(&rwlock);
+    *(bool *)opaque = true;
+}
+
+static void test_co_rwlock_upgrade(void)
+{
+    bool c1_done = false;
+    bool c2_done = false;
+    Coroutine *c1, *c2;
+
+    qemu_co_rwlock_init(&rwlock);
+    c1 = qemu_coroutine_create(rwlock_yield_upgrade, &c1_done);
+    c2 = qemu_coroutine_create(rwlock_wrlock_yield, &c2_done);
+
+    qemu_coroutine_enter(c1);
+    qemu_coroutine_enter(c2);
+
+    /* c1 now should go to sleep.  */
+    qemu_coroutine_enter(c1);
+    g_assert(!c1_done);
+
+    qemu_coroutine_enter(c2);
+    g_assert(c1_done);
+    g_assert(c2_done);
+}
+
 /*
  * Check that creation, enter, and return work
  */
@@ -501,6 +562,7 @@ int main(int argc, char **argv)
 g_test_add_func("/basic/order", test_order);
 g_test_add_func("/locking/co-mutex", test_co_mutex);
 g_test_add_func("/locking/co-mutex/lockable", test_co_mutex_lockable);
+g_test_add_func("/locking/co-rwlock/upgrade", test_co_rwlock_upgrade);
 if (g_test_perf()) {
 g_test_add_func("/perf/lifecycle", perf_lifecycle);
 g_test_add_func("/perf/nesting", perf_nesting);
-- 
2.30.2



[PULL for-6.0 4/6] coroutine-lock: Reimplement CoRwlock to fix downgrade bug

2021-03-31 Thread Stefan Hajnoczi
From: Paolo Bonzini 

An invariant of the current rwlock is that if multiple coroutines hold a
reader lock, all must be runnable. The unlock implementation relies on
this, choosing to wake a single coroutine when the final read lock
holder exits the critical section, assuming that it will wake a
coroutine attempting to acquire a write lock.

The downgrade implementation violates this assumption by creating a
read lock owning coroutine that is exclusively runnable - any other
coroutines that are waiting to acquire a read lock are *not* made
runnable when the write lock holder converts its ownership to read
only.

More generally, the old implementation had lots of other fairness bugs.
The root cause of the bugs was that CoQueue would wake up readers even
if there were pending writers, and would wake up writers even if there
were readers.  In that case, the coroutine would go back to sleep *at
the end* of the CoQueue, losing its place at the head of the line.

To fix this, keep the queue of waiters explicitly in the CoRwlock
instead of using CoQueue, and store for each whether it is a
potential reader or a writer.  This way, downgrade can look at the
first queued coroutine and wake it only if it is a reader, causing
all other readers in line to be released in turn.

Reported-by: David Edmondson 
Reviewed-by: David Edmondson 
Signed-off-by: Paolo Bonzini 
Message-id: 20210325112941.365238-5-pbonz...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 include/qemu/coroutine.h   |  17 ++--
 util/qemu-coroutine-lock.c | 164 +++--
 2 files changed, 114 insertions(+), 67 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 84eab6e3bf..ce5b9c6851 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -237,11 +237,15 @@ bool qemu_co_enter_next_impl(CoQueue *queue, QemuLockable *lock);
 bool qemu_co_queue_empty(CoQueue *queue);
 
 
+typedef struct CoRwTicket CoRwTicket;
 typedef struct CoRwlock {
-    int pending_writer;
-    int reader;
     CoMutex mutex;
-    CoQueue queue;
+
+    /* Number of readers, or -1 if owned for writing.  */
+    int owners;
+
+    /* Waiting coroutines.  */
+    QSIMPLEQ_HEAD(, CoRwTicket) tickets;
 } CoRwlock;
 
 /**
@@ -260,10 +264,9 @@ void qemu_co_rwlock_rdlock(CoRwlock *lock);
 /**
  * Write Locks the CoRwlock from a reader.  This is a bit more efficient than
  * @qemu_co_rwlock_unlock followed by a separate @qemu_co_rwlock_wrlock.
- * However, if the lock cannot be upgraded immediately, control is transferred
- * to the caller of the current coroutine.  Also, @qemu_co_rwlock_upgrade
- * only overrides CoRwlock fairness if there are no concurrent readers, so
- * another writer might run while @qemu_co_rwlock_upgrade blocks.
+ * Note that if the lock cannot be upgraded immediately, control is transferred
+ * to the caller of the current coroutine; another writer might run while
+ * @qemu_co_rwlock_upgrade blocks.
  */
 void qemu_co_rwlock_upgrade(CoRwlock *lock);
 
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index eb73cf11dc..2669403839 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -327,11 +327,51 @@ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex)
 trace_qemu_co_mutex_unlock_return(mutex, self);
 }
 
+struct CoRwTicket {
+    bool read;
+    Coroutine *co;
+    QSIMPLEQ_ENTRY(CoRwTicket) next;
+};
+
 void qemu_co_rwlock_init(CoRwlock *lock)
 {
-    memset(lock, 0, sizeof(*lock));
-    qemu_co_queue_init(&lock->queue);
     qemu_co_mutex_init(&lock->mutex);
+    lock->owners = 0;
+    QSIMPLEQ_INIT(&lock->tickets);
+}
+
+/* Releases the internal CoMutex.  */
+static void qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
+{
+    CoRwTicket *tkt = QSIMPLEQ_FIRST(&lock->tickets);
+    Coroutine *co = NULL;
+
+    /*
+     * Setting lock->owners here prevents rdlock and wrlock from
+     * sneaking in between unlock and wake.
+     */
+
+    if (tkt) {
+        if (tkt->read) {
+            if (lock->owners >= 0) {
+                lock->owners++;
+                co = tkt->co;
+            }
+        } else {
+            if (lock->owners == 0) {
+                lock->owners = -1;
+                co = tkt->co;
+            }
+        }
+    }
+
+    if (co) {
+        QSIMPLEQ_REMOVE_HEAD(&lock->tickets, next);
+        qemu_co_mutex_unlock(&lock->mutex);
+        aio_co_wake(co);
+    } else {
+        qemu_co_mutex_unlock(&lock->mutex);
+    }
 }
 
 void qemu_co_rwlock_rdlock(CoRwlock *lock)
@@ -340,84 +380,88 @@ void qemu_co_rwlock_rdlock(CoRwlock *lock)
 
     qemu_co_mutex_lock(&lock->mutex);
     /* For fairness, wait if a writer is in line.  */
-    while (lock->pending_writer) {
-        qemu_co_queue_wait(&lock->queue, &lock->mutex);
-    }
-    lock->reader++;
-    qemu_co_mutex_unlock(&lock->mutex);
-
-    /* The rest of the read-side critical section is run without the mutex.  */
-    self->locks_held++;
-}
-
-void qemu_co_rwlock_unlock(CoRwlock *lock)
-{
-    Coroutine *self = 
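
The diff is cut off here by the archive before the unlock/downgrade paths.
A minimal sketch of downgrade under the owners/tickets scheme described in
the commit message above (an illustration, not necessarily the verbatim
committed code):

    void qemu_co_rwlock_downgrade(CoRwlock *lock)
    {
        qemu_co_mutex_lock(&lock->mutex);
        assert(lock->owners == -1);
        lock->owners = 1;    /* the writer becomes the sole reader */

        /* Wake the first queued waiter only if it is a reader; each woken
         * reader then wakes the next one in line.  */
        qemu_co_rwlock_maybe_wake_one(lock);    /* releases lock->mutex */
    }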

[PULL for-6.0 1/6] block/vdi: When writing new bmap entry fails, don't leak the buffer

2021-03-31 Thread Stefan Hajnoczi
From: David Edmondson 

If a new bitmap entry is allocated, requiring the entire block to be
written, avoid leaking the buffer allocated for the block should the
write fail.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: David Edmondson 
Signed-off-by: Paolo Bonzini 
Acked-by: Max Reitz 
Message-id: 20210325112941.365238-2-pbonz...@redhat.com
Message-Id: <20210309144015.557477-2-david.edmond...@oracle.com>
Acked-by: Max Reitz 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 block/vdi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/vdi.c b/block/vdi.c
index 5627e7d764..2a6dc26124 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -690,6 +690,7 @@ nonallocating_write:
 
 logout("finished data write\n");
 if (ret < 0) {
+g_free(block);
 return ret;
 }
 
-- 
2.30.2



[PULL for-6.0 3/6] coroutine-lock: Store the coroutine in the CoWaitRecord only once

2021-03-31 Thread Stefan Hajnoczi
From: David Edmondson 

When taking the slow path for mutex acquisition, set the coroutine
value in the CoWaitRecord in push_waiter(), rather than both there and
in the caller.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: David Edmondson 
Signed-off-by: Paolo Bonzini 
Message-id: 20210325112941.365238-4-pbonz...@redhat.com
Message-Id: <20210309144015.557477-4-david.edmond...@oracle.com>
Signed-off-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 util/qemu-coroutine-lock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 5816bf8900..eb73cf11dc 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -204,7 +204,6 @@ static void coroutine_fn qemu_co_mutex_lock_slowpath(AioContext *ctx,
     unsigned old_handoff;
 
     trace_qemu_co_mutex_lock_entry(mutex, self);
-    w.co = self;
     push_waiter(mutex, &w);
 
 /* This is the "Responsibility Hand-Off" protocol; a lock() picks from
-- 
2.30.2



[PULL for-6.0 2/6] block/vdi: Don't assume that blocks are larger than VdiHeader

2021-03-31 Thread Stefan Hajnoczi
From: David Edmondson 

Given that the block size is read from the header of the VDI file, a
wide variety of sizes might be seen. Rather than re-using a block-sized
memory region when writing the VDI header, allocate an
appropriately sized buffer.

Signed-off-by: David Edmondson 
Signed-off-by: Paolo Bonzini 
Acked-by: Max Reitz 
Message-id: 20210325112941.365238-3-pbonz...@redhat.com
Message-Id: <20210309144015.557477-3-david.edmond...@oracle.com>
Acked-by: Max Reitz 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 block/vdi.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 2a6dc26124..548f8a057b 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -696,18 +696,20 @@ nonallocating_write:
 
     if (block) {
         /* One or more new blocks were allocated. */
-        VdiHeader *header = (VdiHeader *) block;
+        VdiHeader *header;
         uint8_t *base;
         uint64_t offset;
         uint32_t n_sectors;
 
+        g_free(block);
+        header = g_malloc(sizeof(*header));
+
         logout("now writing modified header\n");
         assert(VDI_IS_ALLOCATED(bmap_first));
         *header = s->header;
         vdi_header_to_le(header);
-        ret = bdrv_pwrite(bs->file, 0, block, sizeof(VdiHeader));
-        g_free(block);
-        block = NULL;
+        ret = bdrv_pwrite(bs->file, 0, header, sizeof(*header));
+        g_free(header);
 
         if (ret < 0) {
             return ret;
-- 
2.30.2



[PULL for-6.0 0/6] Block patches

2021-03-31 Thread Stefan Hajnoczi
The following changes since commit 6d40ce00c1166c317e298ad82ecf10e650c4f87d:

  Update version for v6.0.0-rc1 release (2021-03-30 18:19:07 +0100)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to b6489ac06695e257ea0a9841364577e247fdee30:

  test-coroutine: Add rwlock downgrade test (2021-03-31 10:44:21 +0100)


Pull request

A fix for VDI image files and more generally for CoRwlock.



David Edmondson (4):
  block/vdi: When writing new bmap entry fails, don't leak the buffer
  block/vdi: Don't assume that blocks are larger than VdiHeader
  coroutine-lock: Store the coroutine in the CoWaitRecord only once
  test-coroutine: Add rwlock downgrade test

Paolo Bonzini (2):
  coroutine-lock: Reimplement CoRwlock to fix downgrade bug
  test-coroutine: Add rwlock upgrade test

 include/qemu/coroutine.h|  17 ++--
 block/vdi.c |  11 ++-
 tests/unit/test-coroutine.c | 161 +++
 util/qemu-coroutine-lock.c  | 165 +++-
 4 files changed, 282 insertions(+), 72 deletions(-)

-- 
2.30.2



[PATCH v3] hw/block/nvme: add device self test command support

2021-03-31 Thread Gollu Appalanaidu
This is to add support for Device Self Test Command (DST) and
DST Log Page. Refer to the NVM Express specification 1.4b, section 5.8
("Device Self-test command").

Signed-off-by: Gollu Appalanaidu 
---
changes:
 -v3: removed unwanted patch file added

 -v2: addressed style fixes in hw/block/nvme.h

 hw/block/nvme.c   | 118 +-
 hw/block/nvme.h   |  13 +
 hw/block/trace-events |   1 +
 include/block/nvme.h  |  49 ++
 4 files changed, 180 insertions(+), 1 deletion(-)
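
For reference, one way to exercise the new command from a Linux guest is
nvme-cli (assuming its device-self-test and self-test-log subcommands;
exact flag spellings vary between nvme-cli versions):

    # start a short device self-test on namespace 1
    nvme device-self-test /dev/nvme0 -n 1 -s 1

    # read the Device Self-test log page back
    nvme self-test-log /dev/nvme0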

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6842b01ab5..3c2186b170 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -214,6 +214,7 @@ static const uint32_t nvme_cse_acs[256] = {
 [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP,
 [NVME_ADM_CMD_NS_ATTACHMENT]= NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_NIC,
 [NVME_ADM_CMD_FORMAT_NVM]   = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_ADM_CMD_DST]  = NVME_CMD_EFF_CSUPP,
 };
 
 static const uint32_t nvme_cse_iocs_none[256];
@@ -3980,6 +3981,34 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
     return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
 }
 
+static uint16_t nvme_dst_info(NvmeCtrl *n,  uint32_t buf_len, uint64_t off,
+                              NvmeRequest *req)
+{
+    NvmeDstLogPage dst_log = {};
+    NvmeDst *dst;
+    NvmeDstEntry *traverser;
+    uint32_t trans_len;
+    uint8_t entry_index = 0;
+    dst = &n->dst;
+
+    if (off >= sizeof(dst_log)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    dst_log.current_dsto = dst->current_dsto;
+    dst_log.current_dstc = dst->current_dstc;
+
+    QTAILQ_FOREACH(traverser, &dst->dst_list, entry) {
+        memcpy(&dst_log.dst_result[entry_index],
+               &traverser->dst_entry, sizeof(NvmeSelfTestResult));
+        entry_index++;
+    }
+
+    trans_len = MIN(sizeof(dst_log) - off, buf_len);
+
+    return nvme_c2h(n, ((uint8_t *)&dst_log) + off, trans_len, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -4027,6 +4056,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_changed_nslist(n, rae, len, off, req);
     case NVME_LOG_CMD_EFFECTS:
         return nvme_cmd_effects(n, csi, len, off, req);
+    case NVME_LOG_DEV_SELF_TEST:
+        return nvme_dst_info(n, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -5069,6 +5100,73 @@ static uint16_t nvme_format(NvmeCtrl *n, NvmeRequest *req)
     return req->status;
 }
 
+static void nvme_dst_create_entry(NvmeCtrl *n, uint32_t nsid,
+                                  uint8_t stc)
+{
+    NvmeDstEntry *cur_entry;
+    time_t current_ms;
+
+    cur_entry = QTAILQ_LAST(&n->dst.dst_list);
+    QTAILQ_REMOVE(&n->dst.dst_list, cur_entry, entry);
+    memset(cur_entry, 0x0, sizeof(NvmeDstEntry));
+
+    cur_entry->dst_entry.dst_status = stc << 4;
+
+    if ((n->temperature >= n->features.temp_thresh_hi) ||
+        (n->temperature <= n->features.temp_thresh_low)) {
+        cur_entry->dst_entry.dst_status |= NVME_DST_WITH_FAILED_SEG;
+        cur_entry->dst_entry.segment_number = NVME_SMART_CHECK;
+    }
+
+    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    cur_entry->dst_entry.poh = cpu_to_le64((((current_ms -
+        n->starttime_ms) / 1000) / 60) / 60);
+    cur_entry->dst_entry.nsid = nsid;
+
+    QTAILQ_INSERT_HEAD(&n->dst.dst_list, cur_entry, entry);
+}
+
+static uint16_t nvme_dst_processing(NvmeCtrl *n, uint32_t nsid,
+                                    uint8_t stc)
+{
+    /*
+     * n->dst.current_dsto will always be 0x0, i.e. NO DST OPERATION,
+     * since no background device self test operation takes place.
+     */
+    assert(n->dst.current_dsto == NVME_DST_NO_OPERATION);
+
+    if (stc == NVME_ABORT_DSTO) {
+        goto out;
+    }
+    if (stc == NVME_SHORT_DSTO || stc == NVME_EXTENDED_DSTO) {
+        nvme_dst_create_entry(n, nsid, stc);
+    }
+
+out:
+    n->dst.current_dstc = NVME_DST_OPERATION_COMPLETED;
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_dst(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(req->cmd.cdw10);
+    uint32_t nsid = le32_to_cpu(req->cmd.nsid);
+    uint8_t stc = dw10 & 0xf;
+
+    trace_pci_nvme_dst(nvme_cid(req), nsid, stc);
+
+    if (!nvme_nsid_valid(n, nsid) && nsid != 0) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    if (nsid != NVME_NSID_BROADCAST && nsid != 0 &&
+        !nvme_ns(n, nsid)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return nvme_dst_processing(n, nsid, stc);
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode,
@@ -5109,6 +5207,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
         return nvme_ns_attachment(n, req);
 case 

Re: [PATCH v2] hw/block/nvme: add device self test command support

2021-03-31 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20210331083306.12461-1-anaidu.go...@samsung.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210331083306.12461-1-anaidu.go...@samsung.com
Subject: [PATCH v2] hw/block/nvme: add device self test command support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20210331083306.12461-1-anaidu.go...@samsung.com -> 
patchew/20210331083306.12461-1-anaidu.go...@samsung.com
Switched to a new branch 'test'
018f869 hw/block/nvme: add device self test command support

=== OUTPUT BEGIN ===
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#337: 
new file mode 100644

ERROR: trailing whitespace
#369: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:28:
+ $

ERROR: trailing whitespace
#374: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:33:
+ $

ERROR: trailing whitespace
#418: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:77:
+ $

ERROR: trailing whitespace
#512: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:171:
+ $

ERROR: trailing whitespace
#515: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:174:
+ $

ERROR: trailing whitespace
#522: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:181:
+ $

ERROR: trailing whitespace
#535: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:194:
+ $

ERROR: trailing whitespace
#544: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:203:
+ $

ERROR: trailing whitespace
#566: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:225:
+ $

ERROR: trailing whitespace
#602: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:261:
+ $

ERROR: trailing whitespace
#657: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:316:
+ $

ERROR: trailing whitespace
#665: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:324:
+ $

ERROR: trailing whitespace
#674: FILE: 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch:333:
+-- $

total: 13 errors, 1 warnings, 617 lines checked

Commit 018f869ec2d4 (hw/block/nvme: add device self test command support) has 
style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210331083306.12461-1-anaidu.go...@samsung.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH v2] hw/block/nvme: add device self test command support

2021-03-31 Thread Gollu Appalanaidu
This is to add support for Device Self Test Command (DST) and
DST Log Page. Refer to the NVM Express specification 1.4b, section 5.8
("Device Self-test command").

Signed-off-by: Gollu Appalanaidu 
---
changes:

-v2: addressed style fixes in hw/block/nvme.h

 hw/block/nvme.c   | 118 +-
 hw/block/nvme.h   |  13 +
 hw/block/trace-events |   1 +
 include/block/nvme.h  |  49 +++
 ...add-device-self-test-command-support.patch | 335 ++
 5 files changed, 515 insertions(+), 1 deletion(-)
 create mode 100644 
outgoing/0001-hw-block-nvme-add-device-self-test-command-support.patch

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6842b01ab5..3c2186b170 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -214,6 +214,7 @@ static const uint32_t nvme_cse_acs[256] = {
 [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP,
 [NVME_ADM_CMD_NS_ATTACHMENT]= NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_NIC,
 [NVME_ADM_CMD_FORMAT_NVM]   = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_ADM_CMD_DST]  = NVME_CMD_EFF_CSUPP,
 };
 
 static const uint32_t nvme_cse_iocs_none[256];
@@ -3980,6 +3981,34 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
     return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
 }
 
+static uint16_t nvme_dst_info(NvmeCtrl *n,  uint32_t buf_len, uint64_t off,
+                              NvmeRequest *req)
+{
+    NvmeDstLogPage dst_log = {};
+    NvmeDst *dst;
+    NvmeDstEntry *traverser;
+    uint32_t trans_len;
+    uint8_t entry_index = 0;
+    dst = &n->dst;
+
+    if (off >= sizeof(dst_log)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    dst_log.current_dsto = dst->current_dsto;
+    dst_log.current_dstc = dst->current_dstc;
+
+    QTAILQ_FOREACH(traverser, &dst->dst_list, entry) {
+        memcpy(&dst_log.dst_result[entry_index],
+               &traverser->dst_entry, sizeof(NvmeSelfTestResult));
+        entry_index++;
+    }
+
+    trans_len = MIN(sizeof(dst_log) - off, buf_len);
+
+    return nvme_c2h(n, ((uint8_t *)&dst_log) + off, trans_len, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -4027,6 +4056,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_changed_nslist(n, rae, len, off, req);
     case NVME_LOG_CMD_EFFECTS:
         return nvme_cmd_effects(n, csi, len, off, req);
+    case NVME_LOG_DEV_SELF_TEST:
+        return nvme_dst_info(n, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -5069,6 +5100,73 @@ static uint16_t nvme_format(NvmeCtrl *n, NvmeRequest *req)
     return req->status;
 }
 
+static void nvme_dst_create_entry(NvmeCtrl *n, uint32_t nsid,
+                                  uint8_t stc)
+{
+    NvmeDstEntry *cur_entry;
+    time_t current_ms;
+
+    cur_entry = QTAILQ_LAST(&n->dst.dst_list);
+    QTAILQ_REMOVE(&n->dst.dst_list, cur_entry, entry);
+    memset(cur_entry, 0x0, sizeof(NvmeDstEntry));
+
+    cur_entry->dst_entry.dst_status = stc << 4;
+
+    if ((n->temperature >= n->features.temp_thresh_hi) ||
+        (n->temperature <= n->features.temp_thresh_low)) {
+        cur_entry->dst_entry.dst_status |= NVME_DST_WITH_FAILED_SEG;
+        cur_entry->dst_entry.segment_number = NVME_SMART_CHECK;
+    }
+
+    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    cur_entry->dst_entry.poh = cpu_to_le64((((current_ms -
+        n->starttime_ms) / 1000) / 60) / 60);
+    cur_entry->dst_entry.nsid = nsid;
+
+    QTAILQ_INSERT_HEAD(&n->dst.dst_list, cur_entry, entry);
+}
+
+static uint16_t nvme_dst_processing(NvmeCtrl *n, uint32_t nsid,
+                                    uint8_t stc)
+{
+    /*
+     * n->dst.current_dsto will always be 0x0, i.e. NO DST OPERATION,
+     * since no background device self test operation takes place.
+     */
+    assert(n->dst.current_dsto == NVME_DST_NO_OPERATION);
+
+    if (stc == NVME_ABORT_DSTO) {
+        goto out;
+    }
+    if (stc == NVME_SHORT_DSTO || stc == NVME_EXTENDED_DSTO) {
+        nvme_dst_create_entry(n, nsid, stc);
+    }
+
+out:
+    n->dst.current_dstc = NVME_DST_OPERATION_COMPLETED;
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_dst(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(req->cmd.cdw10);
+    uint32_t nsid = le32_to_cpu(req->cmd.nsid);
+    uint8_t stc = dw10 & 0xf;
+
+    trace_pci_nvme_dst(nvme_cid(req), nsid, stc);
+
+    if (!nvme_nsid_valid(n, nsid) && nsid != 0) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    if (nsid != NVME_NSID_BROADCAST && nsid != 0 &&
+        !nvme_ns(n, nsid)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return nvme_dst_processing(n, nsid, stc);
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
 trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), 

Re: [PATCH] hw/block/nvme: add device self test command support

2021-03-31 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20210331073233.11198-1-anaidu.go...@samsung.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210331073233.11198-1-anaidu.go...@samsung.com
Subject: [PATCH] hw/block/nvme: add device self test command support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20210331073233.11198-1-anaidu.go...@samsung.com -> 
patchew/20210331073233.11198-1-anaidu.go...@samsung.com
Switched to a new branch 'test'
d2f401d hw/block/nvme: add device self test command support

=== OUTPUT BEGIN ===
ERROR: named QTAILQ_HEAD should be typedefed separately
#210: FILE: hw/block/nvme.h:165:
+QTAILQ_HEAD(dst_list, NvmeDstEntry)  dst_list;

total: 1 errors, 0 warnings, 282 lines checked

Commit d2f401dadafe (hw/block/nvme: add device self test command support) has 
style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210331073233.11198-1-anaidu.go...@samsung.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH] hw/block/nvme: add device self test command support

2021-03-31 Thread Gollu Appalanaidu
This is to add support for Device Self Test Command (DST) and
DST Log Page. Refer to the NVM Express specification 1.4b, section 5.8
("Device Self-test command").

Signed-off-by: Gollu Appalanaidu 
---
 hw/block/nvme.c   | 118 +-
 hw/block/nvme.h   |  13 +
 hw/block/trace-events |   1 +
 include/block/nvme.h  |  49 ++
 4 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6842b01ab5..3c2186b170 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -214,6 +214,7 @@ static const uint32_t nvme_cse_acs[256] = {
 [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP,
 [NVME_ADM_CMD_NS_ATTACHMENT]= NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_NIC,
 [NVME_ADM_CMD_FORMAT_NVM]   = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_ADM_CMD_DST]  = NVME_CMD_EFF_CSUPP,
 };
 
 static const uint32_t nvme_cse_iocs_none[256];
@@ -3980,6 +3981,34 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
     return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
 }
 
+static uint16_t nvme_dst_info(NvmeCtrl *n,  uint32_t buf_len, uint64_t off,
+                              NvmeRequest *req)
+{
+    NvmeDstLogPage dst_log = {};
+    NvmeDst *dst;
+    NvmeDstEntry *traverser;
+    uint32_t trans_len;
+    uint8_t entry_index = 0;
+    dst = &n->dst;
+
+    if (off >= sizeof(dst_log)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    dst_log.current_dsto = dst->current_dsto;
+    dst_log.current_dstc = dst->current_dstc;
+
+    QTAILQ_FOREACH(traverser, &dst->dst_list, entry) {
+        memcpy(&dst_log.dst_result[entry_index],
+               &traverser->dst_entry, sizeof(NvmeSelfTestResult));
+        entry_index++;
+    }
+
+    trans_len = MIN(sizeof(dst_log) - off, buf_len);
+
+    return nvme_c2h(n, ((uint8_t *)&dst_log) + off, trans_len, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -4027,6 +4056,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_changed_nslist(n, rae, len, off, req);
     case NVME_LOG_CMD_EFFECTS:
         return nvme_cmd_effects(n, csi, len, off, req);
+    case NVME_LOG_DEV_SELF_TEST:
+        return nvme_dst_info(n, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -5069,6 +5100,73 @@ static uint16_t nvme_format(NvmeCtrl *n, NvmeRequest *req)
     return req->status;
 }
 
+static void nvme_dst_create_entry(NvmeCtrl *n, uint32_t nsid,
+                                  uint8_t stc)
+{
+    NvmeDstEntry *cur_entry;
+    time_t current_ms;
+
+    cur_entry = QTAILQ_LAST(&n->dst.dst_list);
+    QTAILQ_REMOVE(&n->dst.dst_list, cur_entry, entry);
+    memset(cur_entry, 0x0, sizeof(NvmeDstEntry));
+
+    cur_entry->dst_entry.dst_status = stc << 4;
+
+    if ((n->temperature >= n->features.temp_thresh_hi) ||
+        (n->temperature <= n->features.temp_thresh_low)) {
+        cur_entry->dst_entry.dst_status |= NVME_DST_WITH_FAILED_SEG;
+        cur_entry->dst_entry.segment_number = NVME_SMART_CHECK;
+    }
+
+    current_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    cur_entry->dst_entry.poh = cpu_to_le64((((current_ms -
+        n->starttime_ms) / 1000) / 60) / 60);
+    cur_entry->dst_entry.nsid = nsid;
+
+    QTAILQ_INSERT_HEAD(&n->dst.dst_list, cur_entry, entry);
+}
+
+static uint16_t nvme_dst_processing(NvmeCtrl *n, uint32_t nsid,
+                                    uint8_t stc)
+{
+    /*
+     * n->dst.current_dsto will always be 0x0, i.e. NO DST OPERATION,
+     * since no background device self test operation takes place.
+     */
+    assert(n->dst.current_dsto == NVME_DST_NO_OPERATION);
+
+    if (stc == NVME_ABORT_DSTO) {
+        goto out;
+    }
+    if (stc == NVME_SHORT_DSTO || stc == NVME_EXTENDED_DSTO) {
+        nvme_dst_create_entry(n, nsid, stc);
+    }
+
+out:
+    n->dst.current_dstc = NVME_DST_OPERATION_COMPLETED;
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_dst(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(req->cmd.cdw10);
+    uint32_t nsid = le32_to_cpu(req->cmd.nsid);
+    uint8_t stc = dw10 & 0xf;
+
+    trace_pci_nvme_dst(nvme_cid(req), nsid, stc);
+
+    if (!nvme_nsid_valid(n, nsid) && nsid != 0) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    if (nsid != NVME_NSID_BROADCAST && nsid != 0 &&
+        !nvme_ns(n, nsid)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return nvme_dst_processing(n, nsid, stc);
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode,
@@ -5109,6 +5207,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
         return nvme_ns_attachment(n, req);
     case NVME_ADM_CMD_FORMAT_NVM:
         return nvme_format(n, req);
+    case NVME_ADM_CMD_DST:
+        return nvme_dst(n, 

Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

2021-03-31 Thread Vladimir Sementsov-Ogievskiy

30.03.2021 19:39, Max Reitz wrote:

==

OK, I think now that you didn't mean qcow2_alloc_clusters(). So we are talking
only about functions returning an offset to a cluster with "guest data", not to
any kind of host cluster. Then what you propose looks like this to me:

  - take my v5
  - rename qcow2_inflight_writes_dec() to put_cluster_offset()
  - call qcow2_inflight_writes_inc() from the three functions you mention


Yes, I think so.  Or you take the CoRwLock in those three functions, depending 
on which implementation we want.


CoRwLock brings one inconvenient feature: to pass ownership of the reference to
another coroutine (for example to the task coroutine in qcow2_co_pwritev) we
have to unlock the lock in the original coroutine and lock it in the other, as
CoRwLock doesn't support transferring ownership between coroutines. So we'll do
put_cluster_offset() in qcow2_co_pwritev() and get_cluster_offset() in
qcow2_co_pwritev_task() (and we must be sure that the second lock is taken
before the first one is released). So in this case it is probably better to
keep the direct CoRwLock interface, not shadow it with get/put.
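
A hypothetical sketch of that hand-off (s->cluster_lock and the exact call
sites are assumptions for illustration; only the CoRwLock calls are real API).
The second lock must be taken before the first one is released, so no discard
can slip in between:

    /* in qcow2_co_pwritev_task(), i.e. the task's "get_cluster_offset": */
    qemu_co_rwlock_rdlock(&s->cluster_lock);

    /* back in qcow2_co_pwritev(), only once the task holds its own read
     * lock, i.e. "put_cluster_offset": */
    qemu_co_rwlock_unlock(&s->cluster_lock);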

--
Best regards,
Vladimir