date:20190109

With the experimental x-nbd-server-add-bitmap command, there was
a window of time where an NBD client could see the export but not
the associated dirty bitmap, which can cause a client that planned
on using the dirty bitmap to be forced to treat the entire image
as dirty as a safety fallback.  Furthermore, if the QMP client
successfully exports a disk but then fails to add the bitmap, it
has to take on the burden of removing the export.  Since we don't
allow changing the exposed dirty bitmap (whether to a different
bitmap, or removing advertisement of the bitmap), it is nicer to
make the bitmap tied to the export at the time the export is
created, with automatic failure to export if the bitmap is not
available.

The experimental command included an optional 'bitmap-export-name'
field for remapping the name exposed over NBD to be different from
the bitmap name stored on disk.  However, my libvirt demo code
for implementing differential backups on top of persistent bitmaps
did not need to take advantage of that feature (it is instead
possible to create a new temporary bitmap with the desired name,
use block-dirty-bitmap-merge to merge one or more persistent
bitmaps into the temporary, then associate the temporary with the
NBD export, if control is needed over the exported bitmap name).
Hence, I'm not copying that part of the experiment over to the
stable addition. For more details on the libvirt demo, see
https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html,
https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat

This patch focuses on the user interface, and reduces (but does
not completely eliminate) the window where an NBD client can see
the export but not the dirty bitmap.  Later patches will add
further cleanups now that this interface is declared stable via a
single QMP command, including removing the race window.

Update test 223 to use the new interface, including a demonstration
that it is now easier to handle failures.

Signed-off-by: Eric Blake 
---
 qapi/block.json|  7 ++-
 blockdev-nbd.c | 12 +++-
 hmp.c  |  5 +++--
 tests/qemu-iotests/223 | 17 ++---
 tests/qemu-iotests/223.out |  5 +
 5 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index 11f01f28efe..3d70420f763 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -246,6 +246,10 @@
 #
 # @writable: Whether clients should be able to write to the device via the
 # NBD connection (default false).
+
+# @bitmap: Also export the dirty bitmap reachable from @device, so the
+#  NBD client can use NBD_OPT_SET_META_CONTEXT with
+#  "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
 #
 # Returns: error if the server is not running, or export with the same name
 #  already exists.
@@ -253,7 +257,8 @@
 # Since: 1.3.0
 ##
 { 'command': 'nbd-server-add',
-  'data': {'device': 'str', '*name': 'str', '*writable': 'bool'} }
+  'data': {'device': 'str', '*name': 'str', '*writable': 'bool',
+   '*bitmap': 'str' } }

 ##
 # @NbdServerRemoveMode:
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index f5edbc27d88..ac7e993c35f 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -140,7 +140,8 @@ void qmp_nbd_server_start(SocketAddressLegacy *addr,
 }

 void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
-bool has_writable, bool writable, Error **errp)
+bool has_writable, bool writable,
+bool has_bitmap, const char *bitmap, Error **errp)
 {
 BlockDriverState *bs = NULL;
 BlockBackend *on_eject_blk;
@@ -185,6 +186,15 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
 nbd_export_put(exp);
+
+if (has_bitmap) {
+Error *err = NULL;
+nbd_export_bitmap(exp, bitmap, bitmap, &err);
+if (err) {
+error_propagate(errp, err);
+nbd_export_remove(exp, NBD_SERVER_REMOVE_MODE_HARD, NULL);
+}
+}
 }

 void qmp_nbd_server_remove(const char *name,
diff --git a/hmp.c b/hmp.c
index 80aa5ab504b..8da5fd8760a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2326,7 +2326,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict 
*qdict)
 }

 qmp_nbd_server_add(info->value->device, false, NULL,
-   true, writable, &local_err);
+   true, writable, false, NULL, &local_err);

 if (local_err != NULL) {
 qmp_nbd_server_stop(NULL);
@@ -2347,7 +2347,8 @@ void hmp_nbd_server_add(Monitor *mon, const QDict *qdict)
 bool writable = qdict_get_try_bool(qdict, "writable", false);
 Error *local_err = NULL;

-qmp_nbd_server_add(device, !!name, name, true, writable, &local_err);
+qmp

[Qemu-devel] [PATCH v2 1/6] nbd: Only require disabled bitmap for read-only exports

Our initial implementation of x-nbd-server-add-bitmap put
in a restriction because of incremental backups: in that
usage, we are exporting one qcow2 file (the temporary overlay
target of a blockdev-backup sync:none job) and a dirty bitmap
owned by a second qcow2 file (the source of the
blockdev-backup, which is the backing file of the temporary).
While both qcow2 files are still writable (the target capture
copy-on-write of old contents, the source to track live guest
writes in the meantime), the NBD client expects to see
constant data, including the dirty bitmap.  An enabled bitmap
in the source would be modified by guest writes, which is at
odds with the NBD export being a read-only constant view,
hence the initial code choice of enforcing a disabled bitmap
(the intent is that the exposed bitmap was disabled in the
same transaction that started the blockdev-backup job,
although we don't want to actually track enough state to
actually enforce that).

However, in the case of image migration, where we WANT a
writable export, it makes total sense that the NBD client
could expect writes to change the status visible through
the exposed dirty bitmap, and thus permitting an enabled
bitmap for an export that is not read-only makes sense;
although the current restriction prevents that usage.

Alternatively, consider the case when the active layer does
not contain the bitmap but the backing layer does.  If the
backing layer is read-only (which is different from the
backing layer of a blockdev-backup job being writable),
we know the backing layer cannot be modified, and thus the
bitmap will not change contents even if it is still marked
enabled (for that matter, since the backing file is
read-only, we couldn't change the bitmap to be disabled
in the first place).  Again, the current restriction
prevents this usage.

Solve both issues by gating the restriction against a
disabled bitmap to only happen when the caller has
requested a read-only export, and where the BDS that owns
the bitmap (which might be a backing file of the BDS
handed to nbd_export_new) is still writable.

Update iotest 223 to add some tests of the error paths.

Signed-off-by: Eric Blake 

---
v2: new patch
---
 nbd/server.c   |  7 +--
 tests/qemu-iotests/223 | 14 --
 tests/qemu-iotests/223.out |  4 
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 7af0ddffb20..98327088cb4 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2456,8 +2456,11 @@ void nbd_export_bitmap(NBDExport *exp, const char 
*bitmap,
 return;
 }

-if (bdrv_dirty_bitmap_enabled(bm)) {
-error_setg(errp, "Bitmap '%s' is enabled", bitmap);
+if ((exp->nbdflags & NBD_FLAG_READ_ONLY) && bdrv_is_writable(bs) &&
+bdrv_dirty_bitmap_enabled(bm)) {
+error_setg(errp,
+   "Enabled bitmap '%s' incompatible with readonly export",
+   bitmap);
 return;
 }

diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index 5513dc62159..f1fbb9bc1c6 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -61,6 +61,8 @@ echo "=== Create partially sparse image, then add dirty 
bitmaps ==="
 echo

 # Two bitmaps, to contrast granularity issues
+# Also note that b will be disabled, while b2 is left enabled, to
+# check for read-only interactions
 _make_test_img -o cluster_size=4k 4M
 $QEMU_IO -c 'w -P 0x11 1M 2M' "$TEST_IMG" | _filter_qemu_io
 run_qemu <

[Qemu-devel] [PATCH v2 2/6] nbd: Merge nbd_export_set_name into nbd_export_new

The existing NBD code had a weird split where nbd_export_new()
created an export but did not add it to the list of exported
names until a later nbd_export_set_name() came along and grabbed
a second reference on the object; later, nbd_export_close()
drops the second reference.  But since we never change the
name of an NBD export while it is exposed, it is easier to just
inline the process of setting the name as part of creating the
export.

Inline the contents of nbd_export_set_name() and
nbd_export_set_description() into the two points in an export
lifecycle where they matter, then adjust both callers to pass
the name up front.  Note that all callers pass a non-NULL name,
(passing NULL at creation was for old style servers, but we
removed support for that in commit 7f7dfe2a).

Signed-off-by: Eric Blake 
---
 include/block/nbd.h |  3 +--
 blockdev-nbd.c  |  5 ++---
 nbd/server.c| 45 -
 qemu-nbd.c  |  5 ++---
 4 files changed, 21 insertions(+), 37 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 65402d33964..2f9a2aeb73c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -295,6 +295,7 @@ typedef struct NBDExport NBDExport;
 typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
+  const char *name, const char *description,
   uint16_t nbdflags, void (*close)(NBDExport *),
   bool writethrough, BlockBackend *on_eject_blk,
   Error **errp);
@@ -306,8 +307,6 @@ void nbd_export_put(NBDExport *exp);
 BlockBackend *nbd_export_get_blockdev(NBDExport *exp);

 NBDExport *nbd_export_find(const char *name);
-void nbd_export_set_name(NBDExport *exp, const char *name);
-void nbd_export_set_description(NBDExport *exp, const char *description);
 void nbd_export_close_all(void);

 void nbd_client_new(QIOChannelSocket *sioc,
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1d170c80b82..f5edbc27d88 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -174,14 +174,13 @@ void qmp_nbd_server_add(const char *device, bool 
has_name, const char *name,
 writable = false;
 }

-exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
+exp = nbd_export_new(bs, 0, -1, name, NULL,
+ writable ? 0 : NBD_FLAG_READ_ONLY,
  NULL, false, on_eject_blk, errp);
 if (!exp) {
 return;
 }

-nbd_export_set_name(exp, name);
-
 /* The list of named exports has a strong reference to this export now and
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
diff --git a/nbd/server.c b/nbd/server.c
index 98327088cb4..676fb4886d0 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1456,6 +1456,7 @@ static void nbd_eject_notifier(Notifier *n, void *data)
 }

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
+  const char *name, const char *description,
   uint16_t nbdflags, void (*close)(NBDExport *),
   bool writethrough, BlockBackend *on_eject_blk,
   Error **errp)
@@ -1471,6 +1472,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
  * that BDRV_O_INACTIVE is cleared and the image is ready for write
  * access since the export could be available before migration handover.
  */
+assert(name);
 ctx = bdrv_get_aio_context(bs);
 aio_context_acquire(ctx);
 bdrv_invalidate_cache(bs, NULL);
@@ -1494,6 +1496,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
 QTAILQ_INIT(&exp->clients);
 exp->blk = blk;
 exp->dev_offset = dev_offset;
+exp->name = g_strdup(name);
+exp->description = g_strdup(description);
 exp->nbdflags = nbdflags;
 exp->size = size < 0 ? blk_getlength(blk) : size;
 if (exp->size < 0) {
@@ -1513,10 +1517,14 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
 exp->eject_notifier.notify = nbd_eject_notifier;
 blk_add_remove_bs_notifier(on_eject_blk, &exp->eject_notifier);
 }
+QTAILQ_INSERT_TAIL(&exports, exp, next);
+nbd_export_get(exp);
 return exp;

 fail:
 blk_unref(blk);
+g_free(exp->name);
+g_free(exp->description);
 g_free(exp);
 return NULL;
 }
@@ -1533,33 +1541,6 @@ NBDExport *nbd_export_find(const char *name)
 return NULL;
 }

-void nbd_export_set_name(NBDExport *exp, const char *name)
-{
-if (exp->name == name) {
-return;
-}
-
-nbd_export_get(exp);
-if (exp->name != NULL) {
-g_free(exp->name);
-exp->name = NULL;
-QTAILQ_REMOVE(&exports, exp, next);
-nbd_export_put(exp);
-}
-if (name != NULL) {
-nbd_export_get(exp);

[Qemu-devel] [PATCH v2 4/6] nbd: Remove x-nbd-server-add-bitmap

Now that nbd-server-add can do the same functionality, we no
longer need the experimental separate command.

Signed-off-by: Eric Blake 
---
 qapi/block.json | 23 ---
 blockdev-nbd.c  | 23 ---
 2 files changed, 46 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index 3d70420f763..5a79d639e8c 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -301,29 +301,6 @@
 { 'command': 'nbd-server-remove',
   'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} }

-##
-# @x-nbd-server-add-bitmap:
-#
-# Expose a dirty bitmap associated with the selected export. The bitmap search
-# starts at the device attached to the export, and includes all backing files.
-# The exported bitmap is then locked until the NBD export is removed.
-#
-# @name: Export name.
-#
-# @bitmap: Bitmap name to search for.
-#
-# @bitmap-export-name: How the bitmap will be seen by nbd clients
-#  (default @bitmap)
-#
-# Note: the client must use NBD_OPT_SET_META_CONTEXT with a query of
-# "qemu:dirty-bitmap:NAME" (where NAME matches @bitmap-export-name) to access
-# the exposed bitmap.
-#
-# Since: 3.0
-##
-  { 'command': 'x-nbd-server-add-bitmap',
-'data': {'name': 'str', 'bitmap': 'str', '*bitmap-export-name': 'str'} }
-
 ##
 # @nbd-server-stop:
 #
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index ac7e993c35f..003ba7d7180 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -228,26 +228,3 @@ void qmp_nbd_server_stop(Error **errp)
 nbd_server_free(nbd_server);
 nbd_server = NULL;
 }
-
-void qmp_x_nbd_server_add_bitmap(const char *name, const char *bitmap,
- bool has_bitmap_export_name,
- const char *bitmap_export_name,
- Error **errp)
-{
-NBDExport *exp;
-
-if (!nbd_server) {
-error_setg(errp, "NBD server not running");
-return;
-}
-
-exp = nbd_export_find(name);
-if (exp == NULL) {
-error_setg(errp, "Export '%s' is not found", name);
-return;
-}
-
-nbd_export_bitmap(exp, bitmap,
-  has_bitmap_export_name ? bitmap_export_name : bitmap,
-  errp);
-}
-- 
2.20.1

Re: [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export

On 1/9/19 1:21 PM, Eric Blake wrote:
> Revisiting an older thread:
> 
> On 6/9/18 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Handle new NBD meta namespace: "qemu", and corresponding queries:
>> "qemu:dirty-bitmap:".
>>
>> With new metadata context negotiated, BLOCK_STATUS query will reply
>> with dirty-bitmap data, converted to extents. New public function
>> nbd_export_bitmap selects bitmap to export. For now, only one bitmap
>> may be exported.
>>

>> +if (bdrv_dirty_bitmap_enabled(bm)) {
>> +error_setg(errp, "Bitmap '%s' is enabled", bitmap);
>> +return;
>> +}
> 
> Why are we restricting things to only export disabled bitmaps?
> 
> I can understand the argument that if the image being exported is
> read-only, then an enabled bitmap _that can be changed_ is probably a
> bad idea (it goes against the notion of the export being read only).
> But if we were to allow a writable access to an image, wouldn't we
> expect that writes be reflected into the bitmap, which means permitting
> an enabled bitmap?

I've now addressed this in my v2 Promote x-nbd-server-add-bitmap to
stable series.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 5/6] nbd: Merge nbd_export_bitmap into nbd_export_new

We only have one caller that wants to export a bitmap name,
which it does right after creation of the export. But there is
still a brief window of time where an NBD client could see the
export but not the dirty bitmap, which a robust client would
have to interpret as meaning the entire image should be treated
as dirty.  Better is to eliminate the window entirely, by
inlining nbd_export_bitmap() into nbd_export_new(), and refusing
to create the bitmap in the first place if the requested bitmap
can't be located.

We also no longer need logic for setting a different bitmap
name compared to the bitmap being exported.

Signed-off-by: Eric Blake 
---
 include/block/nbd.h |  9 ++---
 blockdev-nbd.c  | 11 +-
 nbd/server.c| 87 +
 qemu-nbd.c  |  4 +--
 4 files changed, 46 insertions(+), 65 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2f9a2aeb73c..1971b557896 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -296,9 +296,9 @@ typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
   const char *name, const char *description,
-  uint16_t nbdflags, void (*close)(NBDExport *),
-  bool writethrough, BlockBackend *on_eject_blk,
-  Error **errp);
+  const char *bitmap, uint16_t nbdflags,
+  void (*close)(NBDExport *), bool writethrough,
+  BlockBackend *on_eject_blk, Error **errp);
 void nbd_export_close(NBDExport *exp);
 void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error **errp);
 void nbd_export_get(NBDExport *exp);
@@ -319,9 +319,6 @@ void nbd_client_put(NBDClient *client);
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
   Error **errp);

-void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
-   const char *bitmap_export_name, Error **errp);
-
 /* nbd_read
  * Reads @size bytes from @ioc. Returns 0 on success.
  */
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 003ba7d7180..0df6307be2d 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -175,7 +175,7 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 writable = false;
 }

-exp = nbd_export_new(bs, 0, -1, name, NULL,
+exp = nbd_export_new(bs, 0, -1, name, NULL, bitmap,
  writable ? 0 : NBD_FLAG_READ_ONLY,
  NULL, false, on_eject_blk, errp);
 if (!exp) {
@@ -186,15 +186,6 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
  * the strong reference that is @exp. */
 nbd_export_put(exp);
-
-if (has_bitmap) {
-Error *err = NULL;
-nbd_export_bitmap(exp, bitmap, bitmap, &err);
-if (err) {
-error_propagate(errp, err);
-nbd_export_remove(exp, NBD_SERVER_REMOVE_MODE_HARD, NULL);
-}
-}
 }

 void qmp_nbd_server_remove(const char *name,
diff --git a/nbd/server.c b/nbd/server.c
index 676fb4886d0..e613594fde0 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1457,9 +1457,9 @@ static void nbd_eject_notifier(Notifier *n, void *data)

 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size,
   const char *name, const char *description,
-  uint16_t nbdflags, void (*close)(NBDExport *),
-  bool writethrough, BlockBackend *on_eject_blk,
-  Error **errp)
+  const char *bitmap, uint16_t nbdflags,
+  void (*close)(NBDExport *), bool writethrough,
+  BlockBackend *on_eject_blk, Error **errp)
 {
 AioContext *ctx;
 BlockBackend *blk;
@@ -1507,6 +1507,43 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
 }
 exp->size -= exp->size % BDRV_SECTOR_SIZE;

+if (bitmap) {
+BdrvDirtyBitmap *bm = NULL;
+BlockDriverState *bs = blk_bs(blk);
+
+while (true) {
+bm = bdrv_find_dirty_bitmap(bs, bitmap);
+if (bm != NULL || bs->backing == NULL) {
+break;
+}
+
+bs = bs->backing->bs;
+}
+
+if (bm == NULL) {
+error_setg(errp, "Bitmap '%s' is not found", bitmap);
+goto fail;
+}
+
+if ((nbdflags & NBD_FLAG_READ_ONLY) && bdrv_is_writable(bs) &&
+bdrv_dirty_bitmap_enabled(bm)) {
+error_setg(errp,
+   "Enabled bitmap '%s' incompatible with readonly export",
+   bitmap);
+goto fail;
+}
+
+if (bdrv_dirty_bitmap_user_locked(bm)) {
+error_setg(e

[Qemu-devel] [PATCH v2 0/6] Promote x-nbd-server-add-bitmap to stable

Or rather, move its functionality into nbd-server-add.  And as
a side effect, teach qemu-nbd how to export a persistent bitmap
without having to go through a qemu process and several QMP
commands.

Based-on: <20181221093529.23855-1-js...@redhat.com>
[0/11 bitmaps: remove x- prefix from QMP api]

Available at: https://repo.or.cz/qemu/ericb.git nbd-bitmap-add-v2

Since v1:
- add new patch 1 allowing bitmaps with writable exports
- add coverage in iotest 223
- fix logic bug that rendered qemu-nbd -B useless
- drop support for bitmap-export-name remapping

001/6:[down] 'nbd: Only require disabled bitmap for read-only exports'
002/6:[] [--] 'nbd: Merge nbd_export_set_name into nbd_export_new'
003/6:[0056] [FC] 'nbd: Allow bitmap export during QMP nbd-server-add'
004/6:[] [--] 'nbd: Remove x-nbd-server-add-bitmap'
005/6:[0042] [FC] 'nbd: Merge nbd_export_bitmap into nbd_export_new'
006/6:[0041] [FC] 'qemu-nbd: Add --bitmap=NAME option'

Eric Blake (6):
  nbd: Only require disabled bitmap for read-only exports
  nbd: Merge nbd_export_set_name into nbd_export_new
  nbd: Allow bitmap export during QMP nbd-server-add
  nbd: Remove x-nbd-server-add-bitmap
  nbd: Merge nbd_export_bitmap into nbd_export_new
  qemu-nbd: Add --bitmap=NAME option

 qemu-nbd.texi  |   4 ++
 qapi/block.json|  30 ++---
 include/block/nbd.h|  12 ++--
 blockdev-nbd.c |  31 ++---
 hmp.c  |   5 +-
 nbd/server.c   | 129 -
 qemu-nbd.c |  15 +++--
 tests/qemu-iotests/223 |  39 ---
 tests/qemu-iotests/223.out |  19 --
 9 files changed, 132 insertions(+), 152 deletions(-)

-- 
2.20.1

[Qemu-devel] [PATCH v2 6/6] qemu-nbd: Add --bitmap=NAME option

Having to fire up qemu, then use QMP commands for nbd-server-start
and nbd-server-add, just to expose a persistent dirty bitmap, is
rather tedious.  Make it possible to expose a dirty bitmap using
just qemu-nbd (of course, for now this only works when qemu-nbd is
visiting a BDS formatted as qcow2).

Of course, any good feature also needs unit testing, so expand
iotest 223 to cover it.

Signed-off-by: Eric Blake 
---
 qemu-nbd.texi  |  4 
 qemu-nbd.c | 10 --
 tests/qemu-iotests/223 | 18 +-
 tests/qemu-iotests/223.out | 12 +++-
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 9a84e81eed9..96b1546006a 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -45,6 +45,10 @@ auto-detecting
 Export the disk as read-only
 @item -P, --partition=@var{num}
 Only expose partition @var{num}
+@item -B, --bitmap=@var{name}
+If @var{filename} has a qcow2 persistent bitmap @var{name}, expose
+that bitmap via the ``qemu:dirty-bitmap:@var{name}'' context
+accessible through NBD_OPT_SET_META_CONTEXT.
 @item -s, --snapshot
 Use @var{filename} as an external snapshot, create a temporary
 file with backing_file=@var{filename}, redirect the write to
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 03813187846..26f5c08821d 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -95,6 +95,7 @@ static void usage(const char *name)
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET   offset into the image\n"
 "  -P, --partition=NUM   only expose partition NUM\n"
+"  -B, --bitmap=NAME expose a persistent dirty bitmap\n"
 "\n"
 "General purpose options:\n"
 "  --object type,id=ID,...   define an object such as 'secret' for providing\n"
@@ -509,7 +510,7 @@ int main(int argc, char **argv)
 off_t fd_size;
 QemuOpts *sn_opts = NULL;
 const char *sn_id_or_name = NULL;
-const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:D:";
+const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:D:B:";
 struct option lopt[] = {
 { "help", no_argument, NULL, 'h' },
 { "version", no_argument, NULL, 'V' },
@@ -519,6 +520,7 @@ int main(int argc, char **argv)
 { "offset", required_argument, NULL, 'o' },
 { "read-only", no_argument, NULL, 'r' },
 { "partition", required_argument, NULL, 'P' },
+{ "bitmap", required_argument, NULL, 'B' },
 { "connect", required_argument, NULL, 'c' },
 { "disconnect", no_argument, NULL, 'd' },
 { "snapshot", no_argument, NULL, 's' },
@@ -558,6 +560,7 @@ int main(int argc, char **argv)
 QDict *options = NULL;
 const char *export_name = ""; /* Default export name */
 const char *export_description = NULL;
+const char *bitmap = NULL;
 const char *tlscredsid = NULL;
 bool imageOpts = false;
 bool writethrough = true;
@@ -695,6 +698,9 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 break;
+case 'B':
+bitmap = optarg;
+break;
 case 'k':
 sockpath = optarg;
 if (sockpath[0] != '/') {
@@ -1016,7 +1022,7 @@ int main(int argc, char **argv)
 }

 exp = nbd_export_new(bs, dev_offset, fd_size, export_name,
- export_description, NULL, nbdflags,
+ export_description, bitmap, nbdflags,
  nbd_export_closed, writethrough, NULL, &error_fatal);

 if (device) {
diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index 1f6822f9489..d7c4a31a181 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -25,6 +25,7 @@ status=1 # failure is the default!

 _cleanup()
 {
+nbd_server_stop
 _cleanup_test_img
 _cleanup_qemu
 rm -f "$TEST_DIR/nbd"
@@ -35,6 +36,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.rc
 . ./common.filter
 . ./common.qemu
+. ./common.nbd

 _supported_fmt qcow2
 _supported_proto file # uses NBD as well
@@ -153,7 +155,7 @@ $QEMU_IMG map --output=json --image-opts \
   "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b2" | _filter_qemu_img_map

 echo
-echo "=== End NBD server ==="
+echo "=== End qemu NBD server ==="
 echo

 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-remove",
@@ -165,6 +167,20 @@ _send_qemu_cmd $QEMU_HANDLE 
'{"execute":"nbd-server-remove",
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"

+echo
+echo "=== Use qemu-nbd as server ==="
+echo
+
+nbd_server_start_unix_socket -r -f $IMGFMT -B b "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_IMG map --output=json --image-opts \
+  "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b" | _filter_qemu_img_map
+
+nbd_server_start_unix_socket -f $IMGFMT -B b2 "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_IMG map --output=json --image-opts \
+  "$IMG,x-dirty-bitmap=qemu:dirty-bitma

Re: [Qemu-devel] [PATCH] target-i386: hvf: remove MPX support

2019-01-09 Thread Roman Bolshakov

On Thu, Dec 20, 2018 at 01:11:12PM +0100, Paolo Bonzini wrote:
> MPX support is being phased out by Intel and actually I am not sure that
> OS X has ever enabled it in XCR0.  Drop it from the Hypervisor.framework
> acceleration.
> 

I also doubt if OS X enabled it, but I can't confirm that as I have only
Ivy Bridge and Haswell-based laptops.

Reviewed-by: Roman Bolshakov 

Thanks,
Roman

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-09 Thread Rik van Riel

On Thu, 2019-01-10 at 12:26 +1100, Dave Chinner wrote:
> On Wed, Jan 09, 2019 at 08:17:31PM +0530, Pankaj Gupta wrote:
> >  This patch series has implementation for "virtio pmem". 
> >  "virtio pmem" is fake persistent memory(nvdimm) in guest 
> >  which allows to bypass the guest page cache. This also
> >  implements a VIRTIO based asynchronous flush mechanism.  
> 
> H. Sharing the host page cache direct into the guest VM. Sounds
> like a good idea, but.
> 
> This means the guest VM can now run timing attacks to observe host
> side page cache residency, and depending on the implementation I'm
> guessing that the guest will be able to control host side page
> cache eviction, too (e.g. via discard or hole punch operations).

Even if the guest can only access its own disk image,
using its own page cache pages, Pankaj's patches could
still be a win because they allow the host kernel to
easily evict page cache pages without doing IO.

Evicting page cache pages that live inside a guest
involves something like swapping, ballooning, or some
sort of callback into the guest to figure out whether
a page is clean.

Evicting page cache pages that live in the host, OTOH...

-- 
All Rights Reversed.

signature.asc
Description: This is a digitally signed message part

Re: [Qemu-devel] [PATCH v3 2/2] configure: Force the C standard to gnu99

2019-01-09 Thread Thomas Huth

On 2019-01-09 22:26, Paolo Bonzini wrote:
> On 09/01/19 19:10, Philippe Mathieu-Daudé wrote:
>> Using '-std=gnu++98' for g++ v4.8 looks like a good compromise to the
>> issues Daniel mentioned (still 'experimental').
>>
>> Reviewed-by: Philippe Mathieu-Daudé 
> 
> C++11 has many new features that almost make it an entirely different
> language (the main being rvalue references, and type inference with auto
> and decltype).  If it works for GCC 4.8, I would prefer using that
> instead of g++98.

I think we can revisit that setting when we really want to introduce
code that needs it. Currently it does not seem to be necessary, so that
does not justify to use an "experimental" feature of GCC 4.8, I guess.

 Thomas

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

2019-01-09 Thread Thomas Huth

On 2019-01-09 18:47, Greg Kurz wrote:
> On Wed, 9 Jan 2019 18:40:48 +0100
> Cédric Le Goater  wrote:
> 
>> On 1/9/19 6:28 PM, Daniel P. Berrangé wrote:
>>> On Wed, Jan 09, 2019 at 06:13:01PM +0100, Cédric Le Goater wrote:  
 On 1/9/19 5:39 PM, Thomas Huth wrote:  
> When compiling with Clang and -std=gnu99, I get the following errors:
>
>   CC  ppc64-softmmu/hw/intc/xics_spapr.o
> In file included from hw/intc/xics_spapr.c:34:
> include/hw/ppc/xics.h:46:25: error: redefinition of typedef 'ICSState' is 
> a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct ICSState ICSState;
> ^
> include/hw/ppc/spapr.h:18:25: note: previous definition is here
> typedef struct ICSState ICSState;
> ^
> In file included from hw/intc/xics_spapr.c:34:
> include/hw/ppc/xics.h:203:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
>   CC  ppc64-softmmu/hw/intc/spapr_xive.o
> In file included from hw/intc/spapr_xive.c:19:
> include/hw/ppc/spapr_xive.h:38:3: error: redefinition of typedef 
> 'sPAPRXive' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> } sPAPRXive;
>   ^
> include/hw/ppc/spapr.h:20:26: note: previous definition is here
> typedef struct sPAPRXive sPAPRXive;
>  ^
> In file included from hw/intc/spapr_xive.c:19:
> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
>   CC  ppc64-softmmu/hw/char/spapr_vty.o
> In file included from hw/char/spapr_vty.c:8:
> In file included from include/hw/ppc/spapr.h:12:
> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
>
> Since we're going to make -std=gnu99 mandatory, fix these issues
> by including the right header files indead of typedeffing stuff twice.
>
> Signed-off-by: Thomas Huth 
> ---
>  include/hw/ppc/spapr.h  | 4 ++--
>  include/hw/ppc/spapr_xive.h | 2 +-
>  include/hw/ppc/xics.h   | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 2c77a8b..6a5ae4f 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -8,6 +8,8 @@
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/ppc/spapr_ovec.h"
>  #include "hw/ppc/spapr_irq.h"
> +#include "hw/ppc/xics.h"
> +#include "hw/ppc/spapr_xive.h"
>  
>  struct VIOsPAPRBus;
>  struct sPAPRPHBState;
> @@ -15,8 +17,6 @@ struct sPAPRNVRAM;
>  typedef struct sPAPREventLogEntry sPAPREventLogEntry;
>  typedef struct sPAPREventSource sPAPREventSource;
>  typedef struct sPAPRPendingHPT sPAPRPendingHPT;
> -typedef struct ICSState ICSState;
> -typedef struct sPAPRXive sPAPRXive;
>  
>  #define HPTE64_V_HPTE_DIRTY 0x0040ULL
>  #define SPAPR_ENTRY_POINT   0x100
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 728735d..aff4366 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -42,7 +42,7 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t 
> lisn);
>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>  qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>  
> -typedef struct sPAPRMachineState sPAPRMachineState;
> +#include "hw/ppc/spapr.h"  /* for sPAPRMachineState */  

 so both files include each other, how nice ...  
>>>
>>> If the header files are mutually dependent it makes me wonder what the
>>> point of having them split up is ?  
>>
>> The XICS interrupt controller models are used in two different machines,
>> sPAPR and PowerNV.
>>
>>> Feels like either they need to be merged, or they need to be split up
>>> and refactored even more to remove the m

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

2019-01-09 Thread Thomas Huth

On 2019-01-09 22:20, Paolo Bonzini wrote:
> On 09/01/19 18:28, Daniel P. Berrangé wrote:
>>> so both files include each other, how nice ...
>> If the header files are mutually dependent it makes me wonder what the
>> point of having them split up is ?
>>
>> Feels like either they need to be merged, or they need to be split up
>> and refactored even more to remove the mutual dependancy.
> 
> If they include each other only for the typedefs, then prehaps the
> solution is to change the coding style and allow using struct in
> function prototypes.  I'm pretty sure there are several examples of this
> already.

I'm all in favor for this!

 Thomas

Re: [Qemu-devel] [PATCH] target/arm: Allow to switch from MON->HYP on AArch32

2019-01-09 Thread Alexander Graf




On 10.01.19 00:08, Peter Maydell wrote:
> On Wed, 9 Jan 2019 at 17:14, Alexander Graf  wrote:
>>
>> On 01/09/2019 05:59 PM, Peter Maydell wrote:
>>> On Wed, 9 Jan 2019 at 16:52, Peter Maydell  wrote:
 On Wed, 9 Jan 2019 at 15:26, Alexander Graf  wrote:
> In U-boot, we switch from S-SVC -> MON -> HYP when we want to enter
> HYP mode. This dance seems to work ok (hence it's there in the code
> base), but breaks with current QEMU.
>>> PS: it would be helpful if the commit message said how u-boot
>>> is trying to go from Mon to Hyp -- some ways to try to do
>>> this are OK, and some are not, so whether it's supposed to
>>> work or not depends on what u-boot is actually doing...
>>
>> I don't fully understand all of it to be honest :). But the code is here:
>>
>> http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/cpu/armv7/nonsec_virt.S
>>
>> What I managed to understand so far is that it goes to MON using the smc
>> #0 call and then changes SPSR so that on return (movs pc) the mode will
>> be different.
> 
> Thanks -- yes, that's an exception return so it's the
> expected way to go from Mon to Hyp.

That was my understanding, yes. Do you still want me to change the
commit message to mention that or will you just do it when applying?

Thanks,

Alex

Re: [Qemu-devel] [PATCH 2/5] nbd: Allow bitmap export during QMP nbd-server-add

On 1/8/19 10:14 PM, Eric Blake wrote:
> With the experimental x-nbd-server-add-bitmap command, there was
> a window of time where a client could see the export but not the
> associated dirty bitmap, which can cause a client that planned
> on using the dirty bitmap to be forced to treat the entire image
> as dirty as a safety fallback.  Furthermore, if the QMP client
> successfully exports a disk but then fails to add the bitmap, it
> has to take on the burden of removing the export.  Since we don't
> allow changing a dirty bitmap once it is exported (whether to a
> different bitmap, or removing advertisement of the bitmap), it is
> nicer to make the bitmap tied to the export at the time the export
> is created, and automatically failing to export if the bitmap is
> not available.
> 
> Since there is working libvirt demo code that uses both the bitmap
> export and the ability to specify an alternative name (rather than
> exposing the private-use bitmap that libvirt created to merge in
> several persistent bitmaps when doing a differential backup), the
> two new parameters do not need to be marked experimental.  See
> https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html,
> https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat

...

> +#
> +# @bitmap-export-name: How the bitmap will be seen by nbd clients
> +#  (default @bitmap). The NBD client must use
> +#  NBD_OPT_SET_META_CONTEXT with "qemu:dirty-bitmap:NAME"
> +#  to access the exposed bitmap. (since 4.0)
>  #

Actually, on re-reading my libvirt code, it turns out that I had
differential backups working without the use of bitmap-export-name,
because I always created a temporary bitmap (with the name I planned to
expose, namely "vda" if the bitmap belonged to the device that libvirt
calls "vda") and merged in the persistent bitmaps into the temporary
bitmap.  Of course, right now we only support one backup NBD job at a
time (because we don't support parallel NBD servers), but thinking ahead
to when we may support that, it becomes obvious that libvirt can't pick
the same name for two separate temporary bitmaps being used by two
parallel backup jobs.  On the other hand, Nir requested that libvirt
backup job XMLs support some sort of name or UUID, and it is just as
easy to ensure that the temporary bitmap is given the name of the backup
job (which will be unique) rather than the name of the device being
backed up.

And my proposed patches for 'qemu-nbd --list' make it easy to
definitively query what names are actually being exported.

So at this point, I'm leaning towards NOT exposing a way to remap the
exported bitmap name.  We can always add it later if it proves useful,
but I'd rather not be burdened with being stuck with it since libvirt
hasn't used it so far.  I'll spin v2 of this series along those lines.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] vfio/common: Work around kernel overflow bug in DMA unmap

2019-01-09 Thread Peter Xu

On Wed, Jan 09, 2019 at 04:10:51PM -0700, Alex Williamson wrote:
> A kernel bug was introduced in v4.15 via commit 71a7d3d78e3c which
> adds a test for address space wrap-around in the vfio DMA unmap path.
> Unfortunately due to overflow, the kernel detects an unmap of the last
> page in the 64-bit address space as a wrap-around.  In QEMU, a Q35
> guest with VT-d emulation and guest IOMMU enabled will attempt to make
> such an unmap request during VM system reset, triggering an error:
> 
>   qemu-kvm: VFIO_UNMAP_DMA: -22
>   qemu-kvm: vfio_dma_unmap(0x561f059948f0, 0xfef0, 0x0110) = 
> -22 (Invalid argument)
> 
> Here the IOVA start address (0xfef0) and the size parameter
> (0x0110) add to exactly 2^64, triggering the bug.  A
> kernel fix is queued for the Linux v5.0 release to address this.
> 
> This patch implements a workaround to retry the unmap, excluding the
> final page of the range when we detect an unmap failing which matches
> the requirements for this issue.  This is expected to be a safe and
> complete workaround as the VT-d address space does not extend to the
> full 64-bit space and therefore the last page should never be mapped.
> 
> This workaround can be removed once all kernels with this bug are
> sufficiently deprecated.
> 
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
> Reported-by: Pei Zhang 
> Debugged-by: Peter Xu 
> Signed-off-by: Alex Williamson 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] ppc/xive: fix remaining XiveFabric names

2019-01-09 Thread David Gibson

On Wed, Jan 09, 2019 at 04:15:32PM +0100, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater 

Applied, thanks.

> ---
>  include/hw/ppc/xive.h | 2 +-
>  hw/intc/xive.c| 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> index 4cd2e54c4958..0f935a146c66 100644
> --- a/include/hw/ppc/xive.h
> +++ b/include/hw/ppc/xive.h
> @@ -146,7 +146,7 @@
>  #include "hw/ppc/xive_regs.h"
>  
>  /*
> - * XIVE Fabric (Interface between Source and Router)
> + * XIVE Notifier (Interface between Source and Router)
>   */
>  
>  typedef struct XiveNotifier {
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index 469c90e50c6a..39aac9f89d2a 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -1659,9 +1659,9 @@ static const TypeInfo xive_end_source_info = {
>  };
>  
>  /*
> - * XIVE Fabric
> + * XIVE Notifier
>   */
> -static const TypeInfo xive_fabric_info = {
> +static const TypeInfo xive_notifier_info = {
>  .name = TYPE_XIVE_NOTIFIER,
>  .parent = TYPE_INTERFACE,
>  .class_size = sizeof(XiveNotifierClass),
> @@ -1670,7 +1670,7 @@ static const TypeInfo xive_fabric_info = {
>  static void xive_register_types(void)
>  {
>  type_register_static(&xive_source_info);
> -type_register_static(&xive_fabric_info);
> +type_register_static(&xive_notifier_info);
>  type_register_static(&xive_router_info);
>  type_register_static(&xive_end_source_info);
>  type_register_static(&xive_tctx_info);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 3/4] nvme: check msix_init_exclusive_bar return value

2019-01-09 Thread Li Qiang

Max Reitz  于2019年1月9日周三 下午10:52写道：

> On 30.10.18 06:18, Li Qiang wrote:
> > As this function can fail.
> >
> > Signed-off-by: Li Qiang 
> > ---
> >  hw/block/nvme.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 72c9644..a406c23 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -1250,7 +1250,10 @@ static void nvme_realize(PCIDevice *pci_dev,
> Error **errp)
> >  pci_register_bar(&n->parent_obj, 0,
> >  PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
> >  &n->iomem);
> > -msix_init_exclusive_bar(&n->parent_obj, n->num_queues, 4, NULL);
> > +if (msix_init_exclusive_bar(&n->parent_obj, n->num_queues, 4,
> NULL)) {
> > +error_setg(errp, "msix_init_exclusive_bar failed");
> > +return;
> > +}
>
> Commit ee640c625e1 which last touched this function stated that creating
> this device won't fail if this function fails, so I suppose it's
> intential that the error isn't checked.
>

OK, got. I will drop this patch later.

Thanks,
Li Qiang


>
>
> But if you think the error should be checked and device creation should
> be aborted if this function failed, then I suppose there'd be some
> cleaning up to instead of just returning.
>
> Also, we should just pass errp to that function (as its last parameter,
> instead of NULL).
>
> Max
>
> >
> >  id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
> >  id->ssvid = cpu_to_le16(pci_get_word(pci_conf +
> PCI_SUBSYSTEM_VENDOR_ID));
> >
>
>
>

Re: [Qemu-devel] [PATCH 2/4] nvme: ensure the num_queues is not zero

2019-01-09 Thread Li Qiang

Max Reitz  于2019年1月9日周三 下午10:38写道：

> On 30.10.18 06:18, Li Qiang wrote:
> > When it is zero, it causes segv. Backtrack:
> > Thread 5 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffc6c17700 (LWP 51808)]
> > 0x55acbd90 in nvme_start_ctrl (n=0x624c8100) at
> hw/block/nvme.c:820
> > warning: Source file is more recent than executable.
> > 820   if (unlikely(n->cq[0])) {
> > (gdb) bt
> > 0  0x55acbd90 in nvme_start_ctrl (n=0x624c8100) at
> hw/block/nvme.c:820
> > 1  0x55accdbc in nvme_write_bar (n=0x624c8100, offset=20,
> data=4587521, size=4) at hw/block/nvme.c:964
> > 2  0x55acdd2b in nvme_mmio_write (opaque=0x624c8100,
> addr=20, data=4587521, size=4) at hw/block/nvme.c:1158
> > 3  0x558973ed in memory_region_write_accessor
> (mr=0x624c89e0, addr=20, value=0x7fffc6c14428, size=4, shift=0,
> mask=4294967295, attrs=...) at
> /home/liqiang02/qemu-upstream/qemu/memory.c:500
> > 4  0x55897600 in access_with_adjusted_size (addr=20,
> value=0x7fffc6c14428, size=4, access_size_min=2, access_size_max=8,
> access_fn=0x55897304 , mr=0x624c89e0,
> attrs=...) at /home/liqiang02/qemu-upstream/qemu/memory.c:566
> > 5  0x5589a200 in memory_region_dispatch_write
> (mr=0x624c89e0, addr=20, data=4587521, size=4, attrs=...) at
> /home/liqiang02/qemu-upstream/qemu/memory.c:1442
> > 6  0x55835151 in flatview_write_continue (fv=0x606e6fc0,
> addr=4273930260, attrs=..., buf=0x7fffc8a18028 "\001", len=4, addr1=20,
> l=4, mr=0x624c89e0) at /home/liqiang02/qemu-upstream/qemu/exec.c:3233
> > 7  0x5583529b in flatview_write (fv=0x606e6fc0,
> addr=4273930260, attrs=..., buf=0x7fffc8a18028 "\001", len=4) at
> /home/liqiang02/qemu-upstream/qemu/exec.c:3272
> > 8  0x558355a1 in address_space_write (as=0x5683ade0
> , addr=4273930260, attrs=..., buf=0x7fffc8a18028
> "\001", len=4) at /home/liqiang02/qemu-upstream/qemu/exec.c:3362
> > 9  0x558355f2 in address_space_rw (as=0x5683ade0
> , addr=4273930260, attrs=..., buf=0x7fffc8a18028
> "\001", len=4, is_write=true) at
> /home/liqiang02/qemu-upstream/qemu/exec.c:3373
> > 10 0x558b66ac in kvm_cpu_exec (cpu=0x63114800) at
> /home/liqiang02/qemu-upstream/qemu/accel/kvm/kvm-all.c:2031
> > 11 0x5587c3ac in qemu_kvm_cpu_thread_fn (arg=0x63114800) at
> /home/liqiang02/qemu-upstream/qemu/cpus.c:1277
> > 12 0x55e54ae6 in qemu_thread_start (args=0x6032c170) at
> util/qemu-thread-posix.c:504
> > 13 0x7fffdadbd494 in start_thread () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> > 14 0x7fffdaaffacf in clone () from /lib/x86_64-linux-gnu/libc.so.6
> > (gdb) q
> >
> > Signed-off-by: Li Qiang 
> > ---
> >  hw/block/nvme.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 676cc48..72c9644 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -1221,6 +1221,10 @@ static void nvme_realize(PCIDevice *pci_dev,
> Error **errp)
> >  error_setg(errp, "serial property not set");
> >  return;
> >  }
> > +
> > +if (!n->num_queues) {
> > +error_setg(errp, "num_queues can't be zero");
>
> I think we should return here.
>
>
Ok, forget this. will make a revised version later.

Thanks,
Li Qiang


> (Sorry for the late review...)
>
> Max
>
> > +}
> >  blkconf_blocksizes(&n->conf);
> >  if (!blkconf_apply_backend_options(&n->conf,
> blk_is_read_only(n->conf.blk),
> > false, errp)) {
> >
>
>
>

Re: [Qemu-devel] [PATCH v2 03/52] qapi: qapi for audio backends

On 12/23/18 2:51 PM, Kővágó, Zoltán wrote:
> This patch adds structures into qapi to replace the existing
> configuration structures used by audio backends currently. This qapi
> will be the base of the -audiodev command line parameter (that replaces
> the old environment variables based config).
> 
> This is not a 1:1 translation of the old options, I've tried to make
> them much more consistent (e.g. almost every backend had an option to
> specify buffer size, but the name was different for every backend, and
> some backends required usecs, while some other required frames, samples
> or bytes). Also tried to reduce the number of abbreviations used by the
> config keys.
> 
> Some of the more important changes:
> * use `in` and `out` instead of `ADC` and `DAC`, as the former is more
>   user friendly imho
> * moved buffer settings into the global setting area (so it's the same
>   for all backends that support it. Backends that can't change buffer
>   size will simply ignore them). Also using usecs, as it's probably more
>   user friendly than samples or bytes.
> * try-poll is now an alsa backend specific option (as all other backends
>   currently ignore it)
> 
> Signed-off-by: Kővágó, Zoltán 
> ---
>  Makefile.objs |   6 +-
>  qapi/audio.json   | 253 ++
>  qapi/qapi-schema.json |   1 +
>  3 files changed, 257 insertions(+), 3 deletions(-)
>  create mode 100644 qapi/audio.json
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index bc5b8a8442..3f833a70c0 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -1,6 +1,6 @@
> -QAPI_MODULES = block-core block char common crypto introspect job migration
> -QAPI_MODULES += misc net rdma rocker run-state sockets tpm trace transaction
> -QAPI_MODULES += ui
> +QAPI_MODULES = audio block-core block char common crypto introspect job
> +QAPI_MODULES += migration misc net rdma rocker run-state sockets tpm trace
> +QAPI_MODULES += transaction ui
>  
>  ###
>  # Common libraries for tools and emulators
> diff --git a/qapi/audio.json b/qapi/audio.json
> new file mode 100644
> index 00..56d8ce439f
> --- /dev/null
> +++ b/qapi/audio.json
> @@ -0,0 +1,253 @@
> +# -*- mode: python -*-
> +#
> +# Copyright (C) 2015 Zoltán Kővágó 

Do you want to claim 2015-2019 now?  But ultimately what you put is your
call, so don't treat my suggestion as a legal mandate.

> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or later.
> +# See the COPYING file in the top-level directory.
> +
> +##
> +# @AudiodevNoOptions:
> +#
> +# The none, coreaudio, sdl and spice audio backend have no options.
> +#
> +# Since: 3.2

4.0, now.

> +##
> +{ 'struct': 'AudiodevNoOptions',
> +  'data': { } }

Also, we now have support for empty branches in a flat union, so do you
really need this type, or...

> +
> +##
> +# @AudiodevAlsaPerDirectionOptions:
> +#
> +# Options of the alsa backend that are used for both playback and recording.
> +#
> +# @dev: #optional the name of the alsa device to use (default 'default')

No need to use the '#optional' tag any more; the doc generator now takes
care of that.

> +#
> +# @try-poll: #optional attempt to use poll mode, falling back to non polling
> +#access on failure (default on)
> +#
> +# Since: 3.2

More 4.0 (I'll quit pointing it out)


> +##
> +{ 'union': 'Audiodev',
> +  'base': {
> +'id':'str',
> +'driver':'AudiodevDriver',
> +'in':'AudiodevPerDirectionOptions',
> +'out':   'AudiodevPerDirectionOptions',
> +'*timer-period': 'int' },
> +  'discriminator': 'driver',
> +  'data': {
> +'none':  'AudiodevNoOptions',

...you could just omit the lines that don't add anything.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 1/3] qom: Don't keep error value between object_property_parse() calls

On 1/9/19 8:02 PM, Eduardo Habkost wrote:
> When handling errp==NULL at object_apply_global_props(), we are
> leaving the old error value in `err` after printing a warning.
> This makes QEMU crash if two global properties generate warnings:
> 
>   $ echo device_add rtl8139 | qemu-system-x86_64 -monitor stdio -global 
> rtl8139.xxx=yyy -global rtl8139.xxx=zzz
>   warning: can't apply global rtl8139.xxx=yyy: Property '.xxx' not found
>   qemu-system-x86_64: util/error.c:57: error_setv: Assertion `*errp == NULL' 
> failed.
>   Aborted (core dumped)
> 
> Fix that by making `err` go out of scope immediately after the
> warn_report_err() call.
> 
> Fixes: 50545b2cc029 "qdev-props: call object_apply_global_props()"
> Signed-off-by: Eduardo Habkost 
> ---
>  qom/object.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/qom/object.c b/qom/object.c
> index aa6f3a2a71..4e5226ca12 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -372,7 +372,6 @@ static void object_post_init_with_type(Object *obj, 
> TypeImpl *ti)
>  
>  void object_apply_global_props(Object *obj, const GPtrArray *props, Error 
> **errp)
>  {
> -Error *err = NULL;

Could also have been fixed by leaving this line at this scope,...

>  int i;
>  
>  if (!props) {
> @@ -381,6 +380,7 @@ void object_apply_global_props(Object *obj, const 
> GPtrArray *props, Error **errp
>  
>  for (i = 0; i < props->len; i++) {
>  GlobalProperty *p = g_ptr_array_index(props, i);
> +Error *err = NULL;
>  
>  if (object_dynamic_cast(obj, p->driver) == NULL) {
>  continue;
> 

...and doing 'err = NULL;' after warn_report_err().  That is, it's not
the going out of scope that fixes it per se, but the fact that you
changed to resetting it to NULL on each loop invocation rather than
leaving it pointing at freed memory.  Whether you set to NULL by a
tighter scope initializer or by an assignment doesn't matter, so no need
to respin since your way works.

Reviewed-by: Eric Blake 


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 2/3] globals: Allow global properties to be optional

Making some global properties optional will let us simplify
compat code when a given property works on most (but not all)
subclasses of a given type.

Device types will be able to opt out from optional compat
properties by simply not registering those properties, or by
making the property setter report an error.

Signed-off-by: Eduardo Habkost 
---
Note: the "An error is fatal for non-hotplugged devices [...]"
comment that appears in the diff scope is inaccurate, but I will
fix that in a separate patch because I don't want this to get
into the way of fixing the crash.
---
 include/hw/qdev-core.h | 3 +++
 qom/object.c   | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index bc014c1c9f..463e1ddb1e 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -250,6 +250,8 @@ struct PropertyInfo {
 /**
  * GlobalProperty:
  * @used: Set to true if property was used when initializing a device.
+ * @optional: If set to true, errors when applying the property won't be
+ *reported by object_apply_global_props().
  *
  * An error is fatal for non-hotplugged devices, when the global is applied.
  */
@@ -258,6 +260,7 @@ typedef struct GlobalProperty {
 const char *property;
 const char *value;
 bool used;
+bool optional;
 } GlobalProperty;
 
 static inline void
diff --git a/qom/object.c b/qom/object.c
index 4e5226ca12..226ddf0189 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -387,7 +387,9 @@ void object_apply_global_props(Object *obj, const GPtrArray 
*props, Error **errp
 }
 p->used = true;
 object_property_parse(obj, p->value, p->property, &err);
-if (err != NULL) {
+if (p->optional) {
+error_free(err);
+} else if (err != NULL) {
 error_prepend(&err, "can't apply global %s.%s=%s: ",
   p->driver, p->property, p->value);
 /*
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PATCH v2 3/3] virtio: Make disable-legacy/disable-modern compat properties optional

The disable-legacy and disable-modern properties apply only to
some virtio-pci devices.  Make those properties optional.

This fixes the crash introduced by commit f6e501a28ef9 ("virtio: Provide
version-specific variants of virtio PCI devices"):

  $ qemu-system-x86_64 -machine pc-i440fx-2.6 \
-device virtio-net-pci-non-transitional
  Unexpected error in object_property_find() at qom/object.c:1092:
  qemu-system-x86_64: -device virtio-net-pci-non-transitional: can't apply \
  global virtio-pci.disable-modern=on: Property '.disable-modern' not found
  Aborted (core dumped)

Reported-by: Thomas Huth 
Fixes: f6e501a28ef9 ("virtio: Provide version-specific variants of virtio PCI 
devices")
Signed-off-by: Eduardo Habkost 
---
 hw/core/machine.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 5530b71981..a19143aa44 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -91,8 +91,9 @@ const size_t hw_compat_2_7_len = G_N_ELEMENTS(hw_compat_2_7);
 
 GlobalProperty hw_compat_2_6[] = {
 { "virtio-mmio", "format_transport_address", "off" },
-{ "virtio-pci", "disable-modern", "on" },
-{ "virtio-pci", "disable-legacy", "off" },
+/* Optional because not all virtio-pci devices support legacy mode */
+{ "virtio-pci", "disable-modern", "on",  .optional = true },
+{ "virtio-pci", "disable-legacy", "off", .optional = true },
 };
 const size_t hw_compat_2_6_len = G_N_ELEMENTS(hw_compat_2_6);
 
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PATCH v2 1/3] qom: Don't keep error value between object_property_parse() calls

When handling errp==NULL at object_apply_global_props(), we are
leaving the old error value in `err` after printing a warning.
This makes QEMU crash if two global properties generate warnings:

  $ echo device_add rtl8139 | qemu-system-x86_64 -monitor stdio -global 
rtl8139.xxx=yyy -global rtl8139.xxx=zzz
  warning: can't apply global rtl8139.xxx=yyy: Property '.xxx' not found
  qemu-system-x86_64: util/error.c:57: error_setv: Assertion `*errp == NULL' 
failed.
  Aborted (core dumped)

Fix that by making `err` go out of scope immediately after the
warn_report_err() call.

Fixes: 50545b2cc029 "qdev-props: call object_apply_global_props()"
Signed-off-by: Eduardo Habkost 
---
 qom/object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qom/object.c b/qom/object.c
index aa6f3a2a71..4e5226ca12 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -372,7 +372,6 @@ static void object_post_init_with_type(Object *obj, 
TypeImpl *ti)
 
 void object_apply_global_props(Object *obj, const GPtrArray *props, Error 
**errp)
 {
-Error *err = NULL;
 int i;
 
 if (!props) {
@@ -381,6 +380,7 @@ void object_apply_global_props(Object *obj, const GPtrArray 
*props, Error **errp
 
 for (i = 0; i < props->len; i++) {
 GlobalProperty *p = g_ptr_array_index(props, i);
+Error *err = NULL;
 
 if (object_dynamic_cast(obj, p->driver) == NULL) {
 continue;
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PATCH v2 0/3] Fix virtio-*(-non)-transitional crash on 2.6 machine-types

This is a second attempt to fix the crash reported by Thomas[1].

This keeps the compat property array simple, different from my
first attempt[2].

This also avoids extra complexity on the device code: we don't
need interface name, inheritance tricks, or devices overriding a
compat property after the fact.  The simple absence of the QOM
properties on some device types is enough to make the compat code
skip them.

While working on the fix, I stumbled upon another minor bug in
object_apply_global_props(), which I fixed in patch 1/3.

This series is based on my machine-next branch, because of
conflicts with compat property array cleanups that are already
queued.

[1] a28d196a-e2fe-a013-a6e2-99ac260f6279@redhat.com">http://mid.mail-archive.com/a28d196a-e2fe-a013-a6e2-99ac260f6279@redhat.com
[2] 20190104032226.21428-1-ehabkost@redhat.com">http://mid.mail-archive.com/20190104032226.21428-1-ehabkost@redhat.com

Eduardo Habkost (3):
  qom: Don't keep error value between object_property_parse() calls
  globals: Allow global properties to be optional
  virtio: Make disable-legacy/disable-modern compat properties optional

 include/hw/qdev-core.h | 3 +++
 hw/core/machine.c  | 5 +++--
 qom/object.c   | 6 --
 3 files changed, 10 insertions(+), 4 deletions(-)

-- 
2.18.0.rc1.1.g3f1ff2140

Re: [Qemu-devel] [PATCH v2 43/52] audio: add mixeng option (documentation)

On 12/23/18 2:52 PM, Kővágó, Zoltán wrote:
> This will allow us to disable mixeng when we use a decent backend.
> 
> Disabling mixeng have a few advantages:
> * we no longer convert the audio output from one format to another, when
>   the underlying audio system would just convert it to a third format.
>   We no longer convert, only the underlying system, when needed.
> * the underlying system probably has better resampling and sample format
>   converting methods anyway...
> * we may support formats that the mixeng currently does not support (S24
>   or float samples, more than two channels)
> * when using an audio server (like pulseaudio) different sound card
>   outputs will show up as separate streams, even if we use only one
>   backend
> 
> Disadvantages:
> * audio capturing no longer works (wavcapture, and vnc audio extension)
> * some backends only support a single playback stream or very picky
>   about the audio format.  In this case we can't disable mixeng.
> 
> However mixeng is not removed, only made optional, so this shouldn't be
> a big concern.
> 
> Signed-off-by: Kővágó, Zoltán 
> ---
>  qapi/audio.json | 5 +
>  qemu-options.hx | 6 ++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/qapi/audio.json b/qapi/audio.json
> index 56d8ce439f..180bf207a8 100644
> --- a/qapi/audio.json
> +++ b/qapi/audio.json
> @@ -184,6 +184,10 @@
>  #
>  # General audio backend options that are used for both playback and 
> recording.
>  #
> +# @mixeng: #optional use QEMU's mixing engine to mix all streams inside QEMU.

The #optional tag is no longer necessary (the docs generator figures it
out from the '*mixeng' name below).

> +#  When set to off, fixed-settings must be also off.  Not every 
> backend
> +#  compatible with the off setting (default on)

Missing a '(since 4.0)' tag.

> +#
>  # @fixed-settings: #optional use fixed settings for host input/output.  When
>  #  off, frequency, channels and format must not be specified
>  #  (default on)
> @@ -207,6 +211,7 @@
>  ##
>  { 'struct': 'AudiodevPerDirectionOptions',
>'data': {
> +'*mixeng': 'bool',
>  '*fixed-settings': 'bool',
>  '*frequency':  'int',
>  '*channels':   'int',


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-09 Thread Dave Chinner

On Wed, Jan 09, 2019 at 08:17:31PM +0530, Pankaj Gupta wrote:
>  This patch series has implementation for "virtio pmem". 
>  "virtio pmem" is fake persistent memory(nvdimm) in guest 
>  which allows to bypass the guest page cache. This also
>  implements a VIRTIO based asynchronous flush mechanism.  

H. Sharing the host page cache direct into the guest VM. Sounds
like a good idea, but.

This means the guest VM can now run timing attacks to observe host
side page cache residency, and depending on the implementation I'm
guessing that the guest will be able to control host side page
cache eviction, too (e.g. via discard or hole punch operations).

Which means this functionality looks to me like a new vector for
information leakage into and out of the guest VM via guest
controlled host page cache manipulation.

https://arxiv.org/pdf/1901.01161

I might be wrong, but if I'm not we're going to have to be very
careful about how guest VMs can access and manipulate host side
resources like the page cache.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [Qemu-devel] [PATCH] vfio: assign idstr for VFIO's mmaped regions for migration

2019-01-09 Thread Zhao Yan

On Tue, Jan 08, 2019 at 10:09:11AM -0700, Alex Williamson wrote:
> On Tue,  8 Jan 2019 01:03:48 -0500
> Zhao Yan  wrote:
> 
> > if multiple regions in vfio are mmaped, their corresponding ramblocks
> > are like below, i.e. their idstrs are "".
> > 
> > (qemu) info ramblock
> > Block Name  PSize   Offset   UsedTotal
> > pc.ram  4 KiB  0x 0x2000 0x2000
> > 4 KiB  0x2110 0x2000 0x2000
> > 4 KiB  0x2090 0x0080 0x0080
> > 4 KiB  0x2024 0x00687000 0x00687000
> > 4 KiB  0x200c 0x00178000 0x00178000
> > pc.bios 4 KiB  0x2000 0x0004 0x0004
> > pc.rom  4 KiB  0x2004 0x0002 0x0002
> > 
> > This is because ramblocks' idstr are assigned by calling
> > vmstate_register_ram(), but memory region of type ram device ptr does not
> > call vmstate_register_ram().
> > vfio_region_mmap
> > |->memory_region_init_ram_device_ptr
> >|-> memory_region_init_ram_ptr
> > 
> > Without empty idstrs will cause problem to snapshot copying during
> > migration, because it uses ramblocks' idstr to identify ramblocks.
> > ram_save_setup {
> >   …
> >   RAMBLOCK_FOREACH(block) {
> >   qemu_put_byte(f, strlen(block->idstr));
> >   qemu_put_buffer(f, (uint8_t *)block->idstr,strlen(block->idstr));
> >   qemu_put_be64(f, block->used_length);
> >   }
> >   …
> > }
> > ram_load() {
> > block = qemu_ram_block_by_name(id);
> > if (block) {
> > if (length != block->used_length) {
> > qemu_ram_resize(block, length, &local_err);
> > }
> >  ….
> >}
> > }
> > 
> > Therefore, in this patch,
> > vmstate_register_ram() is called for memory region of type ram ptr,
> > also a unique vfioid is assigned to vfio devices across source
> > and target vms.
> > e.g. in source vm, use qemu parameter
> > -device
> > vfio-pci,sysfsdev=/sys/bus/pci/devices/:00:02.0/
> > 882cc4da-dede-11e7-9180-078a62063ab1,vfioid=igd
> > 
> > and in target vm, use qemu paramter
> > -device
> > vfio-pci,sysfsdev=/sys/bus/pci/devices/:00:02.0/
> > 5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd,vfioid=igd
> 
> Why wouldn't we just use the id= (DeviceState.id) value instead of
> adding yet another one?  I can't imagine anyone, especially libvirt,
> wants to deal with a vfio specific id for a device.
>
hi Alex
You are right! DeviceState.id can be used here. Thanks for your suggestion.


> > Signed-off-by: Zhao Yan 
> > ---
> >  hw/vfio/pci.c | 8 +++-
> >  include/hw/vfio/vfio-common.h | 1 +
> >  memory.c  | 4 
> >  3 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index c0cb1ec289..7bc2ed0752 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -2533,7 +2533,12 @@ static void vfio_populate_device(VFIOPCIDevice 
> > *vdev, Error **errp)
> >  }
> >  
> >  for (i = VFIO_PCI_BAR0_REGION_INDEX; i < VFIO_PCI_ROM_REGION_INDEX; 
> > i++) {
> > -char *name = g_strdup_printf("%s BAR %d", vbasedev->name, i);
> > +char *name;
> > +if (vbasedev->vfioid) {
> > +name = g_strdup_printf("%s BAR %d", vbasedev->vfioid, i);
> > +} else {
> > +name = g_strdup_printf("%s BAR %d", vbasedev->name, i);
> > +}
> >  
> >  ret = vfio_region_setup(OBJECT(vdev), vbasedev,
> >  &vdev->bars[i].region, i, name);
> > @@ -3180,6 +3185,7 @@ static void vfio_instance_init(Object *obj)
> >  static Property vfio_pci_dev_properties[] = {
> >  DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
> >  DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
> > +DEFINE_PROP_STRING("vfioid", VFIOPCIDevice, vbasedev.vfioid),
> >  DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice,
> >  display, ON_OFF_AUTO_OFF),
> >  DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index 1b434d02f6..84bab94f52 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -108,6 +108,7 @@ typedef struct VFIODevice {
> >  struct VFIOGroup *group;
> >  char *sysfsdev;
> >  char *name;
> > +char *vfioid;
> >  DeviceState *dev;
> >  int fd;
> >  int type;
> > diff --git a/memory.c b/memory.c
> > index d14c6dec1d..dbb29fa989 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1588,6 +1588,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
> >  uint64_t size,
> >  void *ptr)
> >  {
> > +DeviceState *owner_dev;
> >  memory_region_init(mr, owner, name, size);
>

Re: [Qemu-devel] [SPDK] Qemu migration with vhost-user-blk on top of local storage

2019-01-09 Thread wuzhouhui

> -Original Messages-
> From: "Wodkowski, PawelX" 
> Sent Time: 2019-01-10 00:19:48 (Thursday)
> To: "stefa...@gmail.com" , "s...@lists.01.org" 
> 
> Cc: "libvir-l...@redhat.com" , "xieyon...@baidu.com" 
> , "qemu-devel@nongnu.org" , 
> "lili...@baidu.com" 
> Subject: Re: [SPDK] [Qemu-devel] Qemu migration with vhost-user-blk on top of 
> local storage
> 
> On Wed, 2019-01-09 at 21:23 +0800, wuzhouhui wrote:
> > > -Original Messages-
> > > From: "Stefan Hajnoczi" 
> > > Sent Time: 2019-01-09 20:42:58 (Wednesday)
> > > To: wuzhouhui 
> > > Cc: qemu-devel@nongnu.org, xieyon...@baidu.com, lili...@baidu.com, 
> > > libvir-l...@redhat.com, s...@lists.01.org
> > > Subject: Re: [Qemu-devel] Qemu migration with vhost-user-blk on top
> > > of local storage
> > > 
> > > On Wed, Jan 09, 2019 at 06:23:42PM +0800, wuzhouhui wrote:
> > > > Hi everyone,
> > > > 
> > > > I'm working qemu with vhost target (e.g. spdk), and I attempt to
> > > > migrate VM with
> > > > 2 local storages. One local storage is a regular file, e.g.
> > > > /tmp/c74.qcow2, and
> > > > the other is a malloc bdev that spdk created. This malloc bdev
> > > > will exported to
> > > > VM via vhost-user-blk. When I execute following command:
> > > > 
> > > >   virsh migrate --live --persistent --unsafe --undefinesource --
> > > > copy-storage-all \
> > > >  --p2p --auto-converge --verbose --desturi
> > > > qemu+tcp:///system vm0
> > > > 
> > > > The libvirt reports:
> > > > 
> > > >   qemu-2.12.1: error: internal error: unable to execute QEMU
> > > > command \
> > > > 'nbd-server-add': Cannot find device=drive-virtio-disk1 nor \
> > > > node_name=drive-virtio-disk1
> > > 
> > > Please post your libvirt domain XML.
> > 
> > My libvirt is based on libvirt-1.1.1-29.el7, and add many patches to
> > satisfy our
> > own needs, e.g. add support for vhost-user-blk. Post domain xml may
> > not useful.
> > Anyway, following is full contents of XML:
> > 
> >   
> > wzh
> > a84e96e6-2c53-408d-986b-c709bc6a0e51
> > 4096
> > 
> >   
> > 
> > 4096
> > 2
> > 
> >   hvm
> >   
> > 
> > 
> >   
> > 
> > 
> >   
> > 
> > destroy
> > restart
> > destroy
> > 
> >   /data/wzh/x86_64-softmmu/qemu-system-
> > x86_64
> >   
> > 
> > 
> > 
> > 
> >   
> > 
> >   
> > 
> > 
> > 
> > 
> >   
> > 
> >   
> > 
> >   
> >   
> > 
> >   
> >   
> > 
> > 
> >   
> >   
> > 
> > 
> >   
> >   
> > 
> >   
> >   
> >> keymap='en-us'>
> > 
> >   
> >   
> > 
> > 
> >   
> > 
> > 
> >   
> > 
> > > 
> > > > Does it means that qemu with spdk on top of local storage don't
> > > > support migration?
> > > > 
> > > > QEMU: 2.12.1
> > > > SPDK: 18.10
> > > 
> > > vhost-user-blk bypasses the QEMU block layer, so NBD storage
> > > migration
> > > at the QEMU level will not work for the vhost-user-blk disk.
> > > 
> > > Stefan
> 
> 
> Don't know if this is the case that wuzhouhui is using but generally

My case is different from yours. All storages (including qemu image and malloc
bdev) are local, and only accessible from one of hosts. Anyway, thanks for
everyone's response.

wuzhouhui

> migration should work if all storages are accesible from both machines.
> The QEMU image and Malloc bdev should be exposed using some kind of
> network sharing. SPDK on migration target should be configured the same
> way as on source machine.
> 
> We are testing this issuing QEMU monitor commands so I can't help with
> libvirt here.
> 
> Example setup/test case we are using in one of our tests:
> 
> Machine1:
>   Expose VM image over SSHF
>   Expose some block device using SPDK nvmf target
>   SPDK nvmf initiator:
>  - connects to Machine1 SPDK nvmf target to access block device
>   SPDK vhost-scsi:
> - presents block device to QEMU from SPDK nvmf initiator:
>   QEMU VM uses
> - shared VM image
> - block device from SPDK nvmf  initiator over vhost-user-scsi
> 
> Machine2:  
>   SPDK nvmf initiator:
>  - connects to Machine1 SPDK nvmf target to access block device
>   SPDK vhost-scsi:
> - presents block device to QEMU from nvmf
> initiator:
>   QEMU instance - waiting for incoming migration
> - shared VM image over SSHFS
> - block device from SPDK nvmf  initiator over vhost-user-scsi
> 
> Some traffic is generated using FIO and then we just migrate VM from
> Machine1 to Machine2.
> 
> Pawel
> 
> > 
> > ___
> > SPDK mailing list
> > s...@lists.01.org
> > https://lists.01.org/mailman/listinfo/spdk
> ___
> SPDK mailing list
> s...@lists.01.org
> https://lists.01.org/mailman/listinfo/spdk

Re: [Qemu-devel] [PATCH 3/3] machine: Use shorter format for GlobalProperty arrays

On Tue, Jan 08, 2019 at 11:20:12AM +0100, Cornelia Huck wrote:
> On Tue, 8 Jan 2019 07:45:43 +0100
> Gerd Hoffmann  wrote:
> 
> >   Hi,
> > 
> > > +{ "migration", "decompress-error-check", "off" },
> > > +{ "hda-audio", "use-timer", "false" },
> > > +{ "cirrus-vga", "global-vmstate", "true" },
> > > +{ "VGA", "global-vmstate", "true" },
> > > +{ "vmware-svga", "global-vmstate", "true" },
> > > +{ "qxl-vga", "global-vmstate", "true" },  
> > 
> > I'd like to have the fields aligned.  Especially in cases like this one
> > where multiple devices get the same value assigned it makes things more
> > readable:
> > 
> > { "migration",   "decompress-error-check", "off"   },
> > { "hda-audio",   "use-timer",  "false" },
> > { "cirrus-vga",  "global-vmstate", "true"  },
> > { "VGA", "global-vmstate", "true"  },
> > { "vmware-svga", "global-vmstate", "true"  },
> > { "qxl-vga", "global-vmstate", "true"  },
> > 
> > thanks,
> >   Gerd
> > 
> 
> I'm a bit on the fence here. It does make things more readable (at
> least in your example), but I find editing aligned tables a bit
> annoying. OTOH, that won't happen often, anyway.

I'm unsure, too.  Also, not merging this series is increasing the
likelihood of conflicts with other patches.  I'm queueing this
version, and we can discuss alignment alternatives later.

(I'm less worried about conflicts caused by future alignment
patches because alignment conflicts are easier to sort out than
redoing the .driver/.property/.value conversion).

-- 
Eduardo

Re: [Qemu-devel] [PATCH v2 06/52] audio: -audiodev command line option basic implementation

2019-01-09 Thread Zoltán Kővágó

On 2019-01-08 04:42, Markus Armbruster wrote:
> "Zoltán Kővágó"  writes:
> 
>> On 2019-01-07 14:13, Markus Armbruster wrote:
>>> "Kővágó, Zoltán"  writes:
>>>
 Audio drivers now get an Audiodev * as config paramters, instead of the
 global audio_option structs.  There is some code in audio/audio_legacy.c
 that converts the old environment variables to audiodev options (this
 way backends do not have to worry about legacy options).  It also
 contains a replacement of -audio-help, which prints out the equivalent
 -audiodev based config of the currently specified environment variables.

 Note that backends are not updated and still rely on environment
 variables.

 Also note that (due to moving try-poll from global to backend specific
 option) currently ALSA and OSS will always try poll mode, regardless of
 environment variables or -audiodev options.

 Signed-off-by: Kővágó, Zoltán 
 ---
>>> [...]
 diff --git a/audio/audio.c b/audio/audio.c
 index 96cbd57c37..e7f25ea84b 100644
 --- a/audio/audio.c
 +++ b/audio/audio.c
>>> [...]
 @@ -2127,3 +1841,158 @@ void AUD_set_volume_in (SWVoiceIn *sw, int mute, 
 uint8_t lvol, uint8_t rvol)
  }
  }
  }
 +
 +QemuOptsList qemu_audiodev_opts = {
 +.name = "audiodev",
 +.head = QTAILQ_HEAD_INITIALIZER(qemu_audiodev_opts.head),
 +.implied_opt_name = "driver",
 +.desc = {
 +/*
 + * no elements => accept any params
 + * sanity checking will happen later
 + */
 +{ /* end of list */ }
 +},
 +};
 +
 +static void validate_per_direction_opts(AudiodevPerDirectionOptions *pdo,
 +Error **errp)
 +{
 +if (!pdo->has_fixed_settings) {
 +pdo->has_fixed_settings = true;
 +pdo->fixed_settings = true;
 +}
 +if (!pdo->fixed_settings &&
 +(pdo->has_frequency || pdo->has_channels || pdo->has_format)) {
 +error_setg(errp,
 +   "You can't use frequency, channels or format with 
 fixed-settings=off");
 +return;
 +}
 +
 +if (!pdo->has_frequency) {
 +pdo->has_frequency = true;
 +pdo->frequency = 44100;
 +}
 +if (!pdo->has_channels) {
 +pdo->has_channels = true;
 +pdo->channels = 2;
 +}
 +if (!pdo->has_voices) {
 +pdo->has_voices = true;
 +pdo->voices = 1;
 +}
 +if (!pdo->has_format) {
 +pdo->has_format = true;
 +pdo->format = AUDIO_FORMAT_S16;
 +}
 +}
 +
 +static Audiodev *parse_option(QemuOpts *opts, Error **errp)
 +{
 +Error *local_err = NULL;
 +Visitor *v = opts_visitor_new(opts, true);
 +Audiodev *dev = NULL;
 +visit_type_Audiodev(v, NULL, &dev, &local_err);
 +visit_free(v);
 +
 +if (local_err) {
 +goto err2;
 +}
 +
 +validate_per_direction_opts(dev->in, &local_err);
 +if (local_err) {
 +goto err;
 +}
 +validate_per_direction_opts(dev->out, &local_err);
 +if (local_err) {
 +goto err;
 +}
 +
 +if (!dev->has_timer_period) {
 +dev->has_timer_period = true;
 +dev->timer_period = 1; /* 100Hz -> 10ms */
 +}
 +
 +return dev;
 +
 +err:
 +qapi_free_Audiodev(dev);
 +err2:
 +error_propagate(errp, local_err);
 +return NULL;
 +}
 +
 +static int each_option(void *opaque, QemuOpts *opts, Error **errp)
 +{
 +Audiodev *dev = parse_option(opts, errp);
 +if (!dev) {
 +return -1;
 +}
 +return audio_init(dev);
 +}
 +
 +void audio_set_options(void)
 +{
 +if (qemu_opts_foreach(qemu_find_opts("audiodev"), each_option, NULL,
 +  &error_abort)) {
 +exit(1);
 +}
 +}
>>> [...]
 diff --git a/vl.c b/vl.c
 index 8353d3c718..b5364ffe46 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -3074,6 +3074,7 @@ int main(int argc, char **argv, char **envp)
  qemu_add_opts(&qemu_option_rom_opts);
  qemu_add_opts(&qemu_machine_opts);
  qemu_add_opts(&qemu_accel_opts);
 +qemu_add_opts(&qemu_audiodev_opts);
  qemu_add_opts(&qemu_mem_opts);
  qemu_add_opts(&qemu_smp_opts);
  qemu_add_opts(&qemu_boot_opts);
 @@ -3307,9 +3308,15 @@ int main(int argc, char **argv, char **envp)
  add_device_config(DEV_BT, optarg);
  break;
  case QEMU_OPTION_audio_help:
 -AUD_help ();
 +audio_legacy_help();
  exit (0);
  break;

[Qemu-devel] [PATCH v1 1/1] default-configs: Enable USB support for RISC-V machines

2019-01-09 Thread Alistair Francis

Signed-off-by: Alistair Francis 
---
 default-configs/riscv32-softmmu.mak | 1 +
 default-configs/riscv64-softmmu.mak | 1 +
 2 files changed, 2 insertions(+)

diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
index dbc9398284..c9c5971409 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -1,6 +1,7 @@
 # Default configuration for riscv-softmmu
 
 include pci.mak
+include usb.mak
 
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
diff --git a/default-configs/riscv64-softmmu.mak 
b/default-configs/riscv64-softmmu.mak
index dbc9398284..c9c5971409 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -1,6 +1,7 @@
 # Default configuration for riscv-softmmu
 
 include pci.mak
+include usb.mak
 
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
-- 
2.19.1

Re: [Qemu-devel] [PULL 00/29] ppc-for-4.0 queue 20190109

2019-01-09 Thread Peter Maydell

On Tue, 8 Jan 2019 at 22:46, David Gibson  wrote:
>
> The following changes since commit 147923b1a901a0370f83a0f4c58ec1baffef22f0:
>
>   Merge remote-tracking branch 
> 'remotes/kraxel/tags/usb-20190108-pull-request' into staging (2019-01-08 
> 16:07:32 +)
>
> are available in the Git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-4.0-20190109
>
> for you to fetch changes up to 3a8eb78e6c135422017888380db091793039b6dd:
>
>   spapr: enable XIVE MMIOs at reset (2019-01-09 09:28:14 +1100)
>
> 
> ppc patch queue 2019-01-09
>
> Second main pull request for qemu-4.0.  Highlights are:
>  * Final parts of XIVE support for pseries (without KVM)
>  * Preliminary work for PHB hotplug
>  * Starting to use TCG vector operations
>
> This includes some changes in the PCI core, which Michael Tsirkin
> requested come through this tree, since they're primarily of interest
> for ppc.
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM

Re: [Qemu-devel] [PATCH QEMU v5] hw/arm/sysbus-fdt: Add support for instantiating generic devices

2019-01-09 Thread Peter Maydell

So I should start out upfront by saying that I'm aware that
the reality is that people do want to do passthrough with
this kind of hardware, and that's not an unreasonable
thing to do. I just don't really like the way that pushes
the software into having to do ugly things...

Overall I'll let Eric call the shots about how well this
feature fits into our sysbus-fdt support; he knows this
code much better than I do. (I would have preferred us
not to have sysbus-fdt passthrough at all, in an
ideal world :-))

On Wed, 9 Jan 2019 at 16:30, Geert Uytterhoeven  wrote:
> On Wed, Jan 9, 2019 at 5:03 PM Peter Maydell  wrote:
> > whitelists for the devices we want to allow passthrough of and
> > that we've tested to work.
>
> In the end, this will become a loong list (SoC x devices)...

Well, yes, but it deters people from trying to do passthrough
with hardware that's not designed in a way that makes
passthrough reasonably supportable at a software level.

> Reality is that on embedded, on-SoC devices are usually not on a PCI bus.
> But IOMMUs are present, and virtualization is wanted.

I don't insist on PCI. Probeable, and consistent in
terms of what they provide and how you interact with
them, is what we want. Embedded on-SoC devices are
generally neither. (The host kernel could in theory
provide an API that exposed only devices that were safely
pass-through-able in a consistent way, I suppose, but it
doesn't, or we wouldn't be having to fish through the
device tree nodes making guesses about what's safe to
allow and what isn't and having heuristics about not
providing clocks info being ok if we think the clock
might only be used for power management...)

thanks
-- PMM

[Qemu-devel] [PATCH] vfio/common: Work around kernel overflow bug in DMA unmap

2019-01-09 Thread Alex Williamson

A kernel bug was introduced in v4.15 via commit 71a7d3d78e3c which
adds a test for address space wrap-around in the vfio DMA unmap path.
Unfortunately due to overflow, the kernel detects an unmap of the last
page in the 64-bit address space as a wrap-around.  In QEMU, a Q35
guest with VT-d emulation and guest IOMMU enabled will attempt to make
such an unmap request during VM system reset, triggering an error:

  qemu-kvm: VFIO_UNMAP_DMA: -22
  qemu-kvm: vfio_dma_unmap(0x561f059948f0, 0xfef0, 0x0110) = 
-22 (Invalid argument)

Here the IOVA start address (0xfef0) and the size parameter
(0x0110) add to exactly 2^64, triggering the bug.  A
kernel fix is queued for the Linux v5.0 release to address this.

This patch implements a workaround to retry the unmap, excluding the
final page of the range when we detect an unmap failing which matches
the requirements for this issue.  This is expected to be a safe and
complete workaround as the VT-d address space does not extend to the
full 64-bit space and therefore the last page should never be mapped.

This workaround can be removed once all kernels with this bug are
sufficiently deprecated.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
Reported-by: Pei Zhang 
Debugged-by: Peter Xu 
Signed-off-by: Alex Williamson 
---
 hw/vfio/common.c |   20 +++-
 hw/vfio/trace-events |1 +
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7c185e5a2e79..820b839057c6 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -220,7 +220,25 @@ static int vfio_dma_unmap(VFIOContainer *container,
 .size = size,
 };
 
-if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
+while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
+/*
+ * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
+ * v4.15) where an overflow in its wrap-around check prevents us from
+ * unmapping the last page of the address space.  Test for the error
+ * condition and re-try the unmap excluding the last page.  The
+ * expectation is that we've never mapped the last page anyway and this
+ * unmap request comes via vIOMMU support which also makes it unlikely
+ * that this page is used.  This bug was introduced well after type1 v2
+ * support was introduced, so we shouldn't need to test for v1.  A fix
+ * is queued for kernel v5.0 so this workaround can be removed once
+ * affected kernels are sufficiently deprecated.
+ */
+if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) &&
+container->iommu_type == VFIO_TYPE1v2_IOMMU) {
+trace_vfio_dma_unmap_overflow_workaround();
+unmap.size -= 1ULL << ctz64(container->pgsizes);
+continue;
+}
 error_report("VFIO_UNMAP_DMA: %d", -errno);
 return -errno;
 }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index a85e8662eadb..a002c6af2dda 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -110,6 +110,7 @@ vfio_region_mmaps_set_enabled(const char *name, bool 
enabled) "Region %s mmaps e
 vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) 
"Device %s region %d: %d sparse mmap entries"
 vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) 
"sparse entry %d [0x%lx - 0x%lx]"
 vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t 
subtype) "%s index %d, %08x/%0x8"
+vfio_dma_unmap_overflow_workaround(void) ""
 
 # hw/vfio/platform.c
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group 
#%d"

Re: [Qemu-devel] [PATCH] target/arm: Allow to switch from MON->HYP on AArch32

2019-01-09 Thread Peter Maydell

On Wed, 9 Jan 2019 at 17:14, Alexander Graf  wrote:
>
> On 01/09/2019 05:59 PM, Peter Maydell wrote:
> > On Wed, 9 Jan 2019 at 16:52, Peter Maydell  wrote:
> >> On Wed, 9 Jan 2019 at 15:26, Alexander Graf  wrote:
> >>> In U-boot, we switch from S-SVC -> MON -> HYP when we want to enter
> >>> HYP mode. This dance seems to work ok (hence it's there in the code
> >>> base), but breaks with current QEMU.
> > PS: it would be helpful if the commit message said how u-boot
> > is trying to go from Mon to Hyp -- some ways to try to do
> > this are OK, and some are not, so whether it's supposed to
> > work or not depends on what u-boot is actually doing...
>
> I don't fully understand all of it to be honest :). But the code is here:
>
> http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/cpu/armv7/nonsec_virt.S
>
> What I managed to understand so far is that it goes to MON using the smc
> #0 call and then changes SPSR so that on return (movs pc) the mode will
> be different.

Thanks -- yes, that's an exception return so it's the
expected way to go from Mon to Hyp.

-- PMM

Re: [Qemu-devel] Emulation of TCG OPAL self-encrypting drive

2019-01-09 Thread David Kozub


On Mon, 7 Jan 2019, Stefan Hajnoczi wrote:


QEMU supports LUKS encrypted disk images so no new code is needed for
the actual encryption.


Thanks for the feedback, Stefan. I know very little about qemu internals 
(I looked around a bit). One issue is: OPAL needs some persistent data 
outside of the actual user-visible data. How does that fit in with storage 
in QEMU? Perhaps the implementation could just occupy a fixed size of the 
associated storage for the OPAL state.



Or, just a pass-through to a block device in the host - but a pass-through
that would allow OPAL commands.


You can pass through a storage controller using PCI passthrough or you
can pass through a SCSI LUN, but there is no ATA passthrough.


I currently don't have a usable box for PCI passthrough. I'm thinking 
that ATA passthrough could be generally usable for any fiddling and 
perhaps not too difficult to implement.


If I understand QEMU sources correctly, this needs to touch hw/ide/core.c 
(ide_exec_cmd), either adding a layer for OPAL, or just forwarding ATA 
commands for pass-through. Right?


Best regards,
David

[Qemu-devel] [PATCH] ppc440: Avoid reporting error when reading non-existent RAM slot

When reading base register of RAM slot with no RAM we should not try
to calculate register value because that will result printing an error
due to invalid RAM size. Just return 0 without the error in this case.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index c489368905..9130eb314c 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -613,8 +613,10 @@ static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 case SDRAM_R1BAS:
 case SDRAM_R2BAS:
 case SDRAM_R3BAS:
-ret = sdram_bcr(sdram->ram_bases[dcrn - SDRAM_R0BAS],
-sdram->ram_sizes[dcrn - SDRAM_R0BAS]);
+if (sdram->ram_sizes[dcrn - SDRAM_R0BAS]) {
+ret = sdram_bcr(sdram->ram_bases[dcrn - SDRAM_R0BAS],
+sdram->ram_sizes[dcrn - SDRAM_R0BAS]);
+}
 break;
 case SDRAM_CONF1HB:
 case SDRAM_CONF1LL:
-- 
2.13.7

Re: [Qemu-devel] [PATCH v1 0/3] gitdm updates

On Tuesday, January 8, 2019, Aleksandar Markovic <
aleksandar.m.m...@gmail.com> wrote:

>
>
> On Monday, January 7, 2019, Alex Bennée  wrote:
>
>>
>> Hi,
>>
>> Added a few more updates mostly of IBMers with non corporate emails.
>
>
> Hi, Alex, could you please add
>
> leon.al...@imgtec.com
>
> to Wave Computing group?
>
> (my email client automatically inserts and displays "ae" as one letter,
> the last name should be ending with a e, a-l-r-a-e.)
>
> I have some patches from Leon that will be upstreamed soon, with his email
> as above.
>
> I am on days-off and it is difficult for me to create and send patches, so
> I ask you toake this minor correction.
>
> Thanks, Aleksandar
>
>
>
>> The year-end stats as per:
>>
>>  git log --numstat --after="1/1/2018 00:00" \
>>--before="31/12/2018 23:59" | gitdm -n -l 10
>>
>> Are:
>>
>>   Top changeset contributors by employer
>>   Red Hat   3091 (43.3%)
>>   Linaro1201 (16.8%)
>>   (None) 484 (6.8%)
>>   IBM426 (6.0%)
>>   Academics (various)186 (2.6%)
>>   Virtuozzo  172 (2.4%)
>>   Wave Computing 118 (1.7%)
>>   Igalia 109 (1.5%)
>>   Xilinx 102 (1.4%)
>>   Cadence Design Systems  80 (1.1%)
>>
>>   Top lines changed by employer
>>   Red Hat   140523 (30.3%)
>>   Cadence Design Systems81010 (17.5%)
>>   Linaro78098 (16.8%)
>>   Wave Computing33134 (7.1%)
>>   IBM   18918 (4.1%)
>>   SiFive14436 (3.1%)
>>   Academics (various)   11995 (2.6%)
>>   (None)11458 (2.5%)
>>   Virtuozzo 10770 (2.3%)
>>   Oracle6698 (1.4%)
>>
>> Alex Bennée (2):
>>   contrib/gitdm: add Nokia and Proxmox to the domain-map
>>   contrib/gitdm: add two more IBM'ers to group-map-ibm
>>
>> Joel Stanley (1):
>>   contrib/gitdm: Add other IBMers
>>
>>  contrib/gitdm/domain-map| 2 ++
>>  contrib/gitdm/group-map-ibm | 7 +++
>>  2 files changed, 9 insertions(+)
>>
>> --
>> 2.17.1
>>
>>
>>
If you create a patch for Wave Computing as I described it also has:

 Reviewed-by: Aleksandar Markovic

Re: [Qemu-devel] [PATCH v1 1/3] contrib/gitdm: add Nokia and Proxmox to the domain-map

On Monday, January 7, 2019, Alex Bennée  wrote:

> Signed-off-by: Alex Bennée 
> ---
>  contrib/gitdm/domain-map | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/contrib/gitdm/domain-map b/contrib/gitdm/domain-map
> index 8cbbcfe93d..0ab41ee27a 100644
> --- a/contrib/gitdm/domain-map
> +++ b/contrib/gitdm/domain-map
> @@ -9,7 +9,9 @@ greensocs.com   GreenSocs
>  ibm.com IBM
>  igalia.com  Igalia
>  linaro.org  Linaro
> +nokia.com   Nokia
>  oracle.com  Oracle
> +proxmox.com Proxmox
>  redhat.com  Red Hat
>  siemens.com Siemens
>  sifive.com  SiFive
> --
> 2.17.1
>
>
>
 Reviewed-by: Aleksandar Markovic

Re: [Qemu-devel] [PATCH v1 2/3] contrib/gitdm: Add other IBMers

On Monday, January 7, 2019, Alex Bennée  wrote:

> From: Joel Stanley 
>
> Here are some IBMers who use their personal addresses when submitting
> patches.
>
> Signed-off-by: Joel Stanley 
> Acked-by: Andrew Jeffery 
> Signed-off-by: Alex Bennée 
> ---
>  contrib/gitdm/group-map-ibm | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/contrib/gitdm/group-map-ibm b/contrib/gitdm/group-map-ibm
> index b66db5f4a8..6c0570107d 100644
> --- a/contrib/gitdm/group-map-ibm
> +++ b/contrib/gitdm/group-map-ibm
> @@ -2,5 +2,10 @@
>  # Some IBM contributors submit via another domain
>  #
>
> +a...@ozlabs.ru
> +and...@aj.id.au
> +b...@kernel.crashing.org
>  c...@kaod.org
>  gr...@kaod.org
> +j...@jms.id.au
> +sjitindarsi...@gmail.com
> --
> 2.17.1
>
>
 Reviewed-by: Aleksandar Markovic

Re: [Qemu-devel] [PATCH v1 3/3] contrib/gitdm: add two more IBM'ers to group-map-ibm

On Monday, January 7, 2019, Alex Bennée  wrote:

> Signed-off-by: Alex Bennée 
> ---
>  contrib/gitdm/group-map-ibm | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/contrib/gitdm/group-map-ibm b/contrib/gitdm/group-map-ibm
> index 6c0570107d..22727319b3 100644
> --- a/contrib/gitdm/group-map-ibm
> +++ b/contrib/gitdm/group-map-ibm
> @@ -6,6 +6,8 @@ a...@ozlabs.ru
>  and...@aj.id.au
>  b...@kernel.crashing.org
>  c...@kaod.org
> +danielhb...@gmail.com
>  gr...@kaod.org
> +jcfara...@gmail.com
>  j...@jms.id.au
>  sjitindarsi...@gmail.com
> --
> 2.17.1
>
>
>
 Reviewed-by: Aleksandar Markovic

[Qemu-devel] [PULL v6 00/35] Misc patches for 2018-12-18

The following changes since commit c102d9471f8f02d9fbea72ec4505d7089173f470:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190107' 
into staging (2019-01-07 16:56:33 +)

are available in the Git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 5eddc84292736dc57db341f9724395a29d3614f7:

  avoid TABs in files that only contain a few (2019-01-09 23:01:14 +0100)


* HAX support for Linux hosts (Alejandro)
* esp bugfixes (Guenter)
* Windows build cleanup (Marc-André)
* checkpatch logic improvements (Paolo)
* coalesced range bugfix (Paolo)
* switch testsuite to TAP (Paolo)
* QTAILQ rewrite (Paolo)
* block/iscsi.c cancellation fixes (Stefan)
* improve selection of the default accelerator (Thomas)


Alexandro Sanchez Bach (1):
  hax: Support for Linux hosts

Guenter Roeck (2):
  esp-pci: Fix status register write erase control
  scsi: esp: Defer command completion until previous interrupts have been 
handled

Marc-André Lureau (4):
  build-sys: don't include windows.h, osdep.h does it
  build-sys: move windows defines in osdep.h header
  build-sys: build with Vista API by default
  qga: drop < Vista compatibility

Paolo Bonzini (21):
  checkpatch: fix premature exit when no input or --mailback
  checkpatch: check Signed-off-by in --mailback mode
  checkpatch: improve handling of multiple patches or files
  checkpatch: colorize output to terminal
  pam: wrap MemoryRegion initialization in a transaction
  memory: extract flat_range_coalesced_io_{del,add}
  memory: avoid unnecessary coalesced_io_del operations
  memory: update coalesced_range on transaction_commit
  test: execute g_test_run when tests are skipped
  test: replace gtester with a TAP driver
  qemu/queue.h: do not access tqe_prev directly
  vfio: make vfio_address_spaces static
  qemu/queue.h: leave head structs anonymous unless necessary
  qemu/queue.h: typedef QTAILQ heads
  qemu/queue.h: remove Q_TAILQ_{HEAD,ENTRY}
  qemu/queue.h: reimplement QTAILQ without pointer-to-pointers
  qemu/queue.h: simplify reverse access to QTAILQ
  checkpatch: warn about qemu/queue.h head structs that are not typedef-ed
  scripts: add script to convert multiline comments into 4-line format
  remove space-tab sequences
  avoid TABs in files that only contain a few

Peng Hao (1):
  hw/watchdog/wdt_i6300esb: remove a unnecessary comment

Stefan Hajnoczi (4):
  block/iscsi: drop unused IscsiAIOCB->buf field
  block/iscsi: take iscsilun->mutex in iscsi_timed_check_events()
  block/iscsi: fix ioctl cancel use-after-free
  block/iscsi: cancel libiscsi task when ABORT TASK TMF completes

Thomas Huth (1):
  accel: Improve selection of the default accelerator

 accel/accel.c|  18 +-
 accel/kvm/kvm-all.c  |   4 +-
 accel/tcg/translate-all.c|   4 -
 block/bochs.c|  22 +-
 block/file-posix.c   |   2 +-
 block/file-win32.c   |   8 +-
 block/gluster.c  |   2 +-
 block/iscsi.c|  47 +++-
 block/linux-aio.c|   4 +-
 block/mirror.c   |   2 +-
 block/qcow2-bitmap.c |   4 +-
 block/qcow2-cluster.c|   2 +-
 block/qcow2.h|   5 +-
 block/sheepdog.c |   6 +-
 block/vhdx.h |   2 +-
 block/vpc.c  |   2 +-
 blockdev.c   |   4 +-
 bsd-user/elfload.c   |   2 +-
 bsd-user/x86_64/target_syscall.h |   2 +-
 configure|   3 -
 contrib/elf2dmp/main.c   |   2 +-
 contrib/ivshmem-client/ivshmem-client.h  |   4 +-
 contrib/ivshmem-server/ivshmem-server.h  |   5 +-
 cpus-common.c|   2 +-
 crypto/aes.c |  28 +-
 disas/alpha.c|   8 +-
 disas/arm.c  |   2 +-
 disas/i386.c |   4 +-
 disas/m68k.c |   4 +-
 dump.c   |   2 +-
 exec.c   |   5 +-
 fsdev/qemu-fsdev.c   |   2 +-
 hw/alpha/typhoon.c   |  12 +-
 hw/arm/stellaris.c   |   2 +-
 hw/block/nvme.h  |   8 +-
 hw/block/xen_disk.c  |   6 +-
 hw/char/sh_serial.c

[Qemu-devel] [PULL 23/35] test: replace gtester with a TAP driver

gtester is deprecated by upstream glib (see for example the announcement
at https://blog.gtk.org/2018/07/11/news-from-glib-2-58/) and it does
not support tests that call g_test_skip in some glib stable releases.

glib suggests instead using Automake's TAP support, which gtest itself
supports since version 2.38 (QEMU's minimum requirement is 2.40).
We do not support Automake, but we can use Automake's code to beautify
the TAP output.  I chose to use the Perl copy rather than the shell/awk
one, with some changes so that it can accept TAP through stdin, in order
to reuse Perl's TAP parsing package.  This also avoids duplicating the
parser between tap-driver.pl and tap-merge.pl.

Signed-off-by: Paolo Bonzini 
Message-Id: <1543513531-1151-3-git-send-email-pbonz...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Paolo Bonzini 
---
 rules.mak|   4 +-
 scripts/gtester-cat  |  26 --
 scripts/tap-driver.pl| 378 +++
 scripts/tap-merge.pl | 110 ++
 tests/Makefile.include   |  75 ++--
 tests/docker/dockerfiles/centos7.docker  |   1 +
 tests/docker/dockerfiles/debian-amd64.docker |   1 +
 tests/docker/dockerfiles/debian-ports.docker |   1 +
 tests/docker/dockerfiles/debian-sid.docker   |   1 +
 tests/docker/dockerfiles/debian8.docker  |   1 +
 tests/docker/dockerfiles/debian9.docker  |   1 +
 tests/docker/dockerfiles/fedora.docker   |   1 +
 tests/docker/dockerfiles/ubuntu.docker   |   1 +
 13 files changed, 549 insertions(+), 52 deletions(-)
 delete mode 100755 scripts/gtester-cat
 create mode 100755 scripts/tap-driver.pl
 create mode 100755 scripts/tap-merge.pl

diff --git a/rules.mak b/rules.mak
index bbb2667928..86e033d815 100644
--- a/rules.mak
+++ b/rules.mak
@@ -132,7 +132,9 @@ modules:
 #  otherwise print the 'quiet' output in the format "  NAME args to print"
 # NAME should be a short name of the command, 7 letters or fewer.
 # If called with only a single argument, will print nothing in quiet mode.
-quiet-command = $(if $(V),$1,$(if $(2),@printf "  %-7s %s\n" $2 $3 && $1, @$1))
+quiet-command-run = $(if $(V),,$(if $2,printf "  %-7s %s\n" $2 $3 && ))$1
+quiet-@ = $(if $(V),,@)
+quiet-command = $(quiet-@)$(call quiet-command-run,$1,$2,$3)
 
 # cc-option
 # Usage: CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0)
diff --git a/scripts/gtester-cat b/scripts/gtester-cat
deleted file mode 100755
index 061a952cad..00
--- a/scripts/gtester-cat
+++ /dev/null
@@ -1,26 +0,0 @@
-#!/bin/sh
-#
-# Copyright IBM, Corp. 2012
-#
-# Authors:
-#  Anthony Liguori 
-#
-# This work is licensed under the terms of the GNU GPLv2 or later.
-# See the COPYING file in the top-level directory.
-
-cat <
-
- 
-  qemu
-  0.0
-  rev
- 
-EOF
-
-sed \
-  -e '/$/d' \
-  -e '//,/<\/info>/d' \
-  -e '$b' \
-  -e '/^<\/gtester>$/d' "$@"
diff --git a/scripts/tap-driver.pl b/scripts/tap-driver.pl
new file mode 100755
index 00..5e59b5db49
--- /dev/null
+++ b/scripts/tap-driver.pl
@@ -0,0 +1,378 @@
+#! /usr/bin/env perl
+# Copyright (C) 2011-2013 Free Software Foundation, Inc.
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2, or (at your option)
+# any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+# As a special exception to the GNU General Public License, if you
+# distribute this file as part of a program that contains a
+# configuration script generated by Autoconf, you may include it under
+# the same distribution terms that you use for the rest of that program.
+
+# -- #
+#  Imports, static data, and setup.  #
+# -- #
+
+use warnings FATAL => 'all';
+use strict;
+use Getopt::Long ();
+use TAP::Parser;
+use Term::ANSIColor qw(:constants);
+
+my $ME = "tap-driver.pl";
+my $VERSION = "2018-11-30";
+
+my $USAGE = <<'END';
+Usage:
+  tap-driver [--test-name=TEST] [--color={always|never|auto}]
+ [--verbose] [--show-failures-only]
+END
+
+my $HELP = "$ME: TAP-aware test driver for QEMU testsuite harness." .
+   "\n" . $USAGE;
+
+# It's important that NO_PLAN evaluates "false" as a boolean.
+use constant NO_PLAN => 0;
+use constant EARLY_PLAN => 1;
+use constant LATE_PLAN => 2;
+
+use constant DIAG_STRING => "#";
+
+# --- #
+#  Global variables.  #
+# --- #
+
+my $testno = 0; # Number of test

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

On 09/01/19 22:25, Eric Blake wrote:
> On 1/9/19 3:20 PM, Paolo Bonzini wrote:
>> On 09/01/19 18:28, Daniel P. Berrangé wrote:
 so both files include each other, how nice ...
>>> If the header files are mutually dependent it makes me wonder what the
>>> point of having them split up is ?
>>>
>>> Feels like either they need to be merged, or they need to be split up
>>> and refactored even more to remove the mutual dependancy.
>>
>> If they include each other only for the typedefs, then prehaps the
>> solution is to change the coding style and allow using struct in
>> function prototypes.  I'm pretty sure there are several examples of this
>> already.
> 
> Or stick the typedef in , instead of trying to find (or
> create) some other common header.

Putting typedefs for a very specific machine or device in typedefs.h
doesn't seem too clean.  Usually typedefs.h is used when things are
shared by different subsystems (e.g. accelerator and device emulation,
or frontend and backend), and I'm pretty sure that it could be made even
smaller if someone was inclined to try.

Paolo



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 2/2] configure: Force the C standard to gnu99

On 09/01/19 19:10, Philippe Mathieu-Daudé wrote:
> Using '-std=gnu++98' for g++ v4.8 looks like a good compromise to the
> issues Daniel mentioned (still 'experimental').
> 
> Reviewed-by: Philippe Mathieu-Daudé 

C++11 has many new features that almost make it an entirely different
language (the main being rvalue references, and type inference with auto
and decltype).  If it works for GCC 4.8, I would prefer using that
instead of g++98.

Paolo

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

On 1/9/19 3:20 PM, Paolo Bonzini wrote:
> On 09/01/19 18:28, Daniel P. Berrangé wrote:
>>> so both files include each other, how nice ...
>> If the header files are mutually dependent it makes me wonder what the
>> point of having them split up is ?
>>
>> Feels like either they need to be merged, or they need to be split up
>> and refactored even more to remove the mutual dependancy.
> 
> If they include each other only for the typedefs, then prehaps the
> solution is to change the coding style and allow using struct in
> function prototypes.  I'm pretty sure there are several examples of this
> already.

Or stick the typedef in , instead of trying to find (or
create) some other common header.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 2/2] configure: Force the C standard to gnu99

On 1/10/19 3:39 AM, Thomas Huth wrote:
> Different versions of GCC and Clang use different versions of the C standard.
> This repeatedly caused problems already, e.g. with duplicated typedefs:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg05829.html
> 
> or with for-loop variable initializers:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg00237.html
> 
> To avoid these problems, we should enforce the C language version to the
> same level for all compilers. Since our minimum compiler versions is
> GCC v4.8, our best option is "gnu99" for C code right now ("gnu17" is not
> available there yet, and "gnu11" is marked as "experimental"), and "gnu++98"
> for the few C++ code that we have in the repository.
> 
> Signed-off-by: Thomas Huth 
> ---
>  v3: Compile C++ code with -std=gnu++98

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

On 09/01/19 18:28, Daniel P. Berrangé wrote:
>> so both files include each other, how nice ...
> If the header files are mutually dependent it makes me wonder what the
> point of having them split up is ?
> 
> Feels like either they need to be merged, or they need to be split up
> and refactored even more to remove the mutual dependancy.

If they include each other only for the typedefs, then prehaps the
solution is to change the coding style and allow using struct in
function prototypes.  I'm pretty sure there are several examples of this
already.

Paolo

Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot

2019-01-09 Thread Maran Wilson


On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:

On 1/9/19 6:53 AM, Stefano Garzarella wrote:

Hi Liam,

On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick  wrote:

QEMU sets the hvm_modlist_entry in load_linux() after the call to
load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()

But the current PVH patches don't handle initrd (they have
start_info.nr_modules == 1).

Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
 /* The first module is always ramdisk. */
 if (pvh_start_info.nr_modules) {
 struct hvm_modlist_entry *modaddr =
 __va(pvh_start_info.modlist_paddr);
 pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
 pvh_bootparams.hdr.ramdisk_size = modaddr->size;
 }

So, putting start_info.nr_modules = 1, means that the first
hvm_modlist_entry should have the ramdisk paddr and size. Is it
correct?


That's my understanding.

I think what's missing, is that we just need Qemu or qboot/seabios to 
properly populate the pvh_start_info.modlist_paddr with the address (as 
usable by the guest) of the hvm_modlist_entry which correctly defines 
the details of the initrd that has already been loaded into memory that 
is accessible by the guest.


-Maran


During (or after) the call to load_elfboot() it looks like we'd need to
do something like what load_multiboot() does below (along with the
associated initialisation)

400 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
401 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
402 fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
403  sizeof(bootinfo));


In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
varibales to the guest, maybe we could add something like what
linux_load() does:

 /* load initrd */
 if (initrd_filename) {
 ...
 initrd_addr = (initrd_max-initrd_size) & ~4095;

 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
 ...
 }

Then we can load the initrd in qboot or in the optionrom that I'm writing.

What do you think?


Why not specify this in pvh_start_info? This will be much faster for
everyone, no need to go through fw_cfg.

-boris

Re: [Qemu-devel] Internship idea: I2C passthrough

On 09/01/19 19:23, BALATON Zoltan wrote:
>> '''Summary:''' Implement I2C bus passthrough on Linux hosts so that
>> emulated Raspberry Pi or micro:bit boards can talk to real I2C
>> devices. 
>
> How about implementing USB for raspi boards instead? Or is there a
> project proposal (or even some code somewhere out of tree) for that
> already? Seems more useful to allow having emulated keyboard and mouse
> to use Raspbian for example or pass through USB devices. (I don't know
> anything about this was just reminded to a recent discussion by this
> message. If it's not relevant just disregard my comment and sorry for
> the noise.)

That's also a valid idea.  However, Stefan has misinterpreted the
reference to the raspi that I quickly made on IRC.  The idea of I2C
passthrough is that a single board computer like a Raspberry Pi can be
used to develop application for a microcontroller such as micro:bit,
without having to flash the firmware again and again, etc.

Real I2C devices---for example a GPIO extender, a temperature sensor or
a real-time clock---would be attached to the I2C bus on the single board
computer's GPIO headers, exposed as an I2C device through /dev/i2c, and
then driven by the emulated micro:bit through its own I2C controller (or
bitbanged).

(I've now updated the project idea).

Paolo

Re: [Qemu-devel] [PATCH v1 5/6] scripts/archive-source: include softfloat tests

On 1/9/19 3:21 AM, Alex Bennée wrote:
> We need these if we want to run unit/softfloat tests in our docker
> containers.
> 
> Signed-off-by: Alex Bennée 
> ---
>  scripts/archive-source.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v1 4/6] tests/Makefile: add floating point tests

On 1/9/19 3:21 AM, Alex Bennée wrote:
> + le_quite)

quiet.  Several instances.


r~

Re: [Qemu-devel] [PATCH for-3.2 01/11] vhost-user: define conventions for vhost-user backends

Hi

On Wed, Jan 9, 2019 at 12:45 PM Hoffmann, Gerd  wrote:
>
>   Hi,
>
> > Unfortunately, vdev is not set before vhost_dev_start().
> >
> > We could add the migration blocker there somehow?
>
> Sure.  Just use migrate_add_blocker() to do that at any time (see qxl.c
> for an example).

VhostUserInput inherits from VirtioInput, which implements vmsd.

The "Add vhost-user-input-pci" patch override the DeviceClass vmsd to
set it as unmigratable. If I understand correctly, Michael suggested
to add a check for device vmsd == NULL in hw/virtio/vhost-user.c
instead.

However, vhost-user devices would still need to overwrite vmsd to NULL.

I don't think there is a benefit in the generic vmsd == NULL check, as
you still need not to forget to overwrite vmsd to NULL in the
vhost-user device.

Am I missing something?

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH 0/4] tcg: support heterogenous CPU clusters

On 1/9/19 3:30 AM, Peter Maydell wrote:
> Peter Maydell (4):
>   hw/arm/xlx-zynqmp: Realize cluster after putting RPUs in it
>   qom/cpu: Add cluster_index to CPUState
>   accel/tcg: Add cluster number to TCG TB hash
>   gdbstub: Simplify gdb_get_cpu_pid() to use cpu->cluster_index

Reviewed-by: Richard Henderson 


r~

[Qemu-devel] [PATCH] hw/pvrdma: Make function pvrdma_qp_send/recv return void.

The functions handles errors internaly, callers have nothing to do with
the return value.

Signed-off-by: Yuval Shaia 
---
 hw/rdma/vmw/pvrdma_qp_ops.c | 14 ++
 hw/rdma/vmw/pvrdma_qp_ops.h |  4 ++--
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/hw/rdma/vmw/pvrdma_qp_ops.c b/hw/rdma/vmw/pvrdma_qp_ops.c
index 0565eba981..ce5a60e184 100644
--- a/hw/rdma/vmw/pvrdma_qp_ops.c
+++ b/hw/rdma/vmw/pvrdma_qp_ops.c
@@ -143,7 +143,7 @@ int pvrdma_qp_ops_init(void)
 return 0;
 }
 
-int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
+void pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
 {
 RdmaRmQP *qp;
 PvrdmaSqWqe *wqe;
@@ -155,7 +155,8 @@ int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
 
 qp = rdma_rm_get_qp(&dev->rdma_dev_res, qp_handle);
 if (unlikely(!qp)) {
-return -EINVAL;
+pr_dbg("Invalid qpn\n");
+return;
 }
 
 ring = (PvrdmaRing *)qp->opaque;
@@ -212,11 +213,9 @@ int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
 
 wqe = pvrdma_ring_next_elem_read(ring);
 }
-
-return 0;
 }
 
-int pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle)
+void pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle)
 {
 RdmaRmQP *qp;
 PvrdmaRqWqe *wqe;
@@ -226,7 +225,8 @@ int pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle)
 
 qp = rdma_rm_get_qp(&dev->rdma_dev_res, qp_handle);
 if (unlikely(!qp)) {
-return -EINVAL;
+pr_dbg("Invalid qpn\n");
+return;
 }
 
 ring = &((PvrdmaRing *)qp->opaque)[1];
@@ -262,8 +262,6 @@ int pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle)
 
 wqe = pvrdma_ring_next_elem_read(ring);
 }
-
-return 0;
 }
 
 void pvrdma_cq_poll(RdmaDeviceResources *dev_res, uint32_t cq_handle)
diff --git a/hw/rdma/vmw/pvrdma_qp_ops.h b/hw/rdma/vmw/pvrdma_qp_ops.h
index ac46bf7fdf..31cb48ba29 100644
--- a/hw/rdma/vmw/pvrdma_qp_ops.h
+++ b/hw/rdma/vmw/pvrdma_qp_ops.h
@@ -20,8 +20,8 @@
 
 int pvrdma_qp_ops_init(void);
 void pvrdma_qp_ops_fini(void);
-int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle);
-int pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle);
+void pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle);
+void pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle);
 void pvrdma_cq_poll(RdmaDeviceResources *dev_res, uint32_t cq_handle);
 
 #endif
-- 
2.17.2

[Qemu-devel] [PATCH] hw/rdma: Delete unused struct member

This member is used only in init_device_caps function, make it local.

Signed-off-by: Yuval Shaia 
---
 hw/rdma/rdma_backend.c  | 26 ++
 hw/rdma/rdma_backend_defs.h |  1 -
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index 16dca69ee9..b49edaacaf 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -917,23 +917,25 @@ void rdma_backend_destroy_qp(RdmaBackendQP *qp)
 static int init_device_caps(RdmaBackendDev *backend_dev,
 struct ibv_device_attr *dev_attr)
 {
-if (ibv_query_device(backend_dev->context, &backend_dev->dev_attr)) {
+struct ibv_device_attr bk_dev_attr;
+
+if (ibv_query_device(backend_dev->context, &bk_dev_attr)) {
 return -EIO;
 }
 
 dev_attr->max_sge = MAX_SGE;
 
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_mr_size, "%" PRId64);
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_qp, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_sge, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_qp_wr, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_cq, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_cqe, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_mr, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_pd, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_qp_rd_atom, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_qp_init_rd_atom, "%d");
-CHK_ATTR(dev_attr, backend_dev->dev_attr, max_ah, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_mr_size, "%" PRId64);
+CHK_ATTR(dev_attr, bk_dev_attr, max_qp, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_sge, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_qp_wr, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_cq, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_cqe, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_mr, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_pd, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_qp_rd_atom, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_qp_init_rd_atom, "%d");
+CHK_ATTR(dev_attr, bk_dev_attr, max_ah, "%d");
 
 return 0;
 }
diff --git a/hw/rdma/rdma_backend_defs.h b/hw/rdma/rdma_backend_defs.h
index 1e5c3dd3bf..15ae8b970e 100644
--- a/hw/rdma/rdma_backend_defs.h
+++ b/hw/rdma/rdma_backend_defs.h
@@ -41,7 +41,6 @@ typedef struct RdmaCmMux {
 } RdmaCmMux;
 
 typedef struct RdmaBackendDev {
-struct ibv_device_attr dev_attr;
 RdmaBackendThread comp_thread;
 PCIDevice *dev;
 RdmaDeviceResources *rdma_dev_res;
-- 
2.17.2

[Qemu-devel] [PATCH] hw/pvrdma: Post CQE when receive invalid gid index

This error should propagate back to guest.

Signed-off-by: Yuval Shaia 
---
 hw/rdma/rdma_backend.h  | 1 +
 hw/rdma/vmw/pvrdma_qp_ops.c | 6 --
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/rdma/rdma_backend.h b/hw/rdma/rdma_backend.h
index a9ba40ae48..5114c90e67 100644
--- a/hw/rdma/rdma_backend.h
+++ b/hw/rdma/rdma_backend.h
@@ -32,6 +32,7 @@
 #define VENDOR_ERR_INVLKEY  0x207
 #define VENDOR_ERR_MR_SMALL 0x208
 #define VENDOR_ERR_INV_MAD_BUFF 0x209
+#define VENDOR_ERR_INV_GID_IDX  0x210
 
 /* Add definition for QP0 and QP1 as there is no userspace enums for them */
 enum ibv_special_qp_type {
diff --git a/hw/rdma/vmw/pvrdma_qp_ops.c b/hw/rdma/vmw/pvrdma_qp_ops.c
index 465bee8641..0565eba981 100644
--- a/hw/rdma/vmw/pvrdma_qp_ops.c
+++ b/hw/rdma/vmw/pvrdma_qp_ops.c
@@ -178,7 +178,8 @@ int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
 sgid = rdma_rm_get_gid(&dev->rdma_dev_res, 
wqe->hdr.wr.ud.av.gid_index);
 if (!sgid) {
 pr_dbg("Fail to get gid for idx %d\n", 
wqe->hdr.wr.ud.av.gid_index);
-return -EIO;
+complete_with_error(VENDOR_ERR_INV_GID_IDX, comp_ctx);
+continue;
 }
 pr_dbg("sgid_id=%d, sgid=0x%llx\n", wqe->hdr.wr.ud.av.gid_index,
sgid->global.interface_id);
@@ -189,7 +190,8 @@ int pvrdma_qp_send(PVRDMADev *dev, uint32_t qp_handle)
 if (sgid_idx <= 0) {
 pr_dbg("Fail to get bk sgid_idx for sgid_idx %d\n",
wqe->hdr.wr.ud.av.gid_index);
-return -EIO;
+complete_with_error(VENDOR_ERR_INV_GID_IDX, comp_ctx);
+continue;
 }
 
 if (wqe->hdr.num_sge > dev->dev_attr.max_sge) {
-- 
2.17.2

Re: [Qemu-devel] [qemu-s390x] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2019-01-09 Thread Jason J. Herne


On 1/9/19 1:34 PM, Cornelia Huck wrote:

On Wed, 9 Jan 2019 13:10:26 -0500
"Jason J. Herne"  wrote:


On 1/7/19 2:02 PM, Jason J. Herne wrote:

+
   /*
    * sense-id response buffer layout
    */
@@ -205,6 +265,61 @@ typedef struct senseid {
   struct ciw ciw[62];
   }  __attribute__ ((packed, aligned(4))) SenseId;
+/* architected values for first sense byte */
+#define SNS0_CMD_REJECT 0x80
+#define SNS0_INTERVENTION_REQ   0x40
+#define SNS0_BUS_OUT_CHECK  0x20
+#define SNS0_EQUIPMENT_CHECK0x10
+#define SNS0_DATA_CHECK 0x08
+#define SNS0_OVERRUN0x04
+#define SNS0_INCOMPL_DOMAIN 0x01


IIRC, only byte 0 is device independent, and the others below are
(ECKD) dasd specific?
  

+
+/* architectured values for second sense byte */
+#define SNS1_PERM_ERR   0x80
+#define SNS1_INV_TRACK_FORMAT   0x40
+#define SNS1_EOC0x20
+#define SNS1_MESSAGE_TO_OPER0x10
+#define SNS1_NO_REC_FOUND   0x08
+#define SNS1_FILE_PROTECTED 0x04
+#define SNS1_WRITE_INHIBITED0x02
+#define SNS1_INPRECISE_END  0x01
+
+/* architectured values for third sense byte */
+#define SNS2_REQ_INH_WRITE  0x80
+#define SNS2_CORRECTABLE0x40
+#define SNS2_FIRST_LOG_ERR  0x20
+#define SNS2_ENV_DATA_PRESENT   0x10
+#define SNS2_INPRECISE_END  0x04
+
+/* 24-byte Sense fmt/msg codes */
+#define SENSE24_FMT_PROG_SYS0x0
+#define SENSE24_FMT_EQUIPMENT   0x2
+#define SENSE24_FMT_CONTROLLER  0x3
+#define SENSE24_FMT_MISC0xF
+
+#define SENSE24_FMT0_MSG_RESET_NOTIFICATION 0x16
+
+/* basic sense response buffer layout */
+typedef struct senseData {
+uint8_t status[3];
+uint8_t res_count;
+uint8_t phys_drive_id;
+uint8_t low_cyl_addr;
+uint8_t head_high_cyl_addr;
+uint8_t fmt_msg;
+uint64_t fmt_dependent_info[2];
+uint8_t reserved;
+uint8_t program_action_code;
+uint16_t config_info;
+uint8_t mcode_hicyl;
+uint8_t cyl_head_addr[3];
+}  __attribute__ ((packed, aligned(4))) SenseData;


And this looks _really_ dasd specific.
  


Yep, I glossed over those details while I was furiously tracking down the reset 
bug. I'll
take a look at redesigning this.


All of my information for creating these data structures came from an internal 
ECKD DASD
reference. There are probably some things that could stand a bit of cleanup or 
renaming.
Aside from that, considering this is in a DASD only (ECKD DASD only at the 
moment) code
path are you okay with my renaming the struct to senseDataECKD or something 
similar?


Renaming this makes sense.


I'm not sure what value there is in abstracting sense at the moment. I'm not 
even sure
what other device's sense data looks like. Since my description of the SENSE 
CCW comes
from an ECKD reference I have not been able to verify any areas of the data 
that are
common across device types. Thoughts?


There's SA22-7204-01 ("Common I/O Device Commands"), which is from 1992
(this is what I have on my disk -- is there anything newer?). It
specifies what bits 0-5 of byte 0 mean and states that bytes 1-31 are
optional and device-specific.

Maybe some other bits have been specified after 1992, but I have not
come across documentation for them.


That publication is no longer available. According to my quick research it has been 
replaced by an internal only publication. I'll see what I can find.



--
-- Jason J. Herne (jjhe...@linux.ibm.com)

Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot

2019-01-09 Thread Boris Ostrovsky

On 1/9/19 6:53 AM, Stefano Garzarella wrote:
> Hi Liam,
>
> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick  wrote:
>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>
>> But the current PVH patches don't handle initrd (they have
>> start_info.nr_modules == 1).
> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
> /* The first module is always ramdisk. */
> if (pvh_start_info.nr_modules) {
> struct hvm_modlist_entry *modaddr =
> __va(pvh_start_info.modlist_paddr);
> pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
> pvh_bootparams.hdr.ramdisk_size = modaddr->size;
> }
>
> So, putting start_info.nr_modules = 1, means that the first
> hvm_modlist_entry should have the ramdisk paddr and size. Is it
> correct?
>
>
>> During (or after) the call to load_elfboot() it looks like we'd need to
>> do something like what load_multiboot() does below (along with the
>> associated initialisation)
>>
>> 400 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
>> 401 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
>> 402 fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
>> 403  sizeof(bootinfo));
>>
> In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
> varibales to the guest, maybe we could add something like what
> linux_load() does:
>
> /* load initrd */
> if (initrd_filename) {
> ...
> initrd_addr = (initrd_max-initrd_size) & ~4095;
>
> fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, 
> initrd_size);
> ...
> }
>
> Then we can load the initrd in qboot or in the optionrom that I'm writing.
>
> What do you think?


Why not specify this in pvh_start_info? This will be much faster for
everyone, no need to go through fw_cfg.

-boris

[Qemu-devel] [PATCH] hw/pvrdma: Remove max-sge command-line param

This parameter has no effect, fix it.

The function init_dev_caps sets the front-end's max-sge to MAX_SGE. Then
it checks backend's max-sge and adjust it accordingly (we can't send
more than what the device supports).

On send and recv we need to make sure the num_sge in the WQE does not
exceeds the backend device capability.
This check is done in pvrdma level so check on rdma level is deleted.

Signed-off-by: Yuval Shaia 
---
 docs/pvrdma.txt |  1 -
 hw/rdma/rdma_backend.c  | 23 ++-
 hw/rdma/rdma_backend.h  | 11 +++
 hw/rdma/vmw/pvrdma_main.c   | 10 +-
 hw/rdma/vmw/pvrdma_qp_ops.c | 24 
 5 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/docs/pvrdma.txt b/docs/pvrdma.txt
index 5175251b47..5973c0c68b 100644
--- a/docs/pvrdma.txt
+++ b/docs/pvrdma.txt
@@ -153,7 +153,6 @@ Ethernet function can be used for other Ethernet purposes 
such as IP.
   specify the port to use. If not set 1 will be used.
 - dev-caps-max-mr-size: The maximum size of MR.
 - dev-caps-max-qp:  Maximum number of QPs.
-- dev-caps-max-sge: Maximum number of SGE elements in WR.
 - dev-caps-max-cq:  Maximum number of CQs.
 - dev-caps-max-mr:  Maximum number of MRs.
 - dev-caps-max-pd:  Maximum number of PDs.
diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index c28bfbd44d..16dca69ee9 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -32,17 +32,6 @@
 #include "rdma_rm.h"
 #include "rdma_backend.h"
 
-/* Vendor Errors */
-#define VENDOR_ERR_FAIL_BACKEND 0x201
-#define VENDOR_ERR_TOO_MANY_SGES0x202
-#define VENDOR_ERR_NOMEM0x203
-#define VENDOR_ERR_QP0  0x204
-#define VENDOR_ERR_INV_NUM_SGE  0x205
-#define VENDOR_ERR_MAD_SEND 0x206
-#define VENDOR_ERR_INVLKEY  0x207
-#define VENDOR_ERR_MR_SMALL 0x208
-#define VENDOR_ERR_INV_MAD_BUFF 0x209
-
 #define THR_NAME_LEN 16
 #define THR_POLL_TO  5000
 
@@ -475,11 +464,6 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 }
 
 pr_dbg("num_sge=%d\n", num_sge);
-if (!num_sge || num_sge > MAX_SGE) {
-pr_dbg("invalid num_sge=%d\n", num_sge);
-complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_INV_NUM_SGE, ctx);
-return;
-}
 
 bctx = g_malloc0(sizeof(*bctx));
 bctx->up_ctx = ctx;
@@ -602,11 +586,6 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 }
 
 pr_dbg("num_sge=%d\n", num_sge);
-if (!num_sge || num_sge > MAX_SGE) {
-pr_dbg("invalid num_sge=%d\n", num_sge);
-complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_INV_NUM_SGE, ctx);
-return;
-}
 
 bctx = g_malloc0(sizeof(*bctx));
 bctx->up_ctx = ctx;
@@ -942,6 +921,8 @@ static int init_device_caps(RdmaBackendDev *backend_dev,
 return -EIO;
 }
 
+dev_attr->max_sge = MAX_SGE;
+
 CHK_ATTR(dev_attr, backend_dev->dev_attr, max_mr_size, "%" PRId64);
 CHK_ATTR(dev_attr, backend_dev->dev_attr, max_qp, "%d");
 CHK_ATTR(dev_attr, backend_dev->dev_attr, max_sge, "%d");
diff --git a/hw/rdma/rdma_backend.h b/hw/rdma/rdma_backend.h
index 8cae40f827..a9ba40ae48 100644
--- a/hw/rdma/rdma_backend.h
+++ b/hw/rdma/rdma_backend.h
@@ -22,6 +22,17 @@
 #include "rdma_rm_defs.h"
 #include "rdma_backend_defs.h"
 
+/* Vendor Errors */
+#define VENDOR_ERR_FAIL_BACKEND 0x201
+#define VENDOR_ERR_TOO_MANY_SGES0x202
+#define VENDOR_ERR_NOMEM0x203
+#define VENDOR_ERR_QP0  0x204
+#define VENDOR_ERR_INV_NUM_SGE  0x205
+#define VENDOR_ERR_MAD_SEND 0x206
+#define VENDOR_ERR_INVLKEY  0x207
+#define VENDOR_ERR_MR_SMALL 0x208
+#define VENDOR_ERR_INV_MAD_BUFF 0x209
+
 /* Add definition for QP0 and QP1 as there is no userspace enums for them */
 enum ibv_special_qp_type {
 IBV_QPT_SMI = 0,
diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
index 838ad8a949..d2bdb5ba8c 100644
--- a/hw/rdma/vmw/pvrdma_main.c
+++ b/hw/rdma/vmw/pvrdma_main.c
@@ -43,7 +43,6 @@ static Property pvrdma_dev_properties[] = {
 DEFINE_PROP_UINT64("dev-caps-max-mr-size", PVRDMADev, dev_attr.max_mr_size,
MAX_MR_SIZE),
 DEFINE_PROP_INT32("dev-caps-max-qp", PVRDMADev, dev_attr.max_qp, MAX_QP),
-DEFINE_PROP_INT32("dev-caps-max-sge", PVRDMADev, dev_attr.max_sge, 
MAX_SGE),
 DEFINE_PROP_INT32("dev-caps-max-cq", PVRDMADev, dev_attr.max_cq, MAX_CQ),
 DEFINE_PROP_INT32("dev-caps-max-mr", PVRDMADev, dev_attr.max_mr, MAX_MR),
 DEFINE_PROP_INT32("dev-caps-max-pd", PVRDMADev, dev_attr.max_pd, MAX_PD),
@@ -549,8 +548,9 @@ static void init_dev_caps(PVRDMADev *dev)
sizeof(struct pvrdma_rq_wqe_hdr));
 
 dev->dev_attr.max_qp_wr = pg_tbl_bytes /
-  (wr_sz + sizeof(struct pvrdma_sge) * MAX_SGE) -
-  TARGET_PAGE_SIZE; /* First page is ring state */
+  (wr_sz + sizeof(struct pvr

Re: [Qemu-devel] Wiki Account Creation [Was Re: [PULL] RISC-V Changes for 3.2, Part 1]

2019-01-09 Thread Palmer Dabbelt


On Tue, 08 Jan 2019 12:35:17 PST (-0800), jcmvb...@gmail.com wrote:

Hi Palmer,

On Tue, Jan 8, 2019 at 11:37 AM Palmer Dabbelt  wrote:

It looks like I don't have an account and I'm supposed to ask on qemu-devel for
one.


I've created an account for you. Will send details in private e-mail.


Thanks!

[Qemu-devel] [PATCH v16 4/6] acpi: build TPM Physical Presence interface

From: Stefan Berger 

The TPM Physical Presence interface consists of an ACPI part, a shared
memory part, and code in the firmware. Users can send messages to the
firmware by writing a code into the shared memory through invoking the
ACPI code. When a reboot happens, the firmware looks for the code and
acts on it by sending sequences of commands to the TPM.

This patch adds the ACPI code. It is similar to the one in EDK2 but doesn't
assume that SMIs are necessary to use. It uses a similar datastructure for
the shared memory as EDK2 does so that EDK2 and SeaBIOS could both make use
of it. I extended the shared memory data structure with an array of 256
bytes, one for each code that could be implemented. The array contains
flags describing the individual codes. This decouples the ACPI implementation
from the firmware implementation.

The underlying TCG specification is accessible from the following page.

https://trustedcomputinggroup.org/tcg-physical-presence-interface-specification/

This patch implements version 1.30.

Signed-off-by: Stefan Berger 
[ Marc-André - ACPI code improvements and windows fixes ]
Signed-off-by: Marc-André Lureau 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 include/hw/acpi/tpm.h |  12 ++
 hw/acpi/tpm.c | 404 ++
 hw/i386/acpi-build.c  |  12 +-
 stubs/tpm.c   |   5 +
 docs/specs/tpm.txt|  83 +
 hw/acpi/Makefile.objs |   1 +
 6 files changed, 514 insertions(+), 3 deletions(-)
 create mode 100644 hw/acpi/tpm.c

diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h
index a6109a97fc..1a2a57a21f 100644
--- a/include/hw/acpi/tpm.h
+++ b/include/hw/acpi/tpm.h
@@ -18,6 +18,8 @@
 
 #include "qemu/units.h"
 #include "hw/registerfields.h"
+#include "hw/acpi/aml-build.h"
+#include "sysemu/tpm.h"
 
 #define TPM_TIS_ADDR_BASE   0xFED4
 #define TPM_TIS_ADDR_SIZE   0x5000
@@ -197,4 +199,14 @@ REG32(CRB_DATA_BUFFER, 0x80)
 #define TPM_PPI_VERSION_NONE0
 #define TPM_PPI_VERSION_1_301
 
+/* whether function is blocked by BIOS settings; bits 0, 1, 2 */
+#define TPM_PPI_FUNC_NOT_IMPLEMENTED (0 << 0)
+#define TPM_PPI_FUNC_BIOS_ONLY   (1 << 0)
+#define TPM_PPI_FUNC_BLOCKED (2 << 0)
+#define TPM_PPI_FUNC_ALLOWED_USR_REQ (3 << 0)
+#define TPM_PPI_FUNC_ALLOWED_USR_NOT_REQ (4 << 0)
+#define TPM_PPI_FUNC_MASK(7 << 0)
+
+void tpm_build_ppi_acpi(TPMIf *tpm, Aml *dev);
+
 #endif /* HW_ACPI_TPM_H */
diff --git a/hw/acpi/tpm.c b/hw/acpi/tpm.c
new file mode 100644
index 00..9f205378f2
--- /dev/null
+++ b/hw/acpi/tpm.c
@@ -0,0 +1,404 @@
+/* Support for generating ACPI TPM tables
+ *
+ * Copyright (C) 2018 IBM, Corp.
+ * Copyright (C) 2018 Red Hat Inc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/acpi/tpm.h"
+
+void tpm_build_ppi_acpi(TPMIf *tpm, Aml *dev)
+{
+Aml *method, *field, *ifctx, *ifctx2, *ifctx3, *func_mask,
+*not_implemented, *pak, *tpm2, *tpm3, *pprm, *pprq, *zero, *one;
+
+if (!object_property_get_bool(OBJECT(tpm), "ppi", &error_abort)) {
+return;
+}
+
+zero = aml_int(0);
+one = aml_int(1);
+func_mask = aml_int(TPM_PPI_FUNC_MASK);
+not_implemented = aml_int(TPM_PPI_FUNC_NOT_IMPLEMENTED);
+
+/*
+ * TPP2 is for the registers that ACPI code used to pass
+ * the PPI code and parameter (PPRQ, PPRM) to the firmware.
+ */
+aml_append(dev,
+   aml_operation_region("TPP2", AML_SYSTEM_MEMORY,
+aml_int(TPM_PPI_ADDR_BASE + 0x100),
+0x5A));
+field = aml_field("TPP2", AML_ANY_ACC, AML_NOLOCK, AML_PRESERVE);
+aml_append(field, aml_named_field("PPIN", 8));
+aml_append(field, aml_named_field("PPIP", 32));
+aml_append(field, aml_named_field("PPRP", 32));
+aml_append(field, aml_named_field("PPRQ", 32));
+aml_append(field, aml_named_field("PPRM", 32));
+aml_append(field, aml_named_field("LPPR", 32));
+aml_append(dev, field);
+pprq = aml_name("PPRQ");
+pprm = aml_name("PPRM");
+
+/*
+ * DerefOf in Windows is broken with SYSTEM_MEMORY.  Use a dynamic
+ * operation region inside of a method for getting FUNC[op].
+ */
+meth

[Qemu-devel] [PATCH v16 5/6] acpi: add ACPI memory clear interface

The interface is described in the "TCG Platform Reset Attack
Mitigation Specification", chapter 6 "ACPI _DSM Function". According
to Laszlo, it's not so easy to implement in OVMF, he suggested to do
it in qemu instead.

See specification documentation for more details, and next commit for
memory clear on reset handling.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 hw/acpi/tpm.c  | 55 ++
 docs/specs/tpm.txt |  2 ++
 2 files changed, 57 insertions(+)

diff --git a/hw/acpi/tpm.c b/hw/acpi/tpm.c
index 9f205378f2..b96459e45b 100644
--- a/hw/acpi/tpm.c
+++ b/hw/acpi/tpm.c
@@ -53,6 +53,16 @@ void tpm_build_ppi_acpi(TPMIf *tpm, Aml *dev)
 pprq = aml_name("PPRQ");
 pprm = aml_name("PPRM");
 
+aml_append(dev,
+   aml_operation_region(
+   "TPP3", AML_SYSTEM_MEMORY,
+   aml_int(TPM_PPI_ADDR_BASE +
+   0x15a /* movv, docs/specs/tpm.txt */),
+   0x1));
+field = aml_field("TPP3", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
+aml_append(field, aml_named_field("MOVV", 8));
+aml_append(dev, field);
+
 /*
  * DerefOf in Windows is broken with SYSTEM_MEMORY.  Use a dynamic
  * operation region inside of a method for getting FUNC[op].
@@ -399,6 +409,51 @@ void tpm_build_ppi_acpi(TPMIf *tpm, Aml *dev)
 aml_append(ifctx, aml_return(aml_buffer(1, zerobyte)));
 }
 aml_append(method, ifctx);
+
+/*
+ * "TCG Platform Reset Attack Mitigation Specification 1.00",
+ * Chapter 6 "ACPI _DSM Function"
+ */
+ifctx = aml_if(
+aml_equal(uuid,
+  aml_touuid("376054ED-CC13-4675-901C-4756D7F2D45D")));
+{
+/* standard DSM query function */
+ifctx2 = aml_if(aml_equal(function, zero));
+{
+uint8_t byte_list[1] = { 0x03 }; /* functions 1-2 supported */
+
+aml_append(ifctx2,
+   aml_return(aml_buffer(sizeof(byte_list),
+ byte_list)));
+}
+aml_append(ifctx, ifctx2);
+
+/*
+ * TCG Platform Reset Attack Mitigation Specification 1.0 Ch.6
+ *
+ * Arg 2 (Integer): Function Index = 1
+ * Arg 3 (Package): Arguments = Package: Type: Integer
+ *  Operation Value of the Request
+ * Returns: Type: Integer
+ *  0: Success
+ *  1: General Failure
+ */
+ifctx2 = aml_if(aml_equal(function, one));
+{
+aml_append(ifctx2,
+   aml_store(aml_derefof(aml_index(arguments, zero)),
+ op));
+{
+aml_append(ifctx2, aml_store(op, aml_name("MOVV")));
+
+/* 0: success */
+aml_append(ifctx2, aml_return(zero));
+}
+}
+aml_append(ifctx, ifctx2);
+}
+aml_append(method, ifctx);
 }
 aml_append(dev, method);
 }
diff --git a/docs/specs/tpm.txt b/docs/specs/tpm.txt
index 424d1511fc..5d8c26b1ad 100644
--- a/docs/specs/tpm.txt
+++ b/docs/specs/tpm.txt
@@ -135,6 +135,8 @@ layout:
  +--+++---+
  | next_step|   0x1  |  0x159 | Operation to execute after reboot by  |
  |  ||| firmware. Used by firmware.   |
+ +--+++---+
+ | movv |   0x1  |  0x15a | Memory overwrite variable |
  +--+++---+
 
The following values are supported for the 'func' field. They correspond
-- 
2.20.1.98.gecbdaf0899

[Qemu-devel] [PATCH v16 6/6] tpm: clear RAM when "memory overwrite" requested

Note: the "Platform Reset Attack Mitigation" specification isn't
explicit about NVDIMM, since they could have different usages. It uses
the term "system memory" generally (and also "volatile memory RAM" in
its introduction). For initial support, I propose to consider
non-volatile memory as not being subject to the memory clear. There is
an on-going discussion in the TCG "pcclientwg" working group for
future revisions.

CPU cache clearing is done unconditionally in edk2 since commit
d20ae95a13e851 (edk2-stable201811).

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 hw/tpm/tpm_ppi.h| 10 ++
 hw/tpm/tpm_crb.c|  3 +++
 hw/tpm/tpm_ppi.c| 22 ++
 hw/tpm/tpm_tis.c|  3 +++
 hw/tpm/trace-events |  3 +++
 5 files changed, 41 insertions(+)

diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h
index c5e555fe2c..d33ef27de6 100644
--- a/hw/tpm/tpm_ppi.h
+++ b/hw/tpm/tpm_ppi.h
@@ -33,4 +33,14 @@ typedef struct TPMPPI {
 void tpm_ppi_init(TPMPPI *tpmppi, struct MemoryRegion *m,
   hwaddr addr, Object *obj);
 
+/**
+ * tpm_ppi_reset:
+ * @tpmppi: a TPMPPI
+ *
+ * Function to call on machine reset. It will check if the "Memory
+ * overwrite" variable is set, and perform a memory clear on volatile
+ * memory if requested.
+ **/
+void tpm_ppi_reset(TPMPPI *tpmppi);
+
 #endif /* TPM_TPM_PPI_H */
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index 012ec686d4..3087acc4ab 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -233,6 +233,9 @@ static void tpm_crb_reset(void *dev)
 {
 CRBState *s = CRB(dev);
 
+if (s->ppi_enabled) {
+tpm_ppi_reset(&s->ppi);
+}
 tpm_backend_reset(s->tpmbe);
 
 memset(s->regs, 0, sizeof(s->regs));
diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index cf17779c20..cd8205f212 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -16,8 +16,30 @@
 #include "qapi/error.h"
 #include "cpu.h"
 #include "sysemu/memory_mapping.h"
+#include "sysemu/reset.h"
 #include "migration/vmstate.h"
 #include "tpm_ppi.h"
+#include "trace.h"
+
+void tpm_ppi_reset(TPMPPI *tpmppi)
+{
+if (tpmppi->buf[0x15a /* movv, docs/specs/tpm.txt */] & 0x1) {
+GuestPhysBlockList guest_phys_blocks;
+GuestPhysBlock *block;
+
+guest_phys_blocks_init(&guest_phys_blocks);
+guest_phys_blocks_append(&guest_phys_blocks);
+QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
+trace_tpm_ppi_memset(block->host_addr,
+ block->target_end - block->target_start);
+memset(block->host_addr, 0,
+   block->target_end - block->target_start);
+memory_region_set_dirty(block->mr, 0,
+block->target_end - block->target_start);
+}
+guest_phys_blocks_free(&guest_phys_blocks);
+}
+}
 
 void tpm_ppi_init(TPMPPI *tpmppi, struct MemoryRegion *m,
   hwaddr addr, Object *obj)
diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 02d9d5c911..fd6bb9b59a 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -872,6 +872,9 @@ static void tpm_tis_reset(DeviceState *dev)
 s->be_buffer_size = MIN(tpm_backend_get_buffer_size(s->be_driver),
 TPM_TIS_BUFFER_MAX);
 
+if (s->ppi_enabled) {
+tpm_ppi_reset(&s->ppi);
+}
 tpm_backend_reset(s->be_driver);
 
 s->active_locty = TPM_TIS_NO_LOCALITY;
diff --git a/hw/tpm/trace-events b/hw/tpm/trace-events
index 25bee0cecf..920d32ad55 100644
--- a/hw/tpm/trace-events
+++ b/hw/tpm/trace-events
@@ -51,3 +51,6 @@ tpm_tis_mmio_write_init_abort(void) "Initiating abort"
 tpm_tis_mmio_write_lowering_irq(void) "Lowering IRQ"
 tpm_tis_mmio_write_data2send(uint32_t value, unsigned size) "Data to send to 
TPM: 0x%08x (size=%d)"
 tpm_tis_pre_save(uint8_t locty, uint32_t rw_offset) "locty: %d, rw_offset = %u"
+
+# hw/tpm/tpm_ppi.c
+tpm_ppi_memset(uint8_t *ptr, size_t size) "memset: %p %zu"
-- 
2.20.1.98.gecbdaf0899

[Qemu-devel] [PATCH v16 2/6] tpm: allocate/map buffer for TPM Physical Presence interface

From: Stefan Berger 

Implement a virtual memory device for the TPM Physical Presence interface.
The memory is located at 0xFED45000 and used by ACPI to send messages to the
firmware (BIOS) and by the firmware to provide parameters for each one of
the supported codes.

This interface should be used by all TPM devices on x86 and can be
added by calling tpm_ppi_init_io().

Note: bios_linker cannot be used to allocate the PPI memory region,
since the reserved memory should stay stable across reboots, and might
be needed before the ACPI tables are installed.

Signed-off-by: Stefan Berger 
Signed-off-by: Marc-André Lureau 
Reviewed-by: Igor Mammedov 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 hw/tpm/tpm_ppi.h  | 36 
 include/hw/acpi/tpm.h |  6 ++
 hw/tpm/tpm_crb.c  |  7 +++
 hw/tpm/tpm_ppi.c  | 31 +++
 hw/tpm/tpm_tis.c  |  7 +++
 hw/tpm/Makefile.objs  |  1 +
 6 files changed, 88 insertions(+)
 create mode 100644 hw/tpm/tpm_ppi.h
 create mode 100644 hw/tpm/tpm_ppi.c

diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h
new file mode 100644
index 00..c5e555fe2c
--- /dev/null
+++ b/hw/tpm/tpm_ppi.h
@@ -0,0 +1,36 @@
+/*
+ * TPM Physical Presence Interface
+ *
+ * Copyright (C) 2018 IBM Corporation
+ *
+ * Authors:
+ *  Stefan Berger
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef TPM_TPM_PPI_H
+#define TPM_TPM_PPI_H
+
+#include "hw/acpi/tpm.h"
+#include "exec/address-spaces.h"
+
+typedef struct TPMPPI {
+MemoryRegion ram;
+uint8_t *buf;
+} TPMPPI;
+
+/**
+ * tpm_ppi_init:
+ * @tpmppi: a TPMPPI
+ * @m: the address-space / MemoryRegion to use
+ * @addr: the address of the PPI region
+ * @obj: the owner object
+ *
+ * Register the TPM PPI memory region at @addr on the given address
+ * space for the object @obj.
+ **/
+void tpm_ppi_init(TPMPPI *tpmppi, struct MemoryRegion *m,
+  hwaddr addr, Object *obj);
+
+#endif /* TPM_TPM_PPI_H */
diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h
index 3580ffd50c..b8796df916 100644
--- a/include/hw/acpi/tpm.h
+++ b/include/hw/acpi/tpm.h
@@ -188,4 +188,10 @@ REG32(CRB_DATA_BUFFER, 0x80)
 #define TPM2_START_METHOD_MMIO  6
 #define TPM2_START_METHOD_CRB   7
 
+/*
+ * Physical Presence Interface
+ */
+#define TPM_PPI_ADDR_SIZE   0x400
+#define TPM_PPI_ADDR_BASE   0xFED45000
+
 #endif /* HW_ACPI_TPM_H */
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index d5b0ac5920..012ec686d4 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -29,6 +29,7 @@
 #include "sysemu/reset.h"
 #include "tpm_int.h"
 #include "tpm_util.h"
+#include "tpm_ppi.h"
 #include "trace.h"
 
 typedef struct CRBState {
@@ -43,6 +44,7 @@ typedef struct CRBState {
 size_t be_buffer_size;
 
 bool ppi_enabled;
+TPMPPI ppi;
 } CRBState;
 
 #define CRB(obj) OBJECT_CHECK(CRBState, (obj), TYPE_TPM_CRB)
@@ -294,6 +296,11 @@ static void tpm_crb_realize(DeviceState *dev, Error **errp)
 memory_region_add_subregion(get_system_memory(),
 TPM_CRB_ADDR_BASE + sizeof(s->regs), &s->cmdmem);
 
+if (s->ppi_enabled) {
+tpm_ppi_init(&s->ppi, get_system_memory(),
+ TPM_PPI_ADDR_BASE, OBJECT(s));
+}
+
 qemu_register_reset(tpm_crb_reset, dev);
 }
 
diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
new file mode 100644
index 00..cf17779c20
--- /dev/null
+++ b/hw/tpm/tpm_ppi.c
@@ -0,0 +1,31 @@
+/*
+ * tpm_ppi.c - TPM Physical Presence Interface
+ *
+ * Copyright (C) 2018 IBM Corporation
+ *
+ * Authors:
+ *  Stefan Berger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "cpu.h"
+#include "sysemu/memory_mapping.h"
+#include "migration/vmstate.h"
+#include "tpm_ppi.h"
+
+void tpm_ppi_init(TPMPPI *tpmppi, struct MemoryRegion *m,
+  hwaddr addr, Object *obj)
+{
+tpmppi->buf = g_malloc0(HOST_PAGE_ALIGN(TPM_PPI_ADDR_SIZE));
+memory_region_init_ram_device_ptr(&tpmppi->ram, obj, "tpm-ppi",
+  TPM_PPI_ADDR_SIZE, tpmppi->buf);
+vmstate_register_ram(&tpmppi->ram, DEVICE(obj));
+
+memory_region_add_subregion(m, addr, &tpmppi->ram);
+}
diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 1698d83cd3..02d9d5c911 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -31,6 +31,7 @@
 #include "sysemu/tpm_backend.h"
 #include "tpm_int.h"
 #include "tpm_util.h"
+#include "tpm_ppi.h"
 #include "trace.h"
 
 #define TPM_TIS_NUM_LOCALITIES  5 /* per spec */
@@ -83,6 +84,7 @@ typedef struct TPMState {
 size_t be_buffer_size;
 
 bool ppi_enabled;
+TPMPPI ppi;
 } TPMState;
 
 #define TPM(obj) OBJECT_CHECK(TPMState, (obj), TYPE_TPM_TIS)
@@ -

[Qemu-devel] [PATCH v16 3/6] acpi: expose TPM/PPI configuration parameters to firmware via fw_cfg

From: Stefan Berger 

To avoid having to hard code the base address of the PPI virtual
memory device we introduce a fw_cfg file etc/tpm/config that holds the
base address of the PPI device, the version of the PPI interface and
the version of the attached TPM.

Signed-off-by: Stefan Berger 
[ Marc-André: renamed to etc/tpm/config, made it static, document it ]
Signed-off-by: Marc-André Lureau 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 include/hw/acpi/tpm.h |  3 +++
 hw/i386/acpi-build.c  | 19 +++
 docs/specs/tpm.txt| 19 +++
 3 files changed, 41 insertions(+)

diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h
index b8796df916..a6109a97fc 100644
--- a/include/hw/acpi/tpm.h
+++ b/include/hw/acpi/tpm.h
@@ -194,4 +194,7 @@ REG32(CRB_DATA_BUFFER, 0x80)
 #define TPM_PPI_ADDR_SIZE   0x400
 #define TPM_PPI_ADDR_BASE   0xFED45000
 
+#define TPM_PPI_VERSION_NONE0
+#define TPM_PPI_VERSION_1_301
+
 #endif /* HW_ACPI_TPM_H */
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 14f757fc36..9898247705 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -119,6 +119,12 @@ typedef struct AcpiBuildPciBusHotplugState {
 bool pcihp_bridge_en;
 } AcpiBuildPciBusHotplugState;
 
+typedef struct FwCfgTPMConfig {
+uint32_t tpmppi_address;
+uint8_t tpm_version;
+uint8_t tpmppi_version;
+} QEMU_PACKED FwCfgTPMConfig;
+
 static void init_common_fadt_data(Object *o, AcpiFadtData *data)
 {
 uint32_t io = object_property_get_uint(o, ACPI_PM_PROP_PM_IO_BASE, NULL);
@@ -2847,6 +2853,8 @@ void acpi_setup(void)
 AcpiBuildTables tables;
 AcpiBuildState *build_state;
 Object *vmgenid_dev;
+TPMIf *tpm;
+static FwCfgTPMConfig tpm_config;
 
 if (!pcms->fw_cfg) {
 ACPI_BUILD_DPRINTF("No fw cfg. Bailing out.\n");
@@ -2881,6 +2889,17 @@ void acpi_setup(void)
 fw_cfg_add_file(pcms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
 tables.tcpalog->data, acpi_data_len(tables.tcpalog));
 
+tpm = tpm_find();
+if (tpm && object_property_get_bool(OBJECT(tpm), "ppi", &error_abort)) {
+tpm_config = (FwCfgTPMConfig) {
+.tpmppi_address = cpu_to_le32(TPM_PPI_ADDR_BASE),
+.tpm_version = tpm_get_version(tpm),
+.tpmppi_version = TPM_PPI_VERSION_NONE
+};
+fw_cfg_add_file(pcms->fw_cfg, "etc/tpm/config",
+&tpm_config, sizeof tpm_config);
+}
+
 vmgenid_dev = find_vmgenid_dev();
 if (vmgenid_dev) {
 vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), pcms->fw_cfg,
diff --git a/docs/specs/tpm.txt b/docs/specs/tpm.txt
index 1af82bba86..e4bb094700 100644
--- a/docs/specs/tpm.txt
+++ b/docs/specs/tpm.txt
@@ -34,6 +34,25 @@ The CRB interface makes a memory mapped IO region in the 
area 0xfed4 -
 QEMU files related to TPM CRB interface:
  - hw/tpm/tpm_crb.c
 
+= fw_cfg interface =
+
+The bios/firmware may read the "etc/tpm/config" fw_cfg entry for
+configuring the guest appropriately.
+
+The entry of 6 bytes has the following content, in little-endian:
+
+#define TPM_VERSION_UNSPEC  0
+#define TPM_VERSION_1_2 1
+#define TPM_VERSION_2_0 2
+
+#define TPM_PPI_VERSION_NONE0
+#define TPM_PPI_VERSION_1_301
+
+struct FwCfgTPMConfig {
+uint32_t tpmppi_address; /* PPI memory location */
+uint8_t tpm_version; /* TPM version */
+uint8_t tpmppi_version;  /* PPI version */
+};
 
 = ACPI Interface =
 
-- 
2.20.1.98.gecbdaf0899

[Qemu-devel] [PATCH v16 1/6] tpm: add a "ppi" boolean property

The following patches implement the TPM Physical Presence Interface,
make use of a new memory region and a fw_cfg entry. Enable PPI by
default with >=4.0 machine type, to avoid migration issues.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Igor Mammedov 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Michael S. Tsirkin 
Tested-by: Stefan Berger 
---
 hw/core/machine.c | 8 
 hw/tpm/tpm_crb.c  | 3 +++
 hw/tpm/tpm_tis.c  | 3 +++
 3 files changed, 14 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index f8563efb86..92519637cb 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -40,6 +40,14 @@ GlobalProperty hw_compat_3_1[] = {
 .driver   = "memory-backend-memfd",
 .property = "x-use-canonical-path-for-ramblock-id",
 .value= "true",
+},{
+.driver   = "tpm-crb",
+.property = "ppi",
+.value= "false",
+},{
+.driver   = "tpm-tis",
+.property = "ppi",
+.value= "false",
 },
 };
 const size_t hw_compat_3_1_len = G_N_ELEMENTS(hw_compat_3_1);
diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
index a92dd50437..d5b0ac5920 100644
--- a/hw/tpm/tpm_crb.c
+++ b/hw/tpm/tpm_crb.c
@@ -41,6 +41,8 @@ typedef struct CRBState {
 MemoryRegion cmdmem;
 
 size_t be_buffer_size;
+
+bool ppi_enabled;
 } CRBState;
 
 #define CRB(obj) OBJECT_CHECK(CRBState, (obj), TYPE_TPM_CRB)
@@ -221,6 +223,7 @@ static const VMStateDescription vmstate_tpm_crb = {
 
 static Property tpm_crb_properties[] = {
 DEFINE_PROP_TPMBE("tpmdev", CRBState, tpmbe),
+DEFINE_PROP_BOOL("ppi", CRBState, ppi_enabled, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 2563d7501f..1698d83cd3 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -81,6 +81,8 @@ typedef struct TPMState {
 TPMVersion be_tpm_version;
 
 size_t be_buffer_size;
+
+bool ppi_enabled;
 } TPMState;
 
 #define TPM(obj) OBJECT_CHECK(TPMState, (obj), TYPE_TPM_TIS)
@@ -954,6 +956,7 @@ static const VMStateDescription vmstate_tpm_tis = {
 static Property tpm_tis_properties[] = {
 DEFINE_PROP_UINT32("irq", TPMState, irq_num, TPM_TIS_IRQ),
 DEFINE_PROP_TPMBE("tpmdev", TPMState, be_driver),
+DEFINE_PROP_BOOL("ppi", TPMState, ppi_enabled, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.20.1.98.gecbdaf0899

[Qemu-devel] [PATCH v16 0/6] Add support for TPM Physical Presence interface

Hi,

The following patches implement the TPM Physical Presence Interface
that allows a user to set a command via ACPI (sysfs entry in Linux)
that, upon the next reboot, the firmware looks for and acts upon by
sending sequences of commands to the TPM.

A dedicated memory region is added to the TPM CRB & TIS devices, at
address/size 0xFED45000/0x400. A new "etc/tpm/config" fw_cfg entry
holds the location for that PPI region and some version details, to
allow for future flexibility.

With the associated edk2/ovmf firmware, the Windows HLK "PPI 1.3" test
now runs successfully.

It is based on previous work from Stefan Berger ("[PATCH v2 0/4]
Implement Physical Presence interface for TPM 1.2 and 2")

The edk2 support is merged upstream.

v16:
- minor comments and stylistic changes
- add r-b & t-b tags

v15:
- fix crash on reset when PPI is disabled

v14:
- rebased, fixing conflicts after compat-props refactoring
- fix build regression from v13 with --disable-tpm

v13:
- removed needless error handling in tpm_ppi_init()
- splitted "add ACPI memory clear interface"
- moved acpi build function in dedicated hw/acpi/tpm.c
- added some function documentation in headers
- various code cleanups suggested by Philippe
- rebased

Marc-André Lureau (3):
  tpm: add a "ppi" boolean property
  acpi: add ACPI memory clear interface
  tpm: clear RAM when "memory overwrite" requested

Stefan Berger (3):
  tpm: allocate/map buffer for TPM Physical Presence interface
  acpi: expose TPM/PPI configuration parameters to firmware via fw_cfg
  acpi: build TPM Physical Presence interface

 hw/tpm/tpm_ppi.h  |  46 +
 include/hw/acpi/tpm.h |  21 ++
 hw/acpi/tpm.c | 459 ++
 hw/core/machine.c |   8 +
 hw/i386/acpi-build.c  |  29 ++-
 hw/tpm/tpm_crb.c  |  13 ++
 hw/tpm/tpm_ppi.c  |  53 +
 hw/tpm/tpm_tis.c  |  13 ++
 stubs/tpm.c   |   5 +
 docs/specs/tpm.txt| 104 ++
 hw/acpi/Makefile.objs |   1 +
 hw/tpm/Makefile.objs  |   1 +
 hw/tpm/trace-events   |   3 +
 13 files changed, 754 insertions(+), 2 deletions(-)
 create mode 100644 hw/tpm/tpm_ppi.h
 create mode 100644 hw/acpi/tpm.c
 create mode 100644 hw/tpm/tpm_ppi.c

-- 
2.20.1.98.gecbdaf0899

Re: [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export

Revisiting an older thread:

On 6/9/18 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:
> Handle new NBD meta namespace: "qemu", and corresponding queries:
> "qemu:dirty-bitmap:".
> 
> With new metadata context negotiated, BLOCK_STATUS query will reply
> with dirty-bitmap data, converted to extents. New public function
> nbd_export_bitmap selects bitmap to export. For now, only one bitmap
> may be exported.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

> @@ -2199,3 +2393,44 @@ void nbd_client_new(NBDExport *exp,
>  co = qemu_coroutine_create(nbd_co_client_start, client);
>  qemu_coroutine_enter(co);
>  }
> +
> +void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
> +   const char *bitmap_export_name, Error **errp)
> +{
> +BdrvDirtyBitmap *bm = NULL;
> +BlockDriverState *bs = blk_bs(exp->blk);
> +
> +if (exp->export_bitmap) {
> +error_setg(errp, "Export bitmap is already set");
> +return;
> +}
> +
> +while (true) {
> +bm = bdrv_find_dirty_bitmap(bs, bitmap);
> +if (bm != NULL || bs->backing == NULL) {
> +break;
> +}
> +
> +bs = bs->backing->bs;
> +}
> +
> +if (bm == NULL) {
> +error_setg(errp, "Bitmap '%s' is not found", bitmap);
> +return;
> +}
> +
> +if (bdrv_dirty_bitmap_enabled(bm)) {
> +error_setg(errp, "Bitmap '%s' is enabled", bitmap);
> +return;
> +}

Why are we restricting things to only export disabled bitmaps?

I can understand the argument that if the image being exported is
read-only, then an enabled bitmap _that can be changed_ is probably a
bad idea (it goes against the notion of the export being read only).
But if we were to allow a writable access to an image, wouldn't we
expect that writes be reflected into the bitmap, which means permitting
an enabled bitmap?

Furthermore, consider the scenario:

backing [with bitmap b0] <- active

If I request a bitmap add of b0 for an export of active, then it really
shouldn't matter whether b0 is left enabled or disabled - the backing
file is read-only, and so no changes will be made to bitmap b0.

I stumbled into the latter scenario while testing my new 'qemu-nbd -B
bitmap', but the problem appears anywhere that we have read-only access
to an image, where we can't change the status of the bitmap from enabled
to disabled (because the image is read-only) but where the bitmap
shouldn't be changing anyways (because the image is read-only).

> +
> +if (bdrv_dirty_bitmap_qmp_locked(bm)) {
> +error_setg(errp, "Bitmap '%s' is locked", bitmap);
> +return;
> +}
> +
> +bdrv_dirty_bitmap_set_qmp_locked(bm, true);

Can we have an enabled-but-locked bitmap (one that we can't delete
because some operation such as a writable NBD export is still using it,
but one that is still being updated by active writes)?  If so, then it
may make sense to drop the restriction that an exported bitmap must be
disabled.  And even if it is not possible, I'd still like to loosen the
restriction to require a disabled bitmap only if the image owning the
bitmap is writable (making it easier to do read-only exports without
having to first tweak the bitmap to be disabled).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2019-01-09 Thread Kevin Wolf

Am 09.01.2019 um 18:52 hat Eric Blake geschrieben:
> On 1/9/19 11:38 AM, Max Reitz wrote:
> 
> > 
> > 
> > Actually, to me what you're saying sounds more like "Our deprecation
> > policy is useless" to which I wholeheartedly agree.

If you restrict it to "Our deprecation policy is useless for user-facing
changes", I might agree. I think the policy is fine for stuff like QMP
where only the management layer needs to be aware of the change. We
still have problems there, but these are not problems of the policy.

> > I think we should only remove things in major releases, and only if
> > it was deprecated in the previous major release already.  (So if you
> > deprecate something in 4.0, you can remove it in 5.0; but if you
> > deprecate it in 4.1, you can remove it only in 6.0.)  From a user
> > standpoint I really think we deprecate stuff too irregularly.
> > 
> 
> That's actually incorrect. Our current version numbering scheme is that
> the major version number is NOT synonymous with major releases: we just
> bump the major version number once per year, and ALL releases are on
> equal footing with no one release being more major than others.  Thus, a
> policy that (at least) 2 releases is needed for a deprecation is
> consistent, where one that requires waiting for a bump in the major
> version number (which is as short as one release and as long as 3, given
> that we bump every year with about 3 releases per year) is the one that
> is less predictable and less meaningful (why is waiting for January
> better than waiting for 2 releases?).

As someone who usually skips distro versions because he wants to have
the change and necessary adaption only once instead of two or three
times in the same timeframe, I can see some value for users in breaking
stable APIs only in defined versions.

At the same time, I can also see the value in being able to break stable
APIs without waiting for almost two years with Max' policy and bad
timing.

For software I develop I prefer the latter, for software that I use I
prefer the former. ;-)

Kevin

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

On 1/9/19 12:51 PM, Kevin Wolf wrote:

>> Indeed, and libvirt IS using 'savevm' via HMP via QMP's
>> human-monitor-command, since there is no QMP counterpart for internal
>> snapshot.  Even though lately we consistently tell people that internal
>> snapshots are underdeveloped and you should use external snapshots, it
>> does not get away from the fact that libvirt has been using 'savevm' to
>> drive internal snapshots for years now, and that we MUST consider
>> back-compat and/or add an introspectible QMP interface before making
>> changes that would break libvirt.
> 
> Okay, so what does libvirt do when you request a snapshot with a
> numerical name? Without having looked at the code, the best case I would
> expect that it forbids them, and more realistically I suspect that we
> may actually fix a bug for libvirt by changing the semantics.
> 
> Or does libvirt really use snapshot IDs rather than names?

At the moment, libvirt does not place any restriction on internal
snapshot names, but passes the user's name through without thought of
whether it is an id or a name.

So yes, arguably tightening the semantics fixes a libvirt bug for
libvirt having allowed internal snapshots to get into an inconsistent
state.  But again, it falls in the category of "properly fixing this
requires a lot of auditing, time, mental power, and unit tests", which
makes it a rather daunting task to state for certain.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2019-01-09 Thread Kevin Wolf

Am 09.01.2019 um 18:55 hat Eric Blake geschrieben:
> On 1/9/19 11:38 AM, Max Reitz wrote:
> 
> > I do think it affects users of HMP, because right now you can delete
> > snapshots with their ID, and after this series you cannot.
> 
> 
> >> This. Is. HMP.
> >>
> >> Not a stable ABI, no deprecation period of two releases.
> > 
> > Well, if you want to do it.
> > 
> > This may be HMP, but this is also the only interface to savevm, so it's
> > not like users have a choice to use a more stable interface.  I know
> > that was a conscious decision, more or less, but I don't see why we need
> > to be so nasty when the hardest thing about doing a nice deprecation
> > would be to remember to make it an error in half a year.
> 
> Indeed, and libvirt IS using 'savevm' via HMP via QMP's
> human-monitor-command, since there is no QMP counterpart for internal
> snapshot.  Even though lately we consistently tell people that internal
> snapshots are underdeveloped and you should use external snapshots, it
> does not get away from the fact that libvirt has been using 'savevm' to
> drive internal snapshots for years now, and that we MUST consider
> back-compat and/or add an introspectible QMP interface before making
> changes that would break libvirt.

Okay, so what does libvirt do when you request a snapshot with a
numerical name? Without having looked at the code, the best case I would
expect that it forbids them, and more realistically I suspect that we
may actually fix a bug for libvirt by changing the semantics.

Or does libvirt really use snapshot IDs rather than names?

Kevin


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v3 5/5] xfs: disable map_sync for virtio pmem

2019-01-09 Thread Darrick J. Wong

On Wed, Jan 09, 2019 at 08:17:36PM +0530, Pankaj Gupta wrote:
> Virtio pmem provides asynchronous host page cache flush
> mechanism. we don't support 'MAP_SYNC' with virtio pmem 
> and xfs.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  fs/xfs/xfs_file.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index e474250..eae4aa4 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1190,6 +1190,14 @@ xfs_file_mmap(
>   if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
>   return -EOPNOTSUPP;
>  
> + /* We don't support synchronous mappings with guest direct access
> +  * and virtio based host page cache mechanism.
> +  */
> + if (IS_DAX(file_inode(filp)) && virtio_pmem_host_cache_enabled(

Echoing what Jan said, this ought to be some sort of generic function
that tells us whether or not memory mapped from the dax device will
always still be accessible even after a crash (i.e. supports MAP_SYNC).

What if the underlying file on the host is itself on pmem and can be
MAP_SYNC'd?  Shouldn't the guest be able to use MAP_SYNC as well?

--D

> + xfs_find_daxdev_for_inode(file_inode(filp))) &&
> + (vma->vm_flags & VM_SYNC))
> + return -EOPNOTSUPP;
> +
>   file_accessed(filp);
>   vma->vm_ops = &xfs_file_vm_ops;
>   if (IS_DAX(file_inode(filp)))
> -- 
> 2.9.3
>

Re: [Qemu-devel] [qemu-s390x] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2019-01-09 Thread Cornelia Huck

On Wed, 9 Jan 2019 13:10:26 -0500
"Jason J. Herne"  wrote:

> On 1/7/19 2:02 PM, Jason J. Herne wrote:
> >>> +
> >>>   /*
> >>>    * sense-id response buffer layout
> >>>    */
> >>> @@ -205,6 +265,61 @@ typedef struct senseid {
> >>>   struct ciw ciw[62];
> >>>   }  __attribute__ ((packed, aligned(4))) SenseId;
> >>> +/* architected values for first sense byte */
> >>> +#define SNS0_CMD_REJECT 0x80
> >>> +#define SNS0_INTERVENTION_REQ   0x40
> >>> +#define SNS0_BUS_OUT_CHECK  0x20
> >>> +#define SNS0_EQUIPMENT_CHECK0x10
> >>> +#define SNS0_DATA_CHECK 0x08
> >>> +#define SNS0_OVERRUN0x04
> >>> +#define SNS0_INCOMPL_DOMAIN 0x01  
> >>
> >> IIRC, only byte 0 is device independent, and the others below are
> >> (ECKD) dasd specific?
> >>  
> >>> +
> >>> +/* architectured values for second sense byte */
> >>> +#define SNS1_PERM_ERR   0x80
> >>> +#define SNS1_INV_TRACK_FORMAT   0x40
> >>> +#define SNS1_EOC0x20
> >>> +#define SNS1_MESSAGE_TO_OPER0x10
> >>> +#define SNS1_NO_REC_FOUND   0x08
> >>> +#define SNS1_FILE_PROTECTED 0x04
> >>> +#define SNS1_WRITE_INHIBITED0x02
> >>> +#define SNS1_INPRECISE_END  0x01
> >>> +
> >>> +/* architectured values for third sense byte */
> >>> +#define SNS2_REQ_INH_WRITE  0x80
> >>> +#define SNS2_CORRECTABLE0x40
> >>> +#define SNS2_FIRST_LOG_ERR  0x20
> >>> +#define SNS2_ENV_DATA_PRESENT   0x10
> >>> +#define SNS2_INPRECISE_END  0x04
> >>> +
> >>> +/* 24-byte Sense fmt/msg codes */
> >>> +#define SENSE24_FMT_PROG_SYS0x0
> >>> +#define SENSE24_FMT_EQUIPMENT   0x2
> >>> +#define SENSE24_FMT_CONTROLLER  0x3
> >>> +#define SENSE24_FMT_MISC0xF
> >>> +
> >>> +#define SENSE24_FMT0_MSG_RESET_NOTIFICATION 0x16
> >>> +
> >>> +/* basic sense response buffer layout */
> >>> +typedef struct senseData {
> >>> +uint8_t status[3];
> >>> +uint8_t res_count;
> >>> +uint8_t phys_drive_id;
> >>> +uint8_t low_cyl_addr;
> >>> +uint8_t head_high_cyl_addr;
> >>> +uint8_t fmt_msg;
> >>> +uint64_t fmt_dependent_info[2];
> >>> +uint8_t reserved;
> >>> +uint8_t program_action_code;
> >>> +uint16_t config_info;
> >>> +uint8_t mcode_hicyl;
> >>> +uint8_t cyl_head_addr[3];
> >>> +}  __attribute__ ((packed, aligned(4))) SenseData;  
> >>
> >> And this looks _really_ dasd specific.
> >>  
> > 
> > Yep, I glossed over those details while I was furiously tracking down the 
> > reset bug. I'll 
> > take a look at redesigning this.  
> 
> All of my information for creating these data structures came from an 
> internal ECKD DASD 
> reference. There are probably some things that could stand a bit of cleanup 
> or renaming. 
> Aside from that, considering this is in a DASD only (ECKD DASD only at the 
> moment) code 
> path are you okay with my renaming the struct to senseDataECKD or something 
> similar?

Renaming this makes sense.

> I'm not sure what value there is in abstracting sense at the moment. I'm not 
> even sure 
> what other device's sense data looks like. Since my description of the SENSE 
> CCW comes 
> from an ECKD reference I have not been able to verify any areas of the data 
> that are 
> common across device types. Thoughts?

There's SA22-7204-01 ("Common I/O Device Commands"), which is from 1992
(this is what I have on my disk -- is there anything newer?). It
specifies what bits 0-5 of byte 0 mean and states that bytes 1-31 are
optional and device-specific.

Maybe some other bits have been specified after 1992, but I have not
come across documentation for them.

Re: [Qemu-devel] [PATCH 5/5] qemu-nbd: Add --bitmap=NAME option

On 1/8/19 10:14 PM, Eric Blake wrote:
> Having to fire up qemu, then use QMP commands for nbd-server-start
> and nbd-server-add, just to expose a persistent dirty bitmap, is
> rather tedious.  Make it possible to expose a dirty bitmap using
> just qemu-nbd (of course, for now this only works when qemu-nbd is
> visiting a BDS formatted as qcow2).
> 
> For now, I play it safe and only allow a bitmap to be exposed on
> a read-only image.  We may relax it in the future.
> 
> Signed-off-by: Eric Blake 
> ---
>  qemu-nbd.texi |  5 +
>  qemu-nbd.c| 16 ++--
>  2 files changed, 19 insertions(+), 2 deletions(-)
> 

> @@ -1015,7 +1027,7 @@ int main(int argc, char **argv)
>  }
>  }
> 
> -exp = nbd_export_new(bs, dev_offset, fd_size, export_name, NULL, NULL,
> +exp = nbd_export_new(bs, dev_offset, fd_size, export_name, bitmap, NULL,
>   export_description, nbdflags, nbd_export_closed,
>   writethrough, NULL, &error_fatal);

The cover letter DID warn that this was only lightly tested.  For it to
even work, you have to s/bitmap, NULL/NULL, bitmap/ or else swap the
meaning of the --bitmap and --description command line options.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] Internship idea: I2C passthrough

On Wed, 9 Jan 2019, Stefan Hajnoczi wrote:

Hi folks,
You may be interested in I2C passthrough support in QEMU. Paolo and I
brainstormed the idea and I've written up a rough project summary
below. It still needs some work to make this a solid internship
project idea.

Any thoughts? Would you like to co-mentor this summer?

=== I2C Passthrough ===

'''Summary:''' Implement I2C bus passthrough on Linux hosts so that
emulated Raspberry Pi or micro:bit boards can talk to real I2C
devices.

How about implementing USB for raspi boards instead? Or is there a project
proposal (or even some code somewhere out of tree) for that already? Seems
more useful to allow having emulated keyboard and mouse to use Raspbian
for example or pass through USB devices. (I don't know anything about this
was just reminded to a recent discussion by this message. If it's not
relevant just disregard my comment and sorry for the noise.)

Regards,
BALATON Zoltan

QEMU emulates I2C devices in software but currently cannot pass
through real I2C devices from the host to the guest. It would be
useful to access real I2C devices from inside the guest, for example
for developers writing and testing software under QEMU on their
computer.

The project consists of the following tasks:
* Implement -object i2c-bus-passthrough,adapter=N,id=my-i2c-bus
* Add i2c-bus-passthrough support to at least 1 existing emulated I2C controller
* Implement micro:bit TWI controller emulation on the nRF51 system-on-chip

This project will allow you to learn about the I2C bus and how to
write device emulation code in QEMU. You will enjoy it if you like
working with physical hardware.

'''Links:'''
* [https://en.wikipedia.org/wiki/I%C2%B2C I2C wikipedia page]
* [https://elinux.org/Interfacing_with_I2C_Devices Overview of Linux
I2C programming interfaces]
*
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/i2c/dev-interface
Linux I2C userspace interface documentation]

'''Details:'''
* Skill level: intermediate
* Language: C
* Mentor: Paolo Bonzini ("bonzini" on IRC),
Stefan Hajnoczi ("stefanha" on IRC)

Re: [Qemu-devel] [PATCH v3 3/5] libnvdimm: add nd_region buffered dax_dev flag

2019-01-09 Thread Pankaj Gupta



> >
> > This patch adds 'DAXDEV_BUFFERED' flag which is set
> > for virtio pmem corresponding nd_region. This later
> > is used to disable MAP_SYNC functionality for ext4
> > & xfs filesystem.
> >
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/dax/super.c  | 17 +
> >  drivers/nvdimm/pmem.c|  3 +++
> >  drivers/nvdimm/region_devs.c |  7 +++
> >  drivers/virtio/pmem.c|  1 +
> >  include/linux/dax.h  |  9 +
> >  include/linux/libnvdimm.h|  6 ++
> >  6 files changed, 43 insertions(+)
> >
> > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > index 6e928f3..9128740 100644
> > --- a/drivers/dax/super.c
> > +++ b/drivers/dax/super.c
> > @@ -167,6 +167,8 @@ enum dax_device_flags {
> > DAXDEV_ALIVE,
> > /* gate whether dax_flush() calls the low level flush routine */
> > DAXDEV_WRITE_CACHE,
> > +   /* flag to disable MAP_SYNC for virtio based host page cache flush
> > */
> > +   DAXDEV_BUFFERED,
> >  };
> >
> >  /**
> > @@ -335,6 +337,21 @@ bool dax_write_cache_enabled(struct dax_device
> > *dax_dev)
> >  }
> >  EXPORT_SYMBOL_GPL(dax_write_cache_enabled);
> >
> > +void virtio_pmem_host_cache(struct dax_device *dax_dev, bool wc)
> > +{
> > +   if (wc)
> > +   set_bit(DAXDEV_BUFFERED, &dax_dev->flags);
> > +   else
> > +   clear_bit(DAXDEV_BUFFERED, &dax_dev->flags);
> > +}
> > +EXPORT_SYMBOL_GPL(virtio_pmem_host_cache);
> 
> The "write_cache" property was structured this way because it can
> conceivably change at runtime. The MAP_SYNC capability should be
> static and never changed after init.

o.k. Will change.

> 
> > +bool virtio_pmem_host_cache_enabled(struct dax_device *dax_dev)
> > +{
> > +   return test_bit(DAXDEV_BUFFERED, &dax_dev->flags);
> > +}
> > +EXPORT_SYMBOL_GPL(virtio_pmem_host_cache_enabled);
> 
> Echoing Darrick and Jan this is should be a generic property of a
> dax_device and not specific to virtio. I don't like the "buffered"
> designation as that's not accurate. There may be hardware reasons why
> a dax_device is not synchronous, like a requirement to flush a
> write-pending queue or otherwise notify the device of new writes.

Agree.

> 
> I would just have a dax_synchronous() helper and a DAXDEV_SYNC flag. I
> would also modify alloc_dax() to take a flags argument so that the
> capability can be instantiated when the dax_device is allocated.

o.k. Will make the change.

Thanks,
Pankaj
>

Re: [Qemu-devel] [PATCH] file-posix: add rough-block-status parameter

2019-01-09 Thread Denis V . Lunev

On 1/9/19 8:09 PM, Kevin Wolf wrote:
> Am 09.01.2019 um 17:55 hat Paolo Bonzini geschrieben:
>> On 09/01/19 17:51, Kevin Wolf wrote:
>>> Am 09.01.2019 um 17:42 hat Paolo Bonzini geschrieben:
 On 09/01/19 12:23, Kevin Wolf wrote:
> Also note that this is only metadata preallocation; full preallocation
> will still return allocated for the protocol layer and so it will always
> be slow.
 Full preallocation these days can create images with preallocated but
 known-zero blocks, I think?
>>> That would defeat one of the main purposes of preallocation because it
>>> would still require COW and metadata updates on the first write.
>> Sorry I mean at the protocol level, like FALLOC_FL_ZERO_RANGE.  It would
>> still require metadata updates on the filesystem level, unlike "real"
>> full preallocation, but no qcow2 metadata updates.
> preallocation=full doesn't do that. preallocation=falloc is more like
> it, though that is just posix_fallocate(), not FALLOC_FL_ZERO_RANGE. But
> when called on a new file, it might result in the same thing? Not sure.
>
> Kevin
From the point of the file structure "fallocated" space should be considered
as data as the space on the disc is really allocated and the same constraint
should be kept in on target.

Den

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2019-01-09 Thread Daniel Henrique Barboza





On 1/9/19 12:21 PM, Kevin Wolf wrote:

Am 09.01.2019 um 15:10 hat Max Reitz geschrieben:

On 06.09.18 13:11, Daniel Henrique Barboza wrote:

changes in v2:
- removed the "RFC" marker;
- added a new patch (patch 2) that removes
bdrv_snapshot_delete_by_id_or_name from the code;
- made changes in patch 1 as suggested by Murilo;
- previous patch set link:
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html


It is not uncommon to see bugs being opened by testers that attempt to
create VM snapshots using HMP. It turns out that "0" and "1" are quite
common snapshot names and they trigger a lot of bugs. I gave an example
in the commit message of patch 1, but to sum up here: QEMU treats the
input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
is documented as such, but this can lead to strange situations.

Given that it is strange for an API to consider a parameter to be 2 fields
at the same time, and inadvently treating them as one or the other, and
that removing the ID field is too drastic, my idea here is to keep the
ID field for internal control, but do not let the user set it.

I guess there's room for discussion about considering this change an API
change or not. It doesn't affect users of HMP and it doesn't affect Libvirt,
but simplifying the meaning of the parameters of savevm/loadvm/delvm.

(Yes, very late reply, I'm sorry...)

I do think it affects users of HMP, because right now you can delete
snapshots with their ID, and after this series you cannot.

Can there be snapshots that can't be identified by a snapshot name, but
only by their ID?


Hmmm I would need to read my notes back when I was working on this series.
In a quick look at the code and playing around with HMP, what I can tell 
is that:


- there is no way for a snapshot to have an empty TAG - an auto-generated
name is generated and used as TAG.

-  if you create a snapshot using an existing TAG, the older snapshot is 
overwritten
without warning (would be nice to fix it to at least provide a warning 
message - I can

do it in this series if a respin is needed)


I do not have the expertise of the whole block layer to see further 
implications of
the changes made in this series, but as far as HMP goes, I don't think 
there's too much
to it other than what I've said in the cover letter. The goal here is to 
remove the

ambiguity of save/load/del vm commands that would interpret the argument
either as tag or ID, leading to situations in which snapshots ended up 
getting

overwritten because the tag matched an existing ID.


I am not against providing a warning message if the user tries to create 
a snapshot
using a numerical tag. In the end, although I said otherwise in the 
cover letter, changing
the meaning of an input is still an API change, and I'd rather error on 
the cautious side
and warn the callers about it. Even if this has no/little impact (please 
feel free to prove

me wrong) on them.



Daniel









I think we had a short discussion about just disallowing numeric
snapshot names.  How bad would that be?

It would be incompatible with existing images and result in a more
complex snapshot identifier resolution. Why would it be any better?

Kevin

Re: [Qemu-devel] [PATCH v3 2/2] configure: Force the C standard to gnu99

2019-01-09 Thread Philippe Mathieu-Daudé

On 1/9/19 5:39 PM, Thomas Huth wrote:
> Different versions of GCC and Clang use different versions of the C standard.
> This repeatedly caused problems already, e.g. with duplicated typedefs:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg05829.html
> 
> or with for-loop variable initializers:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg00237.html
> 
> To avoid these problems, we should enforce the C language version to the
> same level for all compilers. Since our minimum compiler versions is
> GCC v4.8, our best option is "gnu99" for C code right now ("gnu17" is not
> available there yet, and "gnu11" is marked as "experimental"), and "gnu++98"
> for the few C++ code that we have in the repository.
> 
> Signed-off-by: Thomas Huth 

Raising from a 30 years old stantard to a 20 years old one is a great
improvement (I used to be a defender of the C89, but I evolved and now
admit C99 has helpful features).

Using '-std=gnu++98' for g++ v4.8 looks like a good compromise to the
issues Daniel mentioned (still 'experimental').

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  v3: Compile C++ code with -std=gnu++98
> 
>  configure | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index b9f34af..448dbc8 100755
> --- a/configure
> +++ b/configure
> @@ -107,6 +107,9 @@ update_cxxflags() {
>  -Wstrict-prototypes|-Wmissing-prototypes|-Wnested-externs|\
>  -Wold-style-declaration|-Wold-style-definition|-Wredundant-decls)
>  ;;
> +-std=gnu99)
> +QEMU_CXXFLAGS=${QEMU_CXXFLAGS:+$QEMU_CXXFLAGS }"-std=gnu++98"
> +;;
>  *)
>  QEMU_CXXFLAGS=${QEMU_CXXFLAGS:+$QEMU_CXXFLAGS }$arg
>  ;;
> @@ -585,7 +588,7 @@ ARFLAGS="${ARFLAGS-rv}"
>  # left shift of signed integers is well defined and has the expected
>  # 2s-complement style results. (Both clang and gcc agree that it
>  # provides these semantics.)
> -QEMU_CFLAGS="-fno-strict-aliasing -fno-common -fwrapv $QEMU_CFLAGS"
> +QEMU_CFLAGS="-fno-strict-aliasing -fno-common -fwrapv -std=gnu99 
> $QEMU_CFLAGS"
>  QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
>  QEMU_CFLAGS="-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS"
>  QEMU_CFLAGS="-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
> $QEMU_CFLAGS"
>

Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 1/6] smbus: Add a helper to generate SPD EEPROM data


On Wed, 9 Jan 2019, BALATON Zoltan wrote:

On Wed, 9 Jan 2019, Philippe Mathieu-Daudé wrote:

On 1/9/19 1:15 PM, BALATON Zoltan wrote:

On Wed, 9 Jan 2019, Philippe Mathieu-Daudé wrote:

On 1/3/19 5:27 PM, BALATON Zoltan wrote:

There are several boards with SPD EEPROMs that are now using
duplicated or slightly different hard coded data. Add a helper to
generate SPD data for a memory module of given type and size that
could be used by these boards (either as is or with further changes if
needed) which should help cleaning this up and avoid further
duplication.

Signed-off-by: BALATON Zoltan 
---
v3: Fixed a tab indent
v2: Added errp parameter to pass errors back to caller

?hw/i2c/smbus_eeprom.c? | 130
+
?include/hw/i2c/smbus.h |?? 3 ++
?2 files changed, 133 insertions(+)

diff --git a/hw/i2c/smbus_eeprom.c b/hw/i2c/smbus_eeprom.c
index f18aa3de35..bef24a1ca4 100644
--- a/hw/i2c/smbus_eeprom.c
+++ b/hw/i2c/smbus_eeprom.c

[...]

+
+??? spd = g_malloc0(256);


I think this buffer is eeprom-dependant, not SPD related.


This function is called spd_data_generate(). It specifically generates
SPD EEPROM data and nothing else. as you see below data is hardcoded so
would not work for other EEPROMs (the first two bytes even specify
EEPROM size, hence I don't think size needs to be passed as a parameter.


Well this is why worried me at first, because you alloc 256 bytes ...




Wouldn't it be cleaner to pass the (previously created) buffer as
argument such:

?/* return true on success */
?bool spd_data_fill(void *buf, size_t bufsize,
??? enum sdram_type type, ram_addr_t ram_size,
??? Error **errp);


It could take a previously created buffer but it's simpler this way and
one less error to handle by the caller so I don't like adding two more
parameters for this.


or return something else like ssize_t.


Again, the current version is simpler IMO so while this aims to be
generic to be used by other boards but still not completely generic for
all kinds of EEPROMs. Just for SPD EEPROMs commonly found on SDR, DDR
and DDR2 memory modules. Anything else (even DDR3) is too dissimilar so
those will need another function not this one.

Regards,
BALATON Zoltan




+??? spd[0] = 128;?? /* data bytes in EEPROM */


... for a 128 bytes EEPROM.


No, I allocate 256 bytes for a 256 bytes EEPROM of which the first 128 bytes 
are containing SPD data as described in for example:


https://en.wikipedia.org/wiki/Serial_presence_detect

This also matches the previous code that allocated 256 bytes (and still does, 
see smbus_eeprom_init() function just above this one).



Maybe we can find a compromise at a quick fix with:

 /* no other size currently supported */
 static const size_t spd_eeprom_size = 128;

 spd = g_malloc0(spd_eeprom_size);
 ...

 spd[0] = spd_eeprom_size;
 spd[1] = 1 + ctzl(spd_eeprom_size);


This does not match static SPD data currently in the code elsewhere which all 
start with 128, 8,... so I'm not sure some firmware would dislike a non-usual 
eeprom with 128, 4. My intention was to remove static SPD data that's present 
elsewhere and replace it with nearly identical data generated by this 
function. The data also has to match what's normally found on real hardware 
so maybe try dumping data from some memory modules and check what they have 
and if your suggestion is common then we could go with that otherwise I'd 
rather waste 128 bytes (or half a kilobyte for 4 modules) than get 
compatibility problems due to presenting unusual data to firmwares and other 
guest software that parse SPD data.


Unless someone else also thinks it's a good idea to go with unusual SPD data 
to save a few bytes.


Even then it would not work. Whole smbus_eeprom.c seems to have EEPROM 
size == 256 hardcoded all over the place, so...



Regards,
BALATON Zoltan


+??? spd[1] = 8; /* log2 size of EEPROM */
+??? spd[2] = type;
+??? spd[3] = 13;??? /* row address bits */
+??? spd[4] = 10;??? /* column address bits */
+??? spd[5] = (type == DDR2 ? nbanks - 1 : nbanks);
+??? spd[6] = 64;??? /* module data width */
+??? /* reserved / data width high */
+??? spd[8] = 4; /* interface voltage level */
+??? spd[9] = 0x25;? /* highest CAS latency */
+??? spd[10] = 1;??? /* access time */
+??? /* DIMM configuration 0 = non-ECC */
+??? spd[12] = 0x82; /* refresh requirements */
+??? spd[13] = 8;??? /* primary SDRAM width */
+??? /* ECC SDRAM width */
+??? spd[15] = (type == DDR2 ? 0 : 1); /* reserved / delay for random
col rd */
+??? spd[16] = 12;?? /* burst lengths supported */
+??? spd[17] = 4;??? /* banks per SDRAM device */
+??? spd[18] = 12;?? /* ~CAS latencies supported */
+??? spd[19] = (type == DDR2 ? 0 : 1); /* reserved / ~CS latencies
supported */
+??? spd[20] = 2;??? /* DIMM type / ~WE latencies */
+??? /* module features */
+??? /* memory chip features */
+??? spd[23] =

Re: [Qemu-devel] [PATCH v3 5/5] xfs: disable map_sync for virtio pmem

2019-01-09 Thread Pankaj Gupta



> > Virtio pmem provides asynchronous host page cache flush
> > mechanism. we don't support 'MAP_SYNC' with virtio pmem
> > and xfs.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  fs/xfs/xfs_file.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index e474250..eae4aa4 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1190,6 +1190,14 @@ xfs_file_mmap(
> > if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
> > return -EOPNOTSUPP;
> >  
> > +   /* We don't support synchronous mappings with guest direct access
> > +* and virtio based host page cache mechanism.
> > +*/
> > +   if (IS_DAX(file_inode(filp)) && virtio_pmem_host_cache_enabled(
> 
> Echoing what Jan said, this ought to be some sort of generic function
> that tells us whether or not memory mapped from the dax device will
> always still be accessible even after a crash (i.e. supports MAP_SYNC).

o.k
> 
> What if the underlying file on the host is itself on pmem and can be
> MAP_SYNC'd?  Shouldn't the guest be able to use MAP_SYNC as well?

Guest MAP_SYNC on actual host pmem will sync guest metadata, as guest 
writes are persistent on actual pmem. Host side Qemu MAP_SYNC enabling
work for pmem is in progress. It will make sure metadata at host side
is in consistent state after a crash or any other metadata corruption 
operation.

For virtio-pmem, we are emulating pmem device over regular storage on
host side. Guest need to call fsync followed by write to make sure
guest metadata is in consistent state(or journalled). Host backing file
data & metadata will also be persistent after guest to host fsync.

Thanks,
Pankaj

Re: [Qemu-devel] [qemu-s390x] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2019-01-09 Thread Jason J. Herne


On 1/7/19 2:02 PM, Jason J. Herne wrote:

+
  /*
   * sense-id response buffer layout
   */
@@ -205,6 +265,61 @@ typedef struct senseid {
  struct ciw ciw[62];
  }  __attribute__ ((packed, aligned(4))) SenseId;
+/* architected values for first sense byte */
+#define SNS0_CMD_REJECT 0x80
+#define SNS0_INTERVENTION_REQ   0x40
+#define SNS0_BUS_OUT_CHECK  0x20
+#define SNS0_EQUIPMENT_CHECK0x10
+#define SNS0_DATA_CHECK 0x08
+#define SNS0_OVERRUN0x04
+#define SNS0_INCOMPL_DOMAIN 0x01


IIRC, only byte 0 is device independent, and the others below are
(ECKD) dasd specific?


+
+/* architectured values for second sense byte */
+#define SNS1_PERM_ERR   0x80
+#define SNS1_INV_TRACK_FORMAT   0x40
+#define SNS1_EOC0x20
+#define SNS1_MESSAGE_TO_OPER0x10
+#define SNS1_NO_REC_FOUND   0x08
+#define SNS1_FILE_PROTECTED 0x04
+#define SNS1_WRITE_INHIBITED0x02
+#define SNS1_INPRECISE_END  0x01
+
+/* architectured values for third sense byte */
+#define SNS2_REQ_INH_WRITE  0x80
+#define SNS2_CORRECTABLE0x40
+#define SNS2_FIRST_LOG_ERR  0x20
+#define SNS2_ENV_DATA_PRESENT   0x10
+#define SNS2_INPRECISE_END  0x04
+
+/* 24-byte Sense fmt/msg codes */
+#define SENSE24_FMT_PROG_SYS0x0
+#define SENSE24_FMT_EQUIPMENT   0x2
+#define SENSE24_FMT_CONTROLLER  0x3
+#define SENSE24_FMT_MISC0xF
+
+#define SENSE24_FMT0_MSG_RESET_NOTIFICATION 0x16
+
+/* basic sense response buffer layout */
+typedef struct senseData {
+uint8_t status[3];
+uint8_t res_count;
+uint8_t phys_drive_id;
+uint8_t low_cyl_addr;
+uint8_t head_high_cyl_addr;
+uint8_t fmt_msg;
+uint64_t fmt_dependent_info[2];
+uint8_t reserved;
+uint8_t program_action_code;
+uint16_t config_info;
+uint8_t mcode_hicyl;
+uint8_t cyl_head_addr[3];
+}  __attribute__ ((packed, aligned(4))) SenseData;


And this looks _really_ dasd specific.



Yep, I glossed over those details while I was furiously tracking down the reset bug. I'll 
take a look at redesigning this.


All of my information for creating these data structures came from an internal ECKD DASD 
reference. There are probably some things that could stand a bit of cleanup or renaming. 
Aside from that, considering this is in a DASD only (ECKD DASD only at the moment) code 
path are you okay with my renaming the struct to senseDataECKD or something similar?
I'm not sure what value there is in abstracting sense at the moment. I'm not even sure 
what other device's sense data looks like. Since my description of the SENSE CCW comes 
from an ECKD reference I have not been able to verify any areas of the data that are 
common across device types. Thoughts?



--
-- Jason J. Herne (jjhe...@linux.ibm.com)

Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 1/6] smbus: Add a helper to generate SPD EEPROM data

2019-01-09 Thread Philippe Mathieu-Daudé

On 1/9/19 7:05 PM, BALATON Zoltan wrote:
> On Wed, 9 Jan 2019, BALATON Zoltan wrote:
>> On Wed, 9 Jan 2019, Philippe Mathieu-Daudé wrote:
>>> On 1/9/19 1:15 PM, BALATON Zoltan wrote:
 On Wed, 9 Jan 2019, Philippe Mathieu-Daudé wrote:
> On 1/3/19 5:27 PM, BALATON Zoltan wrote:
>> There are several boards with SPD EEPROMs that are now using
>> duplicated or slightly different hard coded data. Add a helper to
>> generate SPD data for a memory module of given type and size that
>> could be used by these boards (either as is or with further
>> changes if
>> needed) which should help cleaning this up and avoid further
>> duplication.
>>
>> Signed-off-by: BALATON Zoltan 
>> ---
>> v3: Fixed a tab indent
>> v2: Added errp parameter to pass errors back to caller
>>
>> ?hw/i2c/smbus_eeprom.c? | 130
>> +
>> ?include/hw/i2c/smbus.h |?? 3 ++
>> ?2 files changed, 133 insertions(+)
>>
>> diff --git a/hw/i2c/smbus_eeprom.c b/hw/i2c/smbus_eeprom.c
>> index f18aa3de35..bef24a1ca4 100644
>> --- a/hw/i2c/smbus_eeprom.c
>> +++ b/hw/i2c/smbus_eeprom.c
 [...]
>> +
>> +??? spd = g_malloc0(256);
>
> I think this buffer is eeprom-dependant, not SPD related.

 This function is called spd_data_generate(). It specifically generates
 SPD EEPROM data and nothing else. as you see below data is hardcoded so
 would not work for other EEPROMs (the first two bytes even specify
 EEPROM size, hence I don't think size needs to be passed as a
 parameter.
>>>
>>> Well this is why worried me at first, because you alloc 256 bytes ...
>>>

> Wouldn't it be cleaner to pass the (previously created) buffer as
> argument such:
>
> ?/* return true on success */
> ?bool spd_data_fill(void *buf, size_t bufsize,
> ??? enum sdram_type type, ram_addr_t ram_size,
> ??? Error **errp);

 It could take a previously created buffer but it's simpler this way and
 one less error to handle by the caller so I don't like adding two more
 parameters for this.

> or return something else like ssize_t.

 Again, the current version is simpler IMO so while this aims to be
 generic to be used by other boards but still not completely generic for
 all kinds of EEPROMs. Just for SPD EEPROMs commonly found on SDR, DDR
 and DDR2 memory modules. Anything else (even DDR3) is too dissimilar so
 those will need another function not this one.

 Regards,
 BALATON Zoltan

>
>> +??? spd[0] = 128;?? /* data bytes in EEPROM */
>>>
>>> ... for a 128 bytes EEPROM.
>>
>> No, I allocate 256 bytes for a 256 bytes EEPROM of which the first 128
>> bytes are containing SPD data as described in for example:
>>
>> https://en.wikipedia.org/wiki/Serial_presence_detect
>>
>> This also matches the previous code that allocated 256 bytes (and
>> still does, see smbus_eeprom_init() function just above this one).
>>
>>> Maybe we can find a compromise at a quick fix with:
>>>
>>>  /* no other size currently supported */
>>>  static const size_t spd_eeprom_size = 128;
>>>
>>>  spd = g_malloc0(spd_eeprom_size);
>>>  ...
>>>
>>>  spd[0] = spd_eeprom_size;
>>>  spd[1] = 1 + ctzl(spd_eeprom_size);
>>
>> This does not match static SPD data currently in the code elsewhere
>> which all start with 128, 8,... so I'm not sure some firmware would
>> dislike a non-usual eeprom with 128, 4. My intention was to remove
>> static SPD data that's present elsewhere and replace it with nearly
>> identical data generated by this function. The data also has to match
>> what's normally found on real hardware so maybe try dumping data from
>> some memory modules and check what they have and if your suggestion is
>> common then we could go with that otherwise I'd rather waste 128 bytes
>> (or half a kilobyte for 4 modules) than get compatibility problems due
>> to presenting unusual data to firmwares and other guest software that
>> parse SPD data.
>>
>> Unless someone else also thinks it's a good idea to go with unusual
>> SPD data to save a few bytes.
> 
> Even then it would not work. Whole smbus_eeprom.c seems to have EEPROM
> size == 256 hardcoded all over the place, so...

Yes, this 'device' needs love^H^H^H^Hcleanup.

Thanks for the info you provided.

Regards,

Phil.

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

On 1/9/19 11:38 AM, Max Reitz wrote:

> I do think it affects users of HMP, because right now you can delete
> snapshots with their ID, and after this series you cannot.

>> This. Is. HMP.
>>
>> Not a stable ABI, no deprecation period of two releases.
> 
> Well, if you want to do it.
> 
> This may be HMP, but this is also the only interface to savevm, so it's
> not like users have a choice to use a more stable interface.  I know
> that was a conscious decision, more or less, but I don't see why we need
> to be so nasty when the hardest thing about doing a nice deprecation
> would be to remember to make it an error in half a year.

Indeed, and libvirt IS using 'savevm' via HMP via QMP's
human-monitor-command, since there is no QMP counterpart for internal
snapshot.  Even though lately we consistently tell people that internal
snapshots are underdeveloped and you should use external snapshots, it
does not get away from the fact that libvirt has been using 'savevm' to
drive internal snapshots for years now, and that we MUST consider
back-compat and/or add an introspectible QMP interface before making
changes that would break libvirt.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

On 1/9/19 11:38 AM, Max Reitz wrote:

> 
> 
> Actually, to me what you're saying sounds more like "Our deprecation
> policy is useless" to which I wholeheartedly agree.  I think we should
> only remove things in major releases, and only if it was deprecated in
> the previous major release already.  (So if you deprecate something in
> 4.0, you can remove it in 5.0; but if you deprecate it in 4.1, you can
> remove it only in 6.0.)  From a user standpoint I really think we
> deprecate stuff too irregularly.
> 

That's actually incorrect. Our current version numbering scheme is that
the major version number is NOT synonymous with major releases: we just
bump the major version number once per year, and ALL releases are on
equal footing with no one release being more major than others.  Thus, a
policy that (at least) 2 releases is needed for a deprecation is
consistent, where one that requires waiting for a bump in the major
version number (which is as short as one release and as long as 3, given
that we bump every year with about 3 releases per year) is the one that
is less predictable and less meaningful (why is waiting for January
better than waiting for 2 releases?).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

2019-01-09 Thread Greg Kurz

On Wed, 9 Jan 2019 18:40:48 +0100
Cédric Le Goater  wrote:

> On 1/9/19 6:28 PM, Daniel P. Berrangé wrote:
> > On Wed, Jan 09, 2019 at 06:13:01PM +0100, Cédric Le Goater wrote:  
> >> On 1/9/19 5:39 PM, Thomas Huth wrote:  
> >>> When compiling with Clang and -std=gnu99, I get the following errors:
> >>>
> >>>   CC  ppc64-softmmu/hw/intc/xics_spapr.o
> >>> In file included from hw/intc/xics_spapr.c:34:
> >>> include/hw/ppc/xics.h:46:25: error: redefinition of typedef 'ICSState' is 
> >>> a C11 feature
> >>>   [-Werror,-Wtypedef-redefinition]
> >>> typedef struct ICSState ICSState;
> >>> ^
> >>> include/hw/ppc/spapr.h:18:25: note: previous definition is here
> >>> typedef struct ICSState ICSState;
> >>> ^
> >>> In file included from hw/intc/xics_spapr.c:34:
> >>> include/hw/ppc/xics.h:203:34: error: redefinition of typedef 
> >>> 'sPAPRMachineState' is a C11 feature
> >>>   [-Werror,-Wtypedef-redefinition]
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>>   CC  ppc64-softmmu/hw/intc/spapr_xive.o
> >>> In file included from hw/intc/spapr_xive.c:19:
> >>> include/hw/ppc/spapr_xive.h:38:3: error: redefinition of typedef 
> >>> 'sPAPRXive' is a C11 feature
> >>>   [-Werror,-Wtypedef-redefinition]
> >>> } sPAPRXive;
> >>>   ^
> >>> include/hw/ppc/spapr.h:20:26: note: previous definition is here
> >>> typedef struct sPAPRXive sPAPRXive;
> >>>  ^
> >>> In file included from hw/intc/spapr_xive.c:19:
> >>> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> >>> 'sPAPRMachineState' is a C11 feature
> >>>   [-Werror,-Wtypedef-redefinition]
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>>   CC  ppc64-softmmu/hw/char/spapr_vty.o
> >>> In file included from hw/char/spapr_vty.c:8:
> >>> In file included from include/hw/ppc/spapr.h:12:
> >>> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> >>> 'sPAPRMachineState' is a C11 feature
> >>>   [-Werror,-Wtypedef-redefinition]
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> >>> typedef struct sPAPRMachineState sPAPRMachineState;
> >>>  ^
> >>>
> >>> Since we're going to make -std=gnu99 mandatory, fix these issues
> >>> by including the right header files indead of typedeffing stuff twice.
> >>>
> >>> Signed-off-by: Thomas Huth 
> >>> ---
> >>>  include/hw/ppc/spapr.h  | 4 ++--
> >>>  include/hw/ppc/spapr_xive.h | 2 +-
> >>>  include/hw/ppc/xics.h   | 4 ++--
> >>>  3 files changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >>> index 2c77a8b..6a5ae4f 100644
> >>> --- a/include/hw/ppc/spapr.h
> >>> +++ b/include/hw/ppc/spapr.h
> >>> @@ -8,6 +8,8 @@
> >>>  #include "hw/mem/pc-dimm.h"
> >>>  #include "hw/ppc/spapr_ovec.h"
> >>>  #include "hw/ppc/spapr_irq.h"
> >>> +#include "hw/ppc/xics.h"
> >>> +#include "hw/ppc/spapr_xive.h"
> >>>  
> >>>  struct VIOsPAPRBus;
> >>>  struct sPAPRPHBState;
> >>> @@ -15,8 +17,6 @@ struct sPAPRNVRAM;
> >>>  typedef struct sPAPREventLogEntry sPAPREventLogEntry;
> >>>  typedef struct sPAPREventSource sPAPREventSource;
> >>>  typedef struct sPAPRPendingHPT sPAPRPendingHPT;
> >>> -typedef struct ICSState ICSState;
> >>> -typedef struct sPAPRXive sPAPRXive;
> >>>  
> >>>  #define HPTE64_V_HPTE_DIRTY 0x0040ULL
> >>>  #define SPAPR_ENTRY_POINT   0x100
> >>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >>> index 728735d..aff4366 100644
> >>> --- a/include/hw/ppc/spapr_xive.h
> >>> +++ b/include/hw/ppc/spapr_xive.h
> >>> @@ -42,7 +42,7 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t 
> >>> lisn);
> >>>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
> >>>  qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
> >>>  
> >>> -typedef struct sPAPRMachineState sPAPRMachineState;
> >>> +#include "hw/ppc/spapr.h"  /* for sPAPRMachineState */  
> >>
> >> so both files include each other, how nice ...  
> > 
> > If the header files are mutually dependent it makes me wonder what the
> > point of having them split up is ?  
> 
> The XICS interrupt controller models are used in two different machines,
> sPAPR and PowerNV.
> 
> > Feels like either they need to be merged, or they need to be split up
> > and refactored even more to remove the mutual dependancy.  
> 
> yes or that some spap

Re: [Qemu-devel] [Spice-devel] [PATCH spice 2/3] QXL interface: deprecate spice_qxl_set_max_monitors

2019-01-09 Thread Jonathon Jongsma

Acked-by: Jonathon Jongsma 


On Tue, 2019-01-08 at 16:26 +0100, Lukáš Hrázký wrote:
> Replace it by spice_qxl_set_device_info. Note we can't use
> monitors_count for what's stored in max_monitors, because
> monitors_count
> denotes the length of the device_display_ids array, which
> spice_qxl_set_max_monitors doesn't touch.
> 
> Signed-off-by: Lukáš Hrázký 
> ---
>  server/red-qxl.c   | 1 +
>  server/spice-qxl.h | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/server/red-qxl.c b/server/red-qxl.c
> index 0ea424cd..6ffd8286 100644
> --- a/server/red-qxl.c
> +++ b/server/red-qxl.c
> @@ -888,6 +888,7 @@ void spice_qxl_set_device_info(QXLInstance
> *instance,
>  }
>  
>  instance->st->monitors_count = device_display_id_count;
> +instance->st->max_monitors = device_display_id_count;
>  }
>  
>  void red_qxl_init(RedsState *reds, QXLInstance *qxl)
> diff --git a/server/spice-qxl.h b/server/spice-qxl.h
> index 547d3d93..e7af5e5e 100644
> --- a/server/spice-qxl.h
> +++ b/server/spice-qxl.h
> @@ -101,9 +101,9 @@ void spice_qxl_monitors_config_async(QXLInstance
> *instance, QXLPHYSICAL monitors
>   int group_id, uint64_t cookie);
>  /* since spice 0.12.3 */
>  void spice_qxl_driver_unload(QXLInstance *instance);
> -/* since spice 0.12.6 */
> +/* since spice 0.12.6, deprecated since 0.14.2,
> spice_qxl_set_device_info replaces it */
>  void spice_qxl_set_max_monitors(QXLInstance *instance,
> -unsigned int max_monitors);
> +unsigned int max_monitors)
> SPICE_GNUC_DEPRECATED;
>  /* since spice 0.13.1 */
>  void spice_qxl_gl_scanout(QXLInstance *instance,
>int fd,

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

2019-01-09 Thread Cédric Le Goater

On 1/9/19 5:39 PM, Thomas Huth wrote:
> When compiling with Clang and -std=gnu99, I get the following errors:
> 
>   CC  ppc64-softmmu/hw/intc/xics_spapr.o
> In file included from hw/intc/xics_spapr.c:34:
> include/hw/ppc/xics.h:46:25: error: redefinition of typedef 'ICSState' is a 
> C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct ICSState ICSState;
> ^
> include/hw/ppc/spapr.h:18:25: note: previous definition is here
> typedef struct ICSState ICSState;
> ^
> In file included from hw/intc/xics_spapr.c:34:
> include/hw/ppc/xics.h:203:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
>   CC  ppc64-softmmu/hw/intc/spapr_xive.o
> In file included from hw/intc/spapr_xive.c:19:
> include/hw/ppc/spapr_xive.h:38:3: error: redefinition of typedef 'sPAPRXive' 
> is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> } sPAPRXive;
>   ^
> include/hw/ppc/spapr.h:20:26: note: previous definition is here
> typedef struct sPAPRXive sPAPRXive;
>  ^
> In file included from hw/intc/spapr_xive.c:19:
> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
>   CC  ppc64-softmmu/hw/char/spapr_vty.o
> In file included from hw/char/spapr_vty.c:8:
> In file included from include/hw/ppc/spapr.h:12:
> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
> 'sPAPRMachineState' is a C11 feature
>   [-Werror,-Wtypedef-redefinition]
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
> typedef struct sPAPRMachineState sPAPRMachineState;
>  ^
> 
> Since we're going to make -std=gnu99 mandatory, fix these issues
> by including the right header files indead of typedeffing stuff twice.
> 
> Signed-off-by: Thomas Huth 
> ---
>  include/hw/ppc/spapr.h  | 4 ++--
>  include/hw/ppc/spapr_xive.h | 2 +-
>  include/hw/ppc/xics.h   | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 2c77a8b..6a5ae4f 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -8,6 +8,8 @@
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/ppc/spapr_ovec.h"
>  #include "hw/ppc/spapr_irq.h"
> +#include "hw/ppc/xics.h"
> +#include "hw/ppc/spapr_xive.h"
>  
>  struct VIOsPAPRBus;
>  struct sPAPRPHBState;
> @@ -15,8 +17,6 @@ struct sPAPRNVRAM;
>  typedef struct sPAPREventLogEntry sPAPREventLogEntry;
>  typedef struct sPAPREventSource sPAPREventSource;
>  typedef struct sPAPRPendingHPT sPAPRPendingHPT;
> -typedef struct ICSState ICSState;
> -typedef struct sPAPRXive sPAPRXive;
>  
>  #define HPTE64_V_HPTE_DIRTY 0x0040ULL
>  #define SPAPR_ENTRY_POINT   0x100
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 728735d..aff4366 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -42,7 +42,7 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>  qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>  
> -typedef struct sPAPRMachineState sPAPRMachineState;
> +#include "hw/ppc/spapr.h"  /* for sPAPRMachineState */
>  
>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>  void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 14afda1..a7f49a4 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -47,6 +47,8 @@ typedef struct ICSState ICSState;
>  typedef struct ICSIRQState ICSIRQState;
>  typedef struct XICSFabric XICSFabric;
>  
> +#include "hw/ppc/spapr.h"
> +
>  #define TYPE_ICP "icp"
>  #define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
>  
> @@ -200,8 +202,6 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon);
>  void ics_resend(ICSState *ics);
>  void icp_resend(ICPState *ss);

I think the definitions below belong to another header file, 
may be "xics_spapr.h". That would be cleaner.

Thanks,

C. 

> -typedef struct sPAPRMachineState sPAPRMachineState;
> -
>  void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> uint32_t phandle);
>  in

Re: [Qemu-devel] [Spice-devel] [PATCH spice 1/3] QXL interface: add a function to identify monitors in the guest

2019-01-09 Thread Jonathon Jongsma

On Tue, 2019-01-08 at 16:26 +0100, Lukáš Hrázký wrote:
> Adds a function to let QEMU provide information to identify graphics
> devices and their monitors in the guest. The function
> (spice_qxl_set_device_info) sets the device address (e.g. a PCI path)
> and monitor ID -> device display ID mapping of displays exposed by
> given
> QXL interface.


In my previous review, I asked for a slightly more explicit explanation
of the phrase "monitor ID" in the commit log. In this case, "monitor
ID" is an ID specific to the QXL instance starting at 0 and represents
the displays that are associated with that particular QXL instance.

Perhaps it makes sense to rename this "Qxl monitor ID"?

> 
> Signed-off-by: Lukáš Hrázký 
> ---
>  server/red-qxl.c | 44 ++
>  server/spice-qxl.h   | 46
> 
>  server/spice-server.syms |  5 +
>  3 files changed, 95 insertions(+)
> 
> diff --git a/server/red-qxl.c b/server/red-qxl.c
> index 97940611..0ea424cd 100644
> --- a/server/red-qxl.c
> +++ b/server/red-qxl.c
> @@ -41,6 +41,9 @@
>  #include "red-qxl.h"
>  
>  
> +#define MAX_DEVICE_ADDRESS_LEN 256
> +#define MAX_MONITORS_COUNT 16
> +
>  struct QXLState {
>  QXLWorker qxl_worker;
>  QXLInstance *qxl;
> @@ -53,6 +56,9 @@ struct QXLState {
>  unsigned int max_monitors;
>  RedsState *reds;
>  RedWorker *worker;
> +char device_address[MAX_DEVICE_ADDRESS_LEN];
> +uint32_t device_display_ids[MAX_MONITORS_COUNT];
> +size_t monitors_count;  // length of ^^^
>  
>  pthread_mutex_t scanout_mutex;
>  SpiceMsgDisplayGlScanoutUnix scanout;
> @@ -846,6 +852,44 @@ void red_qxl_gl_draw_async_complete(QXLInstance
> *qxl)
>  red_qxl_async_complete(qxl, cookie);
>  }
>  
> +SPICE_GNUC_VISIBLE
> +void spice_qxl_set_device_info(QXLInstance *instance,
> +   const char *device_address,
> +   uint32_t device_display_id_start,
> +   uint32_t device_display_id_count)
> +{
> +g_return_if_fail(device_address != NULL);

Just a thought: what if qemu calls this function twice for the same
instance. In theory this shouldn't happen, but should we be defensive
and handle it somehow? Print a warning? Just overwrite the previous
mapping? Right now it looks like we would just overwrite.

> +
> +size_t da_len = strnlen(device_address, MAX_DEVICE_ADDRESS_LEN);
> +if (da_len >= MAX_DEVICE_ADDRESS_LEN) {
> +spice_error("Device address too long: %lu > %u", da_len,
> MAX_DEVICE_ADDRESS_LEN);

Technically, I think the format string for a size_t variable is %zu,
not %lu

> +return;
> +}
> +
> +if (device_display_id_count > MAX_MONITORS_COUNT) {
> +spice_error("Device display ID count (%u) is greater than
> limit %u",
> +device_display_id_count,
> +MAX_MONITORS_COUNT);
> +return;
> +}
> +
> +strncpy(instance->st->device_address, device_address,
> MAX_DEVICE_ADDRESS_LEN);
> +
> +g_debug("QXL Instance %d setting device address \"%s\" and
> monitor -> device display mapping:",
> +instance->id,
> +device_address);
> +
> +// store the mapping monitor_id -> device_display_id
> +for (uint32_t monitor_id = 0; monitor_id <
> device_display_id_count; ++monitor_id) {
> +uint32_t device_display_id = device_display_id_start +
> monitor_id;
> +instance->st->device_display_ids[monitor_id] =
> device_display_id;
> +g_debug("   monitor ID %u -> device display ID %u",
> +monitor_id, device_display_id);
> +}
> +
> +instance->st->monitors_count = device_display_id_count;
> +}
> +
>  void red_qxl_init(RedsState *reds, QXLInstance *qxl)
>  {
>  QXLState *qxl_state;
> diff --git a/server/spice-qxl.h b/server/spice-qxl.h
> index 0c4e75fc..547d3d93 100644
> --- a/server/spice-qxl.h
> +++ b/server/spice-qxl.h
> @@ -115,6 +115,52 @@ void spice_qxl_gl_draw_async(QXLInstance
> *instance,
>   uint32_t w, uint32_t h,
>   uint64_t cookie);
>  
> +/* since spice 0.14.2 */
> +
> +/**
> + * spice_qxl_set_device_info:
> + * @instance the QXL instance to set the path to
> + * @device_address the path of the device this QXL instance belongs
> to
> + * @device_display_id_start the starting device display ID of this
> interface,
> + *  i.e. the one monitor ID 0 maps to

perhaps reword this last line:
i.e. the device display ID of monitor ID 0

> + * @device_display_id_count the total number of device display IDs
> on this
> + *  interface
> + *
> + * Sets the device information for this QXL interface, i.e. the
> hardware
> + * address (e.g. PCI) of the graphics device and the IDs of the
> displays of the
> + * graphics device that are exposed by this interface (device
> display IDs).
> + *
> + * The supported device

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] include/hw/ppc: Fix compilation with clang -std=gnu99

2019-01-09 Thread Cédric Le Goater

On 1/9/19 6:28 PM, Daniel P. Berrangé wrote:
> On Wed, Jan 09, 2019 at 06:13:01PM +0100, Cédric Le Goater wrote:
>> On 1/9/19 5:39 PM, Thomas Huth wrote:
>>> When compiling with Clang and -std=gnu99, I get the following errors:
>>>
>>>   CC  ppc64-softmmu/hw/intc/xics_spapr.o
>>> In file included from hw/intc/xics_spapr.c:34:
>>> include/hw/ppc/xics.h:46:25: error: redefinition of typedef 'ICSState' is a 
>>> C11 feature
>>>   [-Werror,-Wtypedef-redefinition]
>>> typedef struct ICSState ICSState;
>>> ^
>>> include/hw/ppc/spapr.h:18:25: note: previous definition is here
>>> typedef struct ICSState ICSState;
>>> ^
>>> In file included from hw/intc/xics_spapr.c:34:
>>> include/hw/ppc/xics.h:203:34: error: redefinition of typedef 
>>> 'sPAPRMachineState' is a C11 feature
>>>   [-Werror,-Wtypedef-redefinition]
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>>   CC  ppc64-softmmu/hw/intc/spapr_xive.o
>>> In file included from hw/intc/spapr_xive.c:19:
>>> include/hw/ppc/spapr_xive.h:38:3: error: redefinition of typedef 
>>> 'sPAPRXive' is a C11 feature
>>>   [-Werror,-Wtypedef-redefinition]
>>> } sPAPRXive;
>>>   ^
>>> include/hw/ppc/spapr.h:20:26: note: previous definition is here
>>> typedef struct sPAPRXive sPAPRXive;
>>>  ^
>>> In file included from hw/intc/spapr_xive.c:19:
>>> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
>>> 'sPAPRMachineState' is a C11 feature
>>>   [-Werror,-Wtypedef-redefinition]
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>>   CC  ppc64-softmmu/hw/char/spapr_vty.o
>>> In file included from hw/char/spapr_vty.c:8:
>>> In file included from include/hw/ppc/spapr.h:12:
>>> include/hw/ppc/spapr_xive.h:45:34: error: redefinition of typedef 
>>> 'sPAPRMachineState' is a C11 feature
>>>   [-Werror,-Wtypedef-redefinition]
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>> include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here
>>> typedef struct sPAPRMachineState sPAPRMachineState;
>>>  ^
>>>
>>> Since we're going to make -std=gnu99 mandatory, fix these issues
>>> by including the right header files indead of typedeffing stuff twice.
>>>
>>> Signed-off-by: Thomas Huth 
>>> ---
>>>  include/hw/ppc/spapr.h  | 4 ++--
>>>  include/hw/ppc/spapr_xive.h | 2 +-
>>>  include/hw/ppc/xics.h   | 4 ++--
>>>  3 files changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>> index 2c77a8b..6a5ae4f 100644
>>> --- a/include/hw/ppc/spapr.h
>>> +++ b/include/hw/ppc/spapr.h
>>> @@ -8,6 +8,8 @@
>>>  #include "hw/mem/pc-dimm.h"
>>>  #include "hw/ppc/spapr_ovec.h"
>>>  #include "hw/ppc/spapr_irq.h"
>>> +#include "hw/ppc/xics.h"
>>> +#include "hw/ppc/spapr_xive.h"
>>>  
>>>  struct VIOsPAPRBus;
>>>  struct sPAPRPHBState;
>>> @@ -15,8 +17,6 @@ struct sPAPRNVRAM;
>>>  typedef struct sPAPREventLogEntry sPAPREventLogEntry;
>>>  typedef struct sPAPREventSource sPAPREventSource;
>>>  typedef struct sPAPRPendingHPT sPAPRPendingHPT;
>>> -typedef struct ICSState ICSState;
>>> -typedef struct sPAPRXive sPAPRXive;
>>>  
>>>  #define HPTE64_V_HPTE_DIRTY 0x0040ULL
>>>  #define SPAPR_ENTRY_POINT   0x100
>>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>>> index 728735d..aff4366 100644
>>> --- a/include/hw/ppc/spapr_xive.h
>>> +++ b/include/hw/ppc/spapr_xive.h
>>> @@ -42,7 +42,7 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
>>>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>>>  qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>>>  
>>> -typedef struct sPAPRMachineState sPAPRMachineState;
>>> +#include "hw/ppc/spapr.h"  /* for sPAPRMachineState */
>>
>> so both files include each other, how nice ...
> 
> If the header files are mutually dependent it makes me wonder what the
> point of having them split up is ?

The XICS interrupt controller models are used in two different machines,
sPAPR and PowerNV.

> Feels like either they need to be merged, or they need to be split up
> and refactored even more to remove the mutual dependancy.

yes or that some spapr_* definitions do not belong to the common xics.h
file.
 
Thanks,

C.

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2019-01-09 Thread Max Reitz

On 09.01.19 18:20, Kevin Wolf wrote:
> Am 09.01.2019 um 18:05 hat Max Reitz geschrieben:
>> On 09.01.19 17:57, Daniel Henrique Barboza wrote:
>>>
>>>
>>> On 1/9/19 12:10 PM, Max Reitz wrote:
 On 06.09.18 13:11, Daniel Henrique Barboza wrote:
> changes in v2:
> - removed the "RFC" marker;
> - added a new patch (patch 2) that removes
> bdrv_snapshot_delete_by_id_or_name from the code;
> - made changes in patch 1 as suggested by Murilo;
> - previous patch set link:
> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
>
>
> It is not uncommon to see bugs being opened by testers that attempt to
> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> common snapshot names and they trigger a lot of bugs. I gave an example
> in the commit message of patch 1, but to sum up here: QEMU treats the
> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
> is documented as such, but this can lead to strange situations.
>
> Given that it is strange for an API to consider a parameter to be 2
> fields
> at the same time, and inadvently treating them as one or the other, and
> that removing the ID field is too drastic, my idea here is to keep the
> ID field for internal control, but do not let the user set it.
>
> I guess there's room for discussion about considering this change an API
> change or not. It doesn't affect users of HMP and it doesn't affect
> Libvirt,
> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
 (Yes, very late reply, I'm sorry...)

 I do think it affects users of HMP, because right now you can delete
 snapshots with their ID, and after this series you cannot.
>>>
>>> That's true. My idea here was simple: the user can't reliably exclude
>>> via snapshot ID today
>>> because we're hiding the ID field in info snapshots:
>>>
>>>
>>>     (qemu) savevm 0
>>>     (qemu) info snapshots
>>>     List of snapshots present on all disks:
>>>     ID    TAG VM SIZE    DATE VM CLOCK
>>>     --    0  741M 2018-07-31 13:39:56 00:41:25.313
>>>
>>>
>>> Thus, what will end up happening is that the user will be forced to use the
>>> TAG of the snapshot since this is the only available information.
>>
>> But you can get it through e.g. qemu-img info.
>>
>> Snapshot list:
>> IDTAG VM SIZEDATE   VM CLOCK
>> 1 0  1.6M 2019-01-09 18:01:21   00:00:02.657
>>
>> So it's not impossible to get.
> 
> Is there a reason why we should display the ID at all when you can't use
> it any more to identify snapshots?

I thought the long-term goal of this series really was to remove all
mentions of the ID, yes.

 I think we had a short discussion about just disallowing numeric
 snapshot names.  How bad would that be?
>>>
>>>
>>> This was my first idea when evaluating what to do in this case. I gave
>>> it up because
>>> I found it to be too extreme. People would start complaining "I was able
>>> to do
>>> savevm 0 and now I can't".
>>
>> True.  But it wouldn't be impossible to do, we'd need to deprecate
>> numeric names, print a warning for two releases, and then we can make it
>> an error.
> 
> This. Is. HMP.
> 
> Not a stable ABI, no deprecation period of two releases.

Well, if you want to do it.

This may be HMP, but this is also the only interface to savevm, so it's
not like users have a choice to use a more stable interface.  I know
that was a conscious decision, more or less, but I don't see why we need
to be so nasty when the hardest thing about doing a nice deprecation
would be to remember to make it an error in half a year.

>> Hm...  If we had a proper deprecation warning in this series, I suppose
>> it wouldn't be dangerous anymore.  Can we just print a warning whenever
>> the user specified an ID?  (i.e. if some snapshot's ID matches the
>> string given by the user and the snapshot's name does not)
> 
> How is it less of a problem when a user gets QEMU from the distro and
> skips five releases? They will never see the warning. This is different
> from QMP where things are fixed in libvirt, which is tested with every
> single QEMU release.

Well, then allowing users to accidentally specify the wrong snapshot
remains adventurous.

The thing is that this really is the only interface to savevm.  So while
it may be HMP, it probably won't be used only by plain human end users
who just don't want to deal with QMP.  I can imagine it is used with
scripts and by people who regularly update their qemu because they have
something important running on top of it.


Actually, to me what you're saying sounds more like "Our deprecation
policy is useless" to which I wholeheartedly agree.  I think we should
only remove things in major releases, and only if it was deprecated in
the previous major release already.  (So if y

Re: [Qemu-devel] [PATCH v3] qdev/core: fix qbus_is_full()

2019-01-09 Thread Halil Pasic

On Wed, 9 Jan 2019 10:36:11 -0500
Tony Krowiak  wrote:

> On 1/9/19 5:14 AM, Cornelia Huck wrote:
> > On Tue, 8 Jan 2019 15:34:37 -0500
> > Tony Krowiak  wrote:
> > 
> >> On 1/8/19 12:06 PM, Cornelia Huck wrote:
> >>> On Tue, 8 Jan 2019 17:50:21 +0100
> >>> Halil Pasic  wrote:
> >>>
>  On Tue, 8 Jan 2019 17:31:13 +0100
>  Cornelia Huck  wrote:
>    
> > On Tue, 8 Jan 2019 11:08:56 -0500
> > Tony Krowiak  wrote:
> >   
> >> On 12/17/18 10:57 AM, Tony Krowiak wrote:
> >>> The qbus_is_full(BusState *bus) function (qdev_monitor.c) compares 
> >>> the max_index
> >>> value of the BusState structure with the max_dev value of the 
> >>> BusClass structure
> >>> to determine whether the maximum number of children has been reached 
> >>> for the
> >>> bus. The problem is, the max_index field of the BusState structure 
> >>> does not
> >>> necessarily reflect the number of devices that have been plugged into
> >>> the bus.
> >>>
> >>> Whenever a child device is plugged into the bus, the bus's max_index 
> >>> value is
> >>> assigned to the child device and then incremented. If the child is 
> >>> subsequently
> >>> unplugged, the value of the max_index does not change and no longer 
> >>> reflects the
> >>> number of children.
> >>>
> >>> When the bus's max_index value reaches the maximum number of devices
> >>> allowed for the bus (i.e., the max_dev field in the BusClass 
> >>> structure),
> >>> attempts to plug another device will be rejected claiming that the 
> >>> bus is
> >>> full -- even if the bus is actually empty.
> >>>
> >>> To resolve the problem, a new 'num_children' field is being added to 
> >>> the
> >>> BusState structure to keep track of the number of children plugged 
> >>> into the
> >>> bus. It will be incremented when a child is plugged, and decremented 
> >>> when a
> >>> child is unplugged.
> >>>
> >>> Signed-off-by: Tony Krowiak 
> >>> Reviewed-by: Pierre Morel
> >>> Reviewed-by: Halil Pasic 
> >>> ---
> >>> hw/core/qdev.c | 3 +++
> >>> include/hw/qdev-core.h | 1 +
> >>> qdev-monitor.c | 2 +-
> >>> 3 files changed, 5 insertions(+), 1 deletion(-)
> >>
> >> Ping ...
> >> I could not determine who the maintainer is for the three files
> >> listed above. I checked the MAINTAINERS file and the prologue of each
> >> individual file. Can someone please tell me who is responsible
> >> for merging these changes? Any additional review comments?
> >>   
> >>>
> >>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> >>> index 6b3cc55b27c2..956923f33520 100644
> >>> --- a/hw/core/qdev.c
> >>> +++ b/hw/core/qdev.c
> >>> @@ -59,6 +59,8 @@ static void bus_remove_child(BusState *bus, 
> >>> DeviceState *child)
> >>> snprintf(name, sizeof(name), "child[%d]", kid->index);
> >>> QTAILQ_REMOVE(&bus->children, kid, sibling);
> >>> 
> >>> +bus->num_children--;
> >>> +
> >>> /* This gives back ownership of kid->child back to 
> >>> us.  */
> >>> object_property_del(OBJECT(bus), name, NULL);
> >>> object_unref(OBJECT(kid->child));
> >>> @@ -73,6 +75,7 @@ static void bus_add_child(BusState *bus, 
> >>> DeviceState *child)
> >>> char name[32];
> >>> BusChild *kid = g_malloc0(sizeof(*kid));
> >>> 
> >>> +bus->num_children++;
> >>> kid->index = bus->max_index++;
> >
> > Hm... I'm wondering what happens for insane numbers of hotplugging
> > operations here?
> >
> > (Preexisting problem for busses without limit, but busses with a limit
> > could now run into that as well.)
> >   
> 
>  How does this patch change things? I mean bus->num_children gets
>  decremented on unplug.
> >>>
> >>> We don't stop anymore if max_index >= max_dev, which means that we can
> >>> now trigger that even if max_dev != 0.
> >>
> >> I guess I am missing your point. If max_dev == 0, then there is nothing
> >> stopping an insane number of hot plug operations; either before this
> >> patch, or with this patch. With the patch, once the number of children
> >> hot plugged reaches max_dev, the qbus_is_full function will return false
> >> and no more children can be plugged. If a child device is unplugged,
> >> the num_children - which counts the number of children attached to the
> >> bus - will be decremented, so it always reflects the number of children
> >> added to the bus. Besides, checking max_index against max_dev
> >> is erroneous, because max_index is incremented every time a child device
> >> is plugged and is never decremented. It really operates as more of a
> >> uinique identifier than a counter and is also used to crea

[Qemu-devel] Internship idea: virtio-blk oss-fuzz support

2019-01-09 Thread Stefan Hajnoczi

Hi folks,
I'd like to start fuzzing emulated devices in QEMU.  Here is an
internship project idea I'm proposing to do this.

Any thoughts?  Want to co-mentor this in Google Summer of Code or Outreachy?

Stefan

'''Summary:''' Integrate oss-fuzz into QEMU so that the virtio-blk
device can be fuzz tested.

oss-fuzz offers a fuzz testing service to open source projects.  This
means random inputs are continuously tested against the program in
order to find crashes and other bugs.  Fuzz testing complements
hand-written test suites by exploring the input space of a program and
therefore the code paths that may be taken.

The goal of this project is to integrate oss-fuzz into QEMU so that
the virtio-blk-pci device can be fuzzed at both the VIRTIO and PCI bus
level.  virtio-blk-pci is a PCI device, which means it is connected to
the virtual machine's PCI bus and has a certain set of registers that
can be programmed by the guest.  Furthermore, it is a VIRTIO device -
this is the specification the describes most of the functionality of
virtio-blk.  Bugs exist at both the PCI and VIRTIO levels, so it's
important to fuzz both of them.

Fuzzing emulated devices involves accessing their hardware registers
randomly to make the device respond.  QEMU has a device testing
interface called "qtest" that accepts read/write and other commands
over a socket and is ideal for writing device-level tests.  You may
find that oss-fuzz works better integrated directly into the QEMU
program instead of as a separate qtest program, so you can consider
adding a new command-line option to QEMU for running in oss-fuzz mode.

This project involves learning about VIRTIO and PCI devices, as well
as figuring out how to integrate oss-fuzz into QEMU so that it can
effective explore the code paths in virtio-blk device emulation code.
You will enjoy this project if you want to learn how device emulation
works and are interested in fuzzers.

'''Links:'''
* [https://github.com/google/oss-fuzz/blob/master/docs/ideal_integration.md
oss-fuzz integration overview]
* 
[https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md
libfuzzer tutorial]
* [http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html
VIRTIO specification]
* [https://wiki.osdev.org/PCI PCI bus overview]

'''Details:'''
* Skill level: intermediate
* Language: C
* Mentor: Stefan Hajnoczi

Re: [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2019-01-09 Thread Daniel Henrique Barboza





On 1/9/19 3:05 PM, Max Reitz wrote:

On 09.01.19 17:57, Daniel Henrique Barboza wrote:


On 1/9/19 12:10 PM, Max Reitz wrote:

On 06.09.18 13:11, Daniel Henrique Barboza wrote:

changes in v2:
- removed the "RFC" marker;
- added a new patch (patch 2) that removes
bdrv_snapshot_delete_by_id_or_name from the code;
- made changes in patch 1 as suggested by Murilo;
- previous patch set link:
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html


It is not uncommon to see bugs being opened by testers that attempt to
create VM snapshots using HMP. It turns out that "0" and "1" are quite
common snapshot names and they trigger a lot of bugs. I gave an example
in the commit message of patch 1, but to sum up here: QEMU treats the
input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
is documented as such, but this can lead to strange situations.

Given that it is strange for an API to consider a parameter to be 2
fields
at the same time, and inadvently treating them as one or the other, and
that removing the ID field is too drastic, my idea here is to keep the
ID field for internal control, but do not let the user set it.

I guess there's room for discussion about considering this change an API
change or not. It doesn't affect users of HMP and it doesn't affect
Libvirt,
but simplifying the meaning of the parameters of savevm/loadvm/delvm.

(Yes, very late reply, I'm sorry...)

I do think it affects users of HMP, because right now you can delete
snapshots with their ID, and after this series you cannot.

That's true. My idea here was simple: the user can't reliably exclude
via snapshot ID today
because we're hiding the ID field in info snapshots:


     (qemu) savevm 0
     (qemu) info snapshots
     List of snapshots present on all disks:
     ID    TAG VM SIZE    DATE VM CLOCK
     --    0  741M 2018-07-31 13:39:56 00:41:25.313


Thus, what will end up happening is that the user will be forced to use the
TAG of the snapshot since this is the only available information.

But you can get it through e.g. qemu-img info.

Snapshot list:
IDTAG VM SIZEDATE   VM CLOCK
1 0  1.6M 2019-01-09 18:01:21   00:00:02.657

So it's not impossible to get.


Hmpf  should we obscure the ID in this case as well? I mean, why we
have the ID information here but not in "info snapshots"?



I think we had a short discussion about just disallowing numeric
snapshot names.  How bad would that be?


This was my first idea when evaluating what to do in this case. I gave
it up because
I found it to be too extreme. People would start complaining "I was able
to do
savevm 0 and now I can't".

True.  But it wouldn't be impossible to do, we'd need to deprecate
numeric names, print a warning for two releases, and then we can make it
an error.

Hm...  If we had a proper deprecation warning in this series, I suppose
it wouldn't be dangerous anymore.  Can we just print a warning whenever
the user specified an ID?  (i.e. if some snapshot's ID matches the
string given by the user and the snapshot's name does not)



I can live with that.





Max

Re: [Qemu-devel] [PATCH v3 1/6] smbus: Add a helper to generate SPD EEPROM data