Re: [PATCH 10/10] vhost-user-blk: propagate error return from generic vhost

2021-11-29 Thread Raphael Norwitz
Ditto - not for 6.2.

I'm happy with this once the vhost and vhost-user patches go in.

Looks like vhost-user-vgpu, vhost-user-input and vhost-user-vsock also
return -1 on vhost_user_*_handle_config_change, so presumably those
should be fixed too.

On Thu, Nov 11, 2021 at 06:33:54PM +0300, Roman Kagan wrote:
> Fix the only callsite that doesn't propagate the error code from the
> generic vhost code.
> 
> Signed-off-by: Roman Kagan 
> ---

Reviewed-by: Raphael Norwitz 

>  hw/block/vhost-user-blk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index f9b17f6813..ab11ce8252 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -100,7 +100,7 @@ static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
>  &local_err);
>  if (ret < 0) {
>  error_report_err(local_err);
> -return -1;
> +return ret;
>  }
>  
>  /* valid for resize only */
> -- 
> 2.33.1
> 
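
As an aside for readers of the archive: a minimal sketch of the return convention this patch moves to. It is illustrative only (not code from the series); the vhost-user-gpu/-input/-vsock handlers mentioned above would presumably get an analogous change.

/* Sketch: a config-change handler that forwards the negative errno it
 * receives from the generic vhost code instead of flattening it to -1.
 * Assumes the usual QEMU vhost/virtio-blk headers. */
static int example_handle_config_change(struct vhost_dev *dev)
{
    struct virtio_blk_config blkcfg;
    Error *local_err = NULL;
    int ret;

    ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg, sizeof(blkcfg),
                               &local_err);
    if (ret < 0) {
        error_report_err(local_err);
        return ret;    /* propagate -errno rather than collapsing to -1 */
    }
    return 0;
}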


Re: [PATCH 01/10] vhost-user-blk: reconnect on any error during realize

2021-11-29 Thread Raphael Norwitz
As mst said, not for 6.2.

On Thu, Nov 11, 2021 at 06:33:45PM +0300, Roman Kagan wrote:
> vhost-user-blk realize only attempts to reconnect if the previous
> connection attempt failed on "a problem with the connection and not an
> error related to the content (which would fail again the same way in the
> next attempt)".
> 
> However this distinction is very subtle, and may be inadvertently broken
> if the code changes somewhere deep down the stack and a new error gets
> propagated up to here.
> 
> OTOH now that the number of reconnection attempts is limited it seems
> harmless to try reconnecting on any error.
> 
> So relax the condition of whether to retry connecting to check for any
> error.
> 
> This patch amends a527e312b5 "vhost-user-blk: Implement reconnection
> during realize".
> 
> Signed-off-by: Roman Kagan 

Reviewed-by: Raphael Norwitz 

> ---
>  hw/block/vhost-user-blk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index ba13cb87e5..f9b17f6813 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -511,7 +511,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
>  *errp = NULL;
>  }
>  ret = vhost_user_blk_realize_connect(s, errp);
> -} while (ret == -EPROTO && retries--);
> +} while (ret < 0 && retries--);
>  
>  if (ret < 0) {
>  goto virtio_err;
> -- 
> 2.33.1
> 
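
For context (not part of the patch email): roughly the shape of the loop the changed condition sits in, reconstructed from the diff context above. The retry-bound constant and the exact error handling may differ from the real code.

    retries = REALIZE_CONNECTION_RETRIES;    /* small fixed bound on attempts */
    assert(!*errp);
    do {
        if (*errp) {
            error_prepend(errp, "Reconnecting after error: ");
            error_report_err(*errp);
            *errp = NULL;
        }
        ret = vhost_user_blk_realize_connect(s, errp);
    } while (ret < 0 && retries--);          /* was: ret == -EPROTO && retries-- */

    if (ret < 0) {
        goto virtio_err;
    }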


Re: [PATCH 01/10] vhost-user-blk: reconnect on any error during realize

2021-11-29 Thread Raphael Norwitz
> > 
> > I see. I hadn't looked at the rest of the series yet because I ran out
> > of time, but now that I'm skimming them, I see quite a few places that
> > use non-EPROTO, but I wonder which of them actually should be
> > reconnected. So far all I saw were presumably persistent errors where a
> > retry won't help. Can you give me some examples?
> 
> E.g. the particular case you mention earlier, -ECONNREFUSED, is not
> unlikely to happen due to the vhost-user server restart for maintenance;
> in this case retrying looks like a reasonable thing to do, doesn't it?
>

Seems like a net-positive to me, especially with the cleanups in the
rest of the series, but I don't feel strongly.

> Thanks,
> Roman.
> 


Re: [PATCH] hw/vhost-user-blk: turn on VIRTIO_BLK_F_SIZE_MAX feature for virtio blk device

2021-11-29 Thread Raphael Norwitz
Just a commit message nit. Otherwise I'm happy with this. OFC should not
be queued for 6.2.

On Fri, Nov 26, 2021 at 10:00:18AM +0800, Andy Pei wrote:
> Turn on pre-defined feature VIRTIO_BLK_F_SIZE_MAX virtio blk device
> to avoid guest DMA request size is too large to exceed hardware spec.

Grammar here. Should be something like "...DMA request sizes which are
too large for the hardware spec".

> 
> Signed-off-by: Andy Pei 

Acked-by: Raphael Norwitz 

> ---
>  hw/block/vhost-user-blk.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index ba13cb8..eb1264a 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -252,6 +252,7 @@ static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
>  
>  /* Turn on pre-defined features */
> +virtio_add_feature(&features, VIRTIO_BLK_F_SIZE_MAX);
>  virtio_add_feature(&features, VIRTIO_BLK_F_SEG_MAX);
>  virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
>  virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
> -- 
> 1.8.3.1
> 


[PATCH for-6.2 v2] block/nbd: forbid incompatible change of server options on reconnect

2021-11-29 Thread Vladimir Sementsov-Ogievskiy
Reconnect feature was never prepared to handle server options changed
on reconnect. Let's be stricter and check what exactly is changed. If
server capabilities just got richer don't worry. Otherwise fail and
drop the established connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: by Eric's comments:
 - drop extra check about old->min_block % new->min_block
 - make the context_id check itself conditional
 - don't handle READ_ONLY flag here (see comment in code)
 - wording

 The code seems quite obvious, but honestly I still haven't tested that it
 does what it should :( And I'm afraid QEMU doesn't really provide a good
 way to do so.

 Eric, maybe you know some simple way to test it with nbdkit?

 include/block/nbd.h |  9 +
 nbd/client-connection.c | 88 +
 2 files changed, 97 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 78d101b774..9e1943d24c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -157,6 +157,10 @@ enum {
 #define NBD_FLAG_SEND_RESIZE   (1 << NBD_FLAG_SEND_RESIZE_BIT)
 #define NBD_FLAG_SEND_CACHE(1 << NBD_FLAG_SEND_CACHE_BIT)
 #define NBD_FLAG_SEND_FAST_ZERO(1 << NBD_FLAG_SEND_FAST_ZERO_BIT)
+/*
+ * WARNING! If you add any new NBD_FLAG_ flag, check that logic in
+ * nbd_is_new_info_compatible() is still good about handling flags.
+ */
 
 /* New-style handshake (global) flags, sent from server to client, and
control what will happen during handshake phase. */
@@ -305,6 +309,11 @@ struct NBDExportInfo {
 
 uint32_t context_id;
 
+/*
+ * WARNING! When adding any new field to the structure, don't forget
+ * to check and update the nbd_is_new_info_compatible() function.
+ */
+
 /* Set by server results during nbd_receive_export_list() */
 char *description;
 int n_contexts;
diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 695f855754..d50c187482 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -37,6 +37,10 @@ struct NBDClientConnection {
 bool do_negotiation;
 bool do_retry;
 
+/* Used only by connection thread, does not need mutex protection */
+bool has_prev_info;
+NBDExportInfo prev_info;
+
 QemuMutex mutex;
 
 /*
@@ -160,6 +164,69 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr,
 return 0;
 }
 
+static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new,
+   Error **errp)
+{
+uint32_t dropped_flags;
+
+if (old->structured_reply && !new->structured_reply) {
+error_setg(errp, "Server options degraded after reconnect: "
+   "structured_reply is not supported anymore");
+return false;
+}
+
+if (old->base_allocation) {
+if (!new->base_allocation) {
+error_setg(errp, "Server options degraded after reconnect: "
+   "base_allocation is not supported anymore");
+return false;
+}
+
+if (old->context_id != new->context_id) {
+error_setg(errp, "Meta context id changed after reconnect");
+return false;
+}
+}
+
+if (old->size != new->size) {
+error_setg(errp, "NBD export size changed after reconnect");
+return false;
+}
+
+/*
+ * No worry if rotational status changed.
+ *
+ * Also, we can't handle NBD_FLAG_READ_ONLY properly at this level: we don't
+ * actually know, does our client need write access or not. So, it's handled
+ * in block layer in nbd_handle_updated_info().
+ *
+ * All other flags are feature flags, they should not degrade.
+ */
+dropped_flags = (old->flags & ~new->flags) &
+~(NBD_FLAG_ROTATIONAL | NBD_FLAG_READ_ONLY);
+if (dropped_flags) {
+error_setg(errp, "Server options degraded after reconnect: flags 0x%"
+   PRIx32 " are not reported anymore", dropped_flags);
+return false;
+}
+
+if (new->min_block > old->min_block) {
+error_setg(errp, "Server requires more strict min_block after "
+   "reconnect: %" PRIu32 " instead of %" PRIu32,
+   new->min_block, old->min_block);
+return false;
+}
+
+if (new->max_block < old->max_block) {
+error_setg(errp, "Server requires more strict max_block after "
+   "reconnect: %" PRIu32 " instead of %" PRIu32,
+   new->max_block, old->max_block);
+return false;
+}
+
+return true;
+}
+
 static void *connect_thread_func(void *opaque)
 {
 NBDClientConnection *conn = opaque;
@@ -183,6 +250,27 @@ static void *connect_thread_func(void *opaque)
  conn->do_negotiation ? &conn->updated_info : NULL,
  conn->tlscreds, &conn->ioc, &conn->err);
 
+if (ret == 0) {
+if (conn->has_prev_info &&
+
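
(The archived message is truncated at this point. Purely as an illustration of the apparent intent — an assumption, not the actual patch text — the connect thread would compare the fresh negotiation result against the previously saved one and treat a downgrade as a failed attempt, along these lines, using only names already visible in the patch above:)

    if (ret == 0 && conn->has_prev_info &&
        !nbd_is_new_info_compatible(&conn->prev_info,
                                    &conn->updated_info, &conn->err)) {
        /* Incompatible option change: treat it like a failed attempt.
         * Connection teardown and prev_info bookkeeping omitted here. */
        ret = -EINVAL;
    }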

Re: [PATCH 00/10] vhost: stick to -errno error return convention

2021-11-29 Thread Roman Kagan
On Sun, Nov 28, 2021 at 04:47:20PM -0500, Michael S. Tsirkin wrote:
> On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote:
> > Error propagation between the generic vhost code and the specific backends is
> > not quite consistent: some places follow "return -1 and set errno" convention,
> > while others assume "return negated errno".  Furthermore, not enough care is
> > taken not to clobber errno.
> > 
> > As a result, on certain code paths the errno resulting from a failure may get
> > overridden by another function call, and then that zero errno indicating
> > success is propagated up the stack, leading to failures being lost.  In
> > particular, we've seen errors in the communication with a vhost-user-blk slave
> > not trigger an immediate connection drop and reconnection, leaving it in a
> > broken state.
> > 
> > Rework error propagation to always return negated errno on errors and
> > correctly pass it up the stack.
> 
> Hi Roman,
> if there are bugfixes here I'll be happy to take them right now.
> The wholesale rework seems inappropriate for 6.2, I'll be
> happy to tag it for after 6.2. Pls ping me after release to help
> make sure it's not lost.

All these patches are bugfixes in one way or another.  That said, none
of the problems being addressed are recent regressions.  OTOH the
patches introduce non-zero churn and change behavior on some error
paths, so I'd suggest postponing the whole series till after 6.2 is
out.

Thanks,
Roman.
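
An aside for readers of the archive: a minimal, self-contained illustration (not code from the series) of the two error-return conventions the cover letter contrasts.

#include <errno.h>
#include <unistd.h>

/* Convention 1: return -1 and leave the error code in errno. */
static int op_sets_errno(int fd, void *buf, size_t len)
{
    if (read(fd, buf, len) < 0) {
        return -1;      /* caller must read errno before anything clobbers it */
    }
    return 0;
}

/* Convention 2: return the negated errno itself. */
static int op_returns_negerrno(int fd, void *buf, size_t len)
{
    if (read(fd, buf, len) < 0) {
        return -errno;  /* the error code travels with the return value */
    }
    return 0;
}

With the first convention, any intermediate call that succeeds (logging, cleanup, another ioctl) can reset errno before the caller inspects it, turning a failure into an apparent success — the failure mode the cover letter describes.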



Re: [RFC for-6.2] block/nbd: forbid incompatible change of server options on reconnect

2021-11-29 Thread Vladimir Sementsov-Ogievskiy

29.11.2021 22:16, Eric Blake wrote:

On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote:

Reconnect feature was never prepared to handle server options changed
on reconnect. Let's be stricter and check what exactly is changed. If
server capabilities just got richer don't worry. Otherwise fail and
drop the established connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
+/*
+ * No worry if rotational status changed. But other flags are feature flags,
+ * they should not degrade.
+ */
+dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL;
+if (dropped_flags) {
+error_setg(errp, "Server options degrade after reconnect: flags 0x%"
+   PRIx32 " are not reported anymore", dropped_flags);
+return false;
+}


Your logic is good for most flags, but somewhat wrong for
NBD_FLAG_READ_ONLY_BIT.  For cases where we are only using the block
device read-only, we don't care about changes of that bit, in either
direction.  But for cases where we want to use the block device
read-write, the bit changing from clear in the old to set in the new
server is an incompatible change that your logic failed to flag.



Oh right! Will fix it and resend soon.

--
Best regards,
Vladimir



Re: [RFC for-6.2] block/nbd: forbid incompatible change of server options on reconnect

2021-11-29 Thread Vladimir Sementsov-Ogievskiy

29.11.2021 20:34, Eric Blake wrote:

On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote:

Reconnect feature was never prepared to handle server options changed
on reconnect. Let's be stricter and check what exactly is changed. If
server capabilities just got richer don't worry. Otherwise fail and
drop the established connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! The patch is probably good for 6.2. It's an RFC because I didn't
test it yet) But I wanted to send it early, so that my proposed design is
available for discussion.


We're cutting it awfully close.  My justification for including it in
-rc3 (if we like it) is that it is a lot easier to audit that we
reject server downgrades than it is to audit whether we have a CVE
because of a server downgrade across a reconnect.  But it is not a new
regression to 6.2, so slipping it to 7.0 (if we don't feel comfortable
with the current iteration of the patch) is okay on that front.




  include/block/nbd.h |  9 +
  nbd/client-connection.c | 86 +
  2 files changed, 95 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 78d101b774..3d379b5539 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -157,6 +157,10 @@ enum {
  #define NBD_FLAG_SEND_RESIZE   (1 << NBD_FLAG_SEND_RESIZE_BIT)
  #define NBD_FLAG_SEND_CACHE(1 << NBD_FLAG_SEND_CACHE_BIT)
  #define NBD_FLAG_SEND_FAST_ZERO(1 << NBD_FLAG_SEND_FAST_ZERO_BIT)
+/*
+ * If you add any new NBD_FLAG_ flag, check that logic in
+ * nbd_is_new_info_compatible() is still good about handling flags.
+ */
  
  /* New-style handshake (global) flags, sent from server to client, and

 control what will happen during handshake phase. */
@@ -305,6 +309,11 @@ struct NBDExportInfo {
  
  uint32_t context_id;
  
+/*

+ * WARNING! when add any new field to the structure, don't forget to check


adding


+ * and updated nbd_is_new_info_compatible() function.


update the


+ */


Odd that one comment has WARNING! and the other does not.


+
  /* Set by server results during nbd_receive_export_list() */
  char *description;
  int n_contexts;
diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 695f855754..2d66993632 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -37,6 +37,10 @@ struct NBDClientConnection {
  bool do_negotiation;
  bool do_retry;
  
+/* Used only by connection thread, no need in locking the mutex */


s/no need in locking the mutex/does not need mutex protection/


+bool has_prev_info;
+NBDExportInfo prev_info;
+
  QemuMutex mutex;
  
  /*

@@ -160,6 +164,67 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr,
  return 0;
  }
  
+static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new,

+   Error **errp)
+{
+uint32_t dropped_flags;
+
+if (old->structured_reply && !new->structured_reply) {
+error_setg(errp, "Server options degrade after reconnect: "


degraded


+   "structured_reply is not supported anymore");
+return false;
+}
+
+if (old->base_allocation && !new->base_allocation) {
+error_setg(errp, "Server options degrade after reconnect: "


degraded


+   "base_allocation is not supported anymore");
+return false;
+}


Do we also need to insist that the context id value be identical, or
can our code gracefully deal with it being different?  We don't ever
send the context id, so even if we retry a CMD_BLOCK_STATUS, our real
risk is whether we will reject the new server's reply because it used
a different id than we were expecting.


+
+if (old->size != new->size) {
+error_setg(errp, "NBD export size changed after reconnect");
+return false;
+}
+
+/*
+ * No worry if rotational status changed. But other flags are feature flags,
+ * they should not degrade.
+ */
+dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL;
+if (dropped_flags) {
+error_setg(errp, "Server options degrade after reconnect: flags 0x%"


degraded


+   PRIx32 " are not reported anymore", dropped_flags);
+return false;
+}
+
+if (new->min_block > old->min_block) {
+error_setg(errp, "Server requires more strict min_block after "
+   "reconnect: %" PRIu32 " instead of %" PRIu32,
+   new->min_block, old->min_block);
+return false;
+}


Good...


+if (new->min_block && (old->min_block % new->min_block)) {
+error_setg(errp, "Server requires new min_block %" PRIu32
+   " after reconnect, incompatible with old one %" PRIu32,
+   new->min_block, old->min_block);
+return false;
+}


...but why is this one necessary?  Since min_block has to be a power

Re: [RFC for-6.2] block/nbd: forbid incompatible change of server options on reconnect

2021-11-29 Thread Eric Blake
On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote:
> Reconnect feature was never prepared to handle server options changed
> on reconnect. Let's be stricter and check what exactly is changed. If
> server capabilities just got richer don't worry. Otherwise fail and
> drop the established connection.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> +/*
> + * No worry if rotational status changed. But other flags are feature flags,
> + * they should not degrade.
> + */
> +dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL;
> +if (dropped_flags) {
> +error_setg(errp, "Server options degrade after reconnect: flags 0x%"
> +   PRIx32 " are not reported anymore", dropped_flags);
> +return false;
> +}

Your logic is good for most flags, but somewhat wrong for
NBD_FLAG_READ_ONLY_BIT.  For cases where we are only using the block
device read-only, we don't care about changes of that bit, in either
direction.  But for cases where we want to use the block device
read-write, the bit changing from clear in the old to set in the new
server is an incompatible change that your logic failed to flag.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [RFC for-6.2] block/nbd: forbid incompatible change of server options on reconnect

2021-11-29 Thread Eric Blake
On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote:
> Reconnect feature was never prepared to handle server options changed
> on reconnect. Let's be stricter and check what exactly is changed. If
> server capabilities just got richer don't worry. Otherwise fail and
> drop the established connection.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> Hi all! The patch is probably good for 6.2. It's an RFC because I didn't
> test it yet) But I wanted to send it early, so that my proposed design is
> available for discussion.

We're cutting it awfully close.  My justification for including it in
-rc3 (if we like it) is that it is a lot easier to audit that we
reject server downgrades than it is to audit whether we have a CVE
because of a server downgrade across a reconnect.  But it is not a new
regression to 6.2, so slipping it to 7.0 (if we don't feel comfortable
with the current iteration of the patch) is okay on that front.

> 
> 
>  include/block/nbd.h |  9 +
>  nbd/client-connection.c | 86 +
>  2 files changed, 95 insertions(+)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 78d101b774..3d379b5539 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -157,6 +157,10 @@ enum {
>  #define NBD_FLAG_SEND_RESIZE   (1 << NBD_FLAG_SEND_RESIZE_BIT)
>  #define NBD_FLAG_SEND_CACHE(1 << NBD_FLAG_SEND_CACHE_BIT)
>  #define NBD_FLAG_SEND_FAST_ZERO(1 << NBD_FLAG_SEND_FAST_ZERO_BIT)
> +/*
> + * If you add any new NBD_FLAG_ flag, check that logic in
> + * nbd_is_new_info_compatible() is still good about handling flags.
> + */
>  
>  /* New-style handshake (global) flags, sent from server to client, and
> control what will happen during handshake phase. */
> @@ -305,6 +309,11 @@ struct NBDExportInfo {
>  
>  uint32_t context_id;
>  
> +/*
> + * WARNING! when add any new field to the structure, don't forget to check

adding

> + * and updated nbd_is_new_info_compatible() function.

update the

> + */

Odd that one comment has WARNING! and the other does not.

> +
>  /* Set by server results during nbd_receive_export_list() */
>  char *description;
>  int n_contexts;
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 695f855754..2d66993632 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -37,6 +37,10 @@ struct NBDClientConnection {
>  bool do_negotiation;
>  bool do_retry;
>  
> +/* Used only by connection thread, no need in locking the mutex */

s/no need in locking the mutex/does not need mutex protection/

> +bool has_prev_info;
> +NBDExportInfo prev_info;
> +
>  QemuMutex mutex;
>  
>  /*
> @@ -160,6 +164,67 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr,
>  return 0;
>  }
>  
> +static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new,
> +   Error **errp)
> +{
> +uint32_t dropped_flags;
> +
> +if (old->structured_reply && !new->structured_reply) {
> +error_setg(errp, "Server options degrade after reconnect: "

degraded

> +   "structured_reply is not supported anymore");
> +return false;
> +}
> +
> +if (old->base_allocation && !new->base_allocation) {
> +error_setg(errp, "Server options degrade after reconnect: "

degraded

> +   "base_allocation is not supported anymore");
> +return false;
> +}

Do we also need to insist that the context id value be identical, or
can our code gracefully deal with it being different?  We don't ever
send the context id, so even if we retry a CMD_BLOCK_STATUS, our real
risk is whether we will reject the new server's reply because it used
a different id than we were expecting.

> +
> +if (old->size != new->size) {
> +error_setg(errp, "NBD export size changed after reconnect");
> +return false;
> +}
> +
> +/*
> + * No worry if rotational status changed. But other flags are feature flags,
> + * they should not degrade.
> + */
> +dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL;
> +if (dropped_flags) {
> +error_setg(errp, "Server options degrade after reconnect: flags 0x%"

degraded

> +   PRIx32 " are not reported anymore", dropped_flags);
> +return false;
> +}
> +
> +if (new->min_block > old->min_block) {
> +error_setg(errp, "Server requires more strict min_block after "
> +   "reconnect: %" PRIu32 " instead of %" PRIu32,
> +   new->min_block, old->min_block);
> +return false;
> +}

Good...

> +if (new->min_block && (old->min_block % new->min_block)) {
> +error_setg(errp, "Server requires new min_block %" PRIu32
> +   " after reconnect, incompatible with old one %" PRIu32,
> +  

Re: [PATCH 0/4] block: minor refactoring in preparation to the block layer API split

2021-11-29 Thread Stefan Hajnoczi
On Wed, Nov 24, 2021 at 01:36:36AM -0500, Emanuele Giuseppe Esposito wrote:
> These patches are taken from my old patches and feedback of
> my series "block layer: split block APIs in global state and I/O".
> 
> The reason for a separate series is that the original one is
> already too long, and these patches are just refactoring the code,
> mainly deleting or moving functions in blockdev.h and block_int.h.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> 
> Emanuele Giuseppe Esposito (4):
>   block_int: make bdrv_backing_overridden static
>   include/sysemu/blockdev.h: rename if_name in block_if_name
>   include/sysemu/blockdev.h: move drive_add and inline drive_def
>   include/sysemu/blockdev.h: remove drive_get_max_devs
> 
>  include/block/block_int.h  |  3 --
>  include/sysemu/blockdev.h  |  7 ++---
>  block.c|  4 ++-
>  block/monitor/block-hmp-cmds.c |  2 +-
>  blockdev.c | 54 --
>  softmmu/vl.c   | 25 +++-
>  6 files changed, 36 insertions(+), 59 deletions(-)
> 
> -- 
> 2.27.0
> 

Reviewed-by: Stefan Hajnoczi 




Re: [PATCH v5 00/31] block layer: split block APIs in global state and I/O

2021-11-29 Thread Stefan Hajnoczi
On Wed, Nov 24, 2021 at 01:43:47AM -0500, Emanuele Giuseppe Esposito wrote:
> v5 -> v6:
> * In short, apply all Hanna's comments. More in details,
>   the following functions in the following headers have been moved:
>   block-backend:
> blk_replace_bs (to gs)
> blk_nb_sectors (to io)
> blk_name (to io)
> blk_set_perm (to io)
> blk_get_perm (to io)
> blk_drain (to io)
> blk_abort_aio_request (to io)
> blk_make_empty (to gs)
> blk_invalidate_cache (was in io, but had GS assertion)
> blk_aio_cancel (to gs)
>   block:
> bdrv_replace_child_bs (to gs)
> bdrv_get_device_name (to io)
> bdrv_get_device_or_node_name (to io)
> bdrv_drained_end_no_poll (to io)
> bdrv_block_status (was in io, but had GS assertion)
> bdrv_drain (to io)
> bdrv_co_drain (to io)
> bdrv_make_zero (was in GS, but did not have the assertion)
> bdrv_save_vmstate (to io)
> bdrv_load_vmstate (to io)
> bdrv_aio_cancel_async (to io)
>   block_int:
> bdrv_get_parent_name (to io)
> bdrv_apply_subtree_drain (to io)
> bdrv_unapply_subtree_drain (to io)
> bdrv_co_copy_range_from (was in io, but had GS assertion)
> bdrv_co_copy_range_to (was in io, but had GS assertion)
> ->bdrv_save_vmstate (to io)
> ->bdrv_load_vmstate (to io)
> 
>   coding style (assertion after definitions):
> bdrv_save_vmstate
> bdrv_load_vmstate
> block_job_next
> block_job_get
> 
>   new patches:
> block.c: modify .attach and .detach callbacks of child_of_bds
> introduce pre_run as JobDriver callback to handle
>   bdrv_co_amend usage of permission function
> leave blk_set/get_perm as a TODO in fuse.c
> make sure bdrv_co_invalidate_cache does not use permissions
>   if BQL is not held
> 
>   minor changes:
> put back TODO for include block/block.h in block-backend-common.h
> rebase on kwolf/block branch
> modify where are used assert_bdrv_graph_writable, due to rebase

These changes sound fine to me. Hanna or Kevin can merge the series when
they are happy.

Stefan




Re: [PATCH v2 06/13] hw/arm/mcimx6ul-evk: Replace drive_get_next() by drive_get()

2021-11-29 Thread Peter Maydell
On Wed, 17 Nov 2021 at 16:34, Markus Armbruster  wrote:
>
> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If the
> order changes, or new calls appear "in the middle", unit numbers change.
> ABI break.  Hard to spot in review.
>
> Machine "mcimx6ul-evk" connects backends with drive_get_next() in a
> counting loop.  Change it to use drive_get() directly.  This makes the
> unit numbers explicit in the code.
>
> Cc: Peter Maydell 
> Cc: Jean-Christophe Dubois 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM
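
As a concrete illustration of the drive_get_next() → drive_get() change described in the commit message above (the loop and the num_sd_slots name are made up for the example; drive_get(), drive_get_next(), IF_SD and blk_by_legacy_dinfo() are the real APIs):

    int i;

    /* Before: implicit unit numbers -- each call binds bus=0, unit=<count of
     * prior IF_SD calls>, so reordering calls silently renumbers backends. */
    for (i = 0; i < num_sd_slots; i++) {
        DriveInfo *di = drive_get_next(IF_SD);
        BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
        /* ... attach blk to the board's i-th SD controller ... */
    }

    /* After: explicit unit numbers -- the bus/unit binding is visible in the
     * code and stable if the loop moves or new devices are added earlier. */
    for (i = 0; i < num_sd_slots; i++) {
        DriveInfo *di = drive_get(IF_SD, 0, i);
        BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
        /* ... attach blk to the board's i-th SD controller ... */
    }

This keeps the mapping between command-line drives and board devices explicit and stable; the same pattern applies to the other board conversions later in this series.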



Re: [PATCH v2 07/13] hw/arm/mcimx7d-sabre: Replace drive_get_next() by drive_get()

2021-11-29 Thread Peter Maydell
On Wed, 17 Nov 2021 at 16:34, Markus Armbruster  wrote:
>
> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If the
> order changes, or new calls appear "in the middle", unit numbers change.
> ABI break.  Hard to spot in review.
>
> Machine "mcimx7d-sabre" connects backends with drive_get_next() in a
> counting loop.  Change it to use drive_get() directly.  This makes the
> unit numbers explicit in the code.
>
> Cc: Peter Maydell 
> Cc: Andrey Smirnov 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v2 05/13] hw/arm/imx25_pdk: Replace drive_get_next() by drive_get()

2021-11-29 Thread Peter Maydell
On Wed, 17 Nov 2021 at 16:34, Markus Armbruster  wrote:
>
> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If the
> order changes, or new calls appear "in the middle", unit numbers change.
> ABI break.  Hard to spot in review.
>
> Machine "imx25-pdk" connects backends with drive_get_next() in a
> counting loop.  Change it to use drive_get() directly.  This makes the
> unit numbers explicit in the code.
>
> Cc: Peter Maydell 
> Cc: Jean-Christophe Dubois 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v2 04/13] hw/arm/versatilepb hw/arm/vexpress: Replace drive_get_next() by drive_get()

2021-11-29 Thread Peter Maydell
On Wed, 17 Nov 2021 at 16:34, Markus Armbruster  wrote:
>
> drive_get_next() is basically a bad idea.  It returns the "next" block
> backend of a certain interface type.  "Next" means bus=0,unit=N, where
> subsequent calls count N up from zero, per interface type.
>
> This lets you define unit numbers implicitly by execution order.  If the
> order changes, or new calls appear "in the middle", unit numbers change.
> ABI break.  Hard to spot in review.
>
> The versatile and vexpress machines ("versatileab", "versatilepb",
> "vexpress-a9", "vexpress-a15") connect just one or two backends of a
> type with drive_get_next().  Change them to use drive_get() directly.
> This makes the unit numbers explicit in the code.
>
> Cc: Peter Maydell 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM