Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Markus Armbruster
John Snow  writes:

> On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>> It's a simpler alternative for
>> "[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
>><20200902181831.2570048-1-ebl...@redhat.com>
>>https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
>>https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/
>> Since we have "coroutine: true" feature for qmp commands, I think,
>> maybe we can merge allocation status to bitmap without bothering with
>> new block-job?
>> It's an RFC:
>> 1. Main question: is it OK as a simple blocking command, even in a
>> coroutine mode. It's a lot simpler, and it can be simply used in a
>> transaction with other bitmap commands.
>> 
>
> Hm, possibly... I did not follow the discussion of coroutine QMP
> commands closely to know what the qualifying criteria to use them are.
>
> (Any wisdom for me here, Markus?)

>From Kevin's cover letter:

Some QMP command handlers can block the main loop for a relatively
long time, for example because they perform some I/O.  This is quite
nasty.  Allowing such handlers to run in a coroutine where they can
yield (and therefore release the BQL) while waiting for an event
such as I/O completion solves the problem.

Running in a coroutine is not a replacement for jobs.  Monitor commands
continue to run one after the other, even with multiple monitors.  All
this does is letting monitor commands yield.

Running in a coroutine is opt-in, because we're scared of command code
misbehaving in coroutine context[*].  To opt-in, add

'coroutine': true

to the command's QAPI schema.

Misbehaving command code should be rare.  The trouble is finding it.  If
we had a better handle on that, we could make running in a coroutine
opt-out.  Watch out for nested event loops.  Test thoroughly.

Questions?

[...]

[*] Discussed at some length in patch review:

Message-ID: <874kwnvgad@dusky.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg05015.html




Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler

2021-04-27 Thread David Gibson
On Tue, Apr 27, 2021 at 11:20:07AM +0200, Philippe Mathieu-Daudé wrote:
> On 4/27/21 3:45 AM, David Gibson wrote:
> > On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
> >> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
> >> automatically.  Register a reset handler to get reset with the
> >> machine.
> >>
> >> It doesn't seem to be an issue because it is that way since the
> >> device QDev'ifycation 8 years ago, in commit a83000f5e3f
> >> ("spapr-tce: make sPAPRTCETable a proper device").
> >> Still, correct to have a proper API usage.
> > 
> > So, the reason this works now is that we explicitly call
> > device_reset() on the TCE table from the TCE tables "owner", either a
> > PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).
> > 
> > I think we want either that, or the register_reset(), not both.
> 
> rtas_quiesce() seems to call a DeviceClass::reset() on the
> children of TYPE_SPAPR_VIO_BUS:
> 
> Abstract TYPE_VIO_SPAPR_DEVICE has the TYPE_SPAPR_VIO_BUS bus_type,
> and registers the spapr_vio_busdev_reset() handler, which calls
> spapr_vio_quiesce_one()...
> 
> So either we already have 2 resets, or the bus is never reset?

There are 2 resets, and this is intentional.  We reset once at machine
reset time, via the bus.  Once a booting OS is done with the firmware
it calls "quiesce" to put all the devices back into a safe state.  The
easiest way to do that is just to invoke their reset callbacks, so
that's what we do.

> The bus is created in spapr_machine_init():
> 
> /* Set up VIO bus */
> spapr->vio_bus = spapr_vio_bus_init();
> 
> TYPE_SPAPR_MACHINE class registers spapr_machine_reset(), which
> manually calls qemu_devices_reset() and spapr_drc_reset_all(),
> but I can't understand if a callee resets vio_bus...
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH v2] floppy: remove dead code related to formatting

2021-04-27 Thread Alexander Bulekov
fdctrl_format_sector was added in
baca51faff ("updated floppy driver: formatting code, disk geometry auto detect 
(Jocelyn Mayer)")

The single callsite is guarded by a check:
fdctrl->data_state & FD_STATE_FORMAT

However, the only place where the FD_STATE_FORMAT flag is set (in
fdctrl_handle_format_track) is closely followed by the same flag being
unset, with no possibility to call fdctrl_format_sector in between.

This removes fdctrl_format_sector, the unncessary setting/unsetting
of the FD_STATE_FORMAT flag, and the fdctrl_handle_format_track function
(which is just a stub).

Suggested-by: Hervé Poussineau 
Signed-off-by: Alexander Bulekov 
---

I ran through tests/qtest/fdc-test, and ran fdformat on a dummy disk -
nothing exploded, but since I don't use floppies very often, more eyes
definitely won't hurt. In particular, I'm not sure about the
fdctrl_handle_format_track delete - that function has side-effects on
both FDrive and FDCtrl, and it is certainly reachable. If deleting the
whole thing seems wrong, I'll roll-back that change, and we can just
remove the unreachable code..

 hw/block/fdc.c | 97 --
 1 file changed, 97 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acba..d851d23cc0 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -657,7 +657,6 @@ enum {
 
 enum {
 FD_STATE_MULTI  = 0x01,/* multi track flag */
-FD_STATE_FORMAT = 0x02,/* format flag */
 };
 
 enum {
@@ -826,7 +825,6 @@ enum {
 };
 
 #define FD_MULTI_TRACK(state) ((state) & FD_STATE_MULTI)
-#define FD_FORMAT_CMD(state) ((state) & FD_STATE_FORMAT)
 
 struct FDCtrl {
 MemoryRegion iomem;
@@ -1942,67 +1940,6 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
 return retval;
 }
 
-static void fdctrl_format_sector(FDCtrl *fdctrl)
-{
-FDrive *cur_drv;
-uint8_t kh, kt, ks;
-
-SET_CUR_DRV(fdctrl, fdctrl->fifo[1] & FD_DOR_SELMASK);
-cur_drv = get_cur_drv(fdctrl);
-kt = fdctrl->fifo[6];
-kh = fdctrl->fifo[7];
-ks = fdctrl->fifo[8];
-FLOPPY_DPRINTF("format sector at %d %d %02x %02x (%d)\n",
-   GET_CUR_DRV(fdctrl), kh, kt, ks,
-   fd_sector_calc(kh, kt, ks, cur_drv->last_sect,
-  NUM_SIDES(cur_drv)));
-switch (fd_seek(cur_drv, kh, kt, ks, fdctrl->config & FD_CONFIG_EIS)) {
-case 2:
-/* sect too big */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, 0x00, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 3:
-/* track too big */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, FD_SR1_EC, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 4:
-/* No seek enabled */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, 0x00, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 1:
-fdctrl->status0 |= FD_SR0_SEEK;
-break;
-default:
-break;
-}
-memset(fdctrl->fifo, 0, FD_SECTOR_LEN);
-if (cur_drv->blk == NULL ||
-blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo,
-   BDRV_SECTOR_SIZE, 0) < 0) {
-FLOPPY_DPRINTF("error formatting sector %d\n", fd_sector(cur_drv));
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM | FD_SR0_SEEK, 0x00, 0x00);
-} else {
-if (cur_drv->sect == cur_drv->last_sect) {
-fdctrl->data_state &= ~FD_STATE_FORMAT;
-/* Last sector done */
-fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
-} else {
-/* More to do */
-fdctrl->data_pos = 0;
-fdctrl->data_len = 4;
-}
-}
-}
-
 static void fdctrl_handle_lock(FDCtrl *fdctrl, int direction)
 {
 fdctrl->lock = (fdctrl->fifo[0] & 0x80) ? 1 : 0;
@@ -2110,34 +2047,6 @@ static void fdctrl_handle_readid(FDCtrl *fdctrl, int 
direction)
  (NANOSECONDS_PER_SECOND / 50));
 }
 
-static void fdctrl_handle_format_track(FDCtrl *fdctrl, int direction)
-{
-FDrive *cur_drv;
-
-SET_CUR_DRV(fdctrl, fdctrl->fifo[1] & FD_DOR_SELMASK);
-cur_drv = get_cur_drv(fdctrl);
-fdctrl->data_state |= FD_STATE_FORMAT;
-if (fdctrl->fifo[0] & 0x80)
-fdctrl->data_state |= FD_STATE_MULTI;
-else
-fdctrl->data_state &= ~FD_STATE_MULTI;
-cur_drv->bps =
-fdctrl->fifo[2] > 7 ? 16384 : 128 << fdctrl->fifo[2];
-#if 0
-cur_drv->last_sect =
-cur_drv->flags & FDISK_DBL_SIDES ? fdctrl->fifo[3] :
-fdctrl->fifo[3] / 2;
-#else
-cur_drv->last_sect = fdctrl->fifo[3];
-#endif
-/* TODO: implement format using DMA expected by the Bochs BIOS
- * and Linux fdformat (read 3 bytes per sector via DMA and fill
- * the sector with the specified fill byte
- */
-fdctrl->data_state &= ~FD_STATE_FORMAT;
-

Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

27.04.2021 21:24, John Snow wrote:

On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
   <20200902181831.2570048-1-ebl...@redhat.com>
   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
   https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.



Hm, possibly... I did not follow the discussion of coroutine QMP commands 
closely to know what the qualifying criteria to use them are.

(Any wisdom for me here, Markus?)


2. Transaction support is not here now. Will add in future version, if
general approach is OK.



That should be alright, I think. It means that the operation needs to succeed 
before the transaction returns success, though.

Depending on what else is in the transaction, do we run the risk of a deadlock 
if we need to wait for a coroutine to finish?


3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how much is it
correct.



See above concern!


4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to _active_ bitmap. So all concurrent
operations will just increase dirtyness of the bitmap and it is OK.



That sounds fine for individual usage, but I can't convince myself it's safe 
for transactions.


qmp_transaction do drain itself.. Still, it's a bit strange that it does just 
drain and not drained section around the whole logic.




5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.



I'm less sure that it's OK to cause temporary graph changes during the course 
of a blocking QMP function... but maybe that's OK?

Peter Krempa is the expert to consult on that one.


6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Probably patch 01 is better be split into 2-3 patches.

Vladimir Sementsov-Ogievskiy (2):
   qapi: block-dirty-bitmap-merge: support allocation maps
   iotests: add allocation-map-to-bitmap

  qapi/block-core.json  | 31 -
  include/block/block_int.h |  4 ++
  block/dirty-bitmap.c  | 42 
  block/monitor/bitmap-qmp-cmds.c   | 55 +---
  .../tests/allocation-map-to-bitmap    | 64 +++
  .../tests/allocation-map-to-bitmap.out    |  9 +++
  6 files changed, 195 insertions(+), 10 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
  create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out






--
Best regards,
Vladimir



Re: [PATCH v3 14/33] nbd: move connection code from block/nbd to nbd/client-connection

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:52AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We now have bs-independent connection API, which consists of four
> functions:
> 
>   nbd_client_connection_new()
>   nbd_client_connection_unref()
>   nbd_co_establish_connection()
>   nbd_co_establish_connection_cancel()
> 
> Move them to a separate file together with NBDClientConnection
> structure which becomes private to the new API.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/nbd.h |  11 +++
>  block/nbd.c | 187 ---
>  nbd/client-connection.c | 212 
>  nbd/meson.build |   1 +
>  4 files changed, 224 insertions(+), 187 deletions(-)
>  create mode 100644 nbd/client-connection.c
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 5f34d23bb0..57381be76f 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -406,4 +406,15 @@ const char *nbd_info_lookup(uint16_t info);
>  const char *nbd_cmd_lookup(uint16_t info);
>  const char *nbd_err_lookup(int err);
>  
> +/* nbd/client-connection.c */
> +typedef struct NBDClientConnection NBDClientConnection;
> +
> +NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr);
> +void nbd_client_connection_release(NBDClientConnection *conn);
> +
> +QIOChannelSocket *coroutine_fn
> +nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
> +
> +void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection 
> *conn);
> +
>  #endif
> diff --git a/block/nbd.c b/block/nbd.c
> index 8531d019b2..9bd68dcf10 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -66,24 +66,6 @@ typedef enum NBDClientState {
>  NBD_CLIENT_QUIT
>  } NBDClientState;
>  
> -typedef struct NBDClientConnection {
> -/* Initialization constants */
> -SocketAddress *saddr; /* address to connect to */
> -
> -/*
> - * Result of last attempt. Valid in FAIL and SUCCESS states.
> - * If you want to steal error, don't forget to set pointer to NULL.
> - */
> -QIOChannelSocket *sioc;
> -Error *err;
> -
> -QemuMutex mutex;
> -/* All further fields are protected by mutex */
> -bool running; /* thread is running now */
> -bool detached; /* thread is detached and should cleanup the state */
> -Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
> -} NBDClientConnection;
> -
>  typedef struct BDRVNBDState {
>  QIOChannelSocket *sioc; /* The master data channel */
>  QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
> @@ -118,12 +100,8 @@ typedef struct BDRVNBDState {
>  NBDClientConnection *conn;
>  } BDRVNBDState;
>  
> -static void nbd_client_connection_release(NBDClientConnection *conn);
>  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress 
> *saddr,
>  Error **errp);
> -static coroutine_fn QIOChannelSocket *
> -nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
> -static void nbd_co_establish_connection_cancel(NBDClientConnection *conn);
>  static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
>  static void nbd_yank(void *opaque);
>  
> @@ -340,171 +318,6 @@ static bool nbd_client_connecting_wait(BDRVNBDState *s)
>  return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
>  }
>  
> -static NBDClientConnection *
> -nbd_client_connection_new(const SocketAddress *saddr)
> -{
> -NBDClientConnection *conn = g_new(NBDClientConnection, 1);
> -
> -*conn = (NBDClientConnection) {
> -.saddr = QAPI_CLONE(SocketAddress, saddr),
> -};
> -
> -qemu_mutex_init(>mutex);
> -
> -return conn;
> -}
> -
> -static void nbd_client_connection_do_free(NBDClientConnection *conn)
> -{
> -if (conn->sioc) {
> -qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
> -object_unref(OBJECT(conn->sioc));
> -}
> -error_free(conn->err);
> -qapi_free_SocketAddress(conn->saddr);
> -g_free(conn);
> -}
> -
> -static void *connect_thread_func(void *opaque)
> -{
> -NBDClientConnection *conn = opaque;
> -bool do_free;
> -int ret;
> -
> -conn->sioc = qio_channel_socket_new();
> -
> -error_free(conn->err);
> -conn->err = NULL;
> -ret = qio_channel_socket_connect_sync(conn->sioc, conn->saddr, 
> >err);
> -if (ret < 0) {
> -object_unref(OBJECT(conn->sioc));
> -conn->sioc = NULL;
> -}
> -
> -qemu_mutex_lock(>mutex);
> -
> -assert(conn->running);
> -conn->running = false;
> -if (conn->wait_co) {
> -aio_co_schedule(NULL, conn->wait_co);
> -conn->wait_co = NULL;
> -}
> -do_free = conn->detached;
> -
> -qemu_mutex_unlock(>mutex);
> -
> -if (do_free) {
> -nbd_client_connection_do_free(conn);
> -}
> -
> -return NULL;
> -}
> -
> -static void nbd_client_connection_release(NBDClientConnection *conn)
> -{
> -

Re: [PATCH v3 13/33] block/nbd: introduce nbd_client_connection_release()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:51AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 43 ++-
>  1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 21a4039359..8531d019b2 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -118,7 +118,7 @@ typedef struct BDRVNBDState {
>  NBDClientConnection *conn;
>  } BDRVNBDState;
>  
> -static void nbd_free_connect_thread(NBDClientConnection *conn);
> +static void nbd_client_connection_release(NBDClientConnection *conn);
>  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress 
> *saddr,
>  Error **errp);
>  static coroutine_fn QIOChannelSocket *
> @@ -130,20 +130,9 @@ static void nbd_yank(void *opaque);
>  static void nbd_clear_bdrvstate(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> -NBDClientConnection *conn = s->conn;
> -bool do_free;
> -
> -qemu_mutex_lock(>mutex);
> -if (conn->running) {
> -conn->detached = true;
> -}
> -do_free = !conn->running && !conn->detached;
> -qemu_mutex_unlock(>mutex);
>  
> -/* the runaway thread will clean it up itself */
> -if (do_free) {
> -nbd_free_connect_thread(conn);
> -}
> +nbd_client_connection_release(s->conn);
> +s->conn = NULL;
>  
>  yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name));
>  
> @@ -365,7 +354,7 @@ nbd_client_connection_new(const SocketAddress *saddr)
>  return conn;
>  }
>  
> -static void nbd_free_connect_thread(NBDClientConnection *conn)
> +static void nbd_client_connection_do_free(NBDClientConnection *conn)
>  {
>  if (conn->sioc) {
>  qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
> @@ -379,8 +368,8 @@ static void nbd_free_connect_thread(NBDClientConnection 
> *conn)
>  static void *connect_thread_func(void *opaque)
>  {
>  NBDClientConnection *conn = opaque;
> +bool do_free;
>  int ret;
> -bool do_free = false;
>  

This hunk belongs in patch 8.

Roman.

>  conn->sioc = qio_channel_socket_new();
>  
> @@ -405,12 +394,32 @@ static void *connect_thread_func(void *opaque)
>  qemu_mutex_unlock(>mutex);
>  
>  if (do_free) {
> -nbd_free_connect_thread(conn);
> +nbd_client_connection_do_free(conn);
>  }
>  
>  return NULL;
>  }
>  
> +static void nbd_client_connection_release(NBDClientConnection *conn)
> +{
> +bool do_free;
> +
> +if (!conn) {
> +return;
> +}
> +
> +qemu_mutex_lock(>mutex);
> +if (conn->running) {
> +conn->detached = true;
> +}
> +do_free = !conn->running && !conn->detached;
> +qemu_mutex_unlock(>mutex);
> +
> +if (do_free) {
> +nbd_client_connection_do_free(conn);
> +}
> +}
> +
>  /*
>   * Get a new connection in context of @conn:
>   *   if thread is running, wait for completion
> -- 
> 2.29.2
> 



Re: [PATCH v3 11/33] block/nbd: rename NBDConnectThread to NBDClientConnection

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:49AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are going to move connection code to own file and want clear names
> and APIs.
> 
> The structure is shared between user and (possibly) several runs of
> connect-thread. So it's wrong to call it "thread". Let's rename to
> something more generic.
> 
> Appropriately rename connect_thread and thr variables to conn.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 137 ++--
>  1 file changed, 68 insertions(+), 69 deletions(-)

Reviewed-by: Roman Kagan 



Re: [PATCH v3 08/33] block/nbd: drop thr->state

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:46AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We don't need all these states. The code refactored to use two boolean
> variables looks simpler.

Indeed.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 125 ++--
>  1 file changed, 34 insertions(+), 91 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index e1f39eda6c..2b26a033a4 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -66,24 +66,6 @@ typedef enum NBDClientState {
>  NBD_CLIENT_QUIT
>  } NBDClientState;
>  
> -typedef enum NBDConnectThreadState {
> -/* No thread, no pending results */
> -CONNECT_THREAD_NONE,
> -
> -/* Thread is running, no results for now */
> -CONNECT_THREAD_RUNNING,
> -
> -/*
> - * Thread is running, but requestor exited. Thread should close
> - * the new socket and free the connect state on exit.
> - */
> -CONNECT_THREAD_RUNNING_DETACHED,
> -
> -/* Thread finished, results are stored in a state */
> -CONNECT_THREAD_FAIL,
> -CONNECT_THREAD_SUCCESS
> -} NBDConnectThreadState;
> -
>  typedef struct NBDConnectThread {
>  /* Initialization constants */
>  SocketAddress *saddr; /* address to connect to */
> @@ -97,7 +79,8 @@ typedef struct NBDConnectThread {
>  
>  QemuMutex mutex;
>  /* All further fields are protected by mutex */
> -NBDConnectThreadState state; /* current state of the thread */
> +bool running; /* thread is running now */
> +bool detached; /* thread is detached and should cleanup the state */
>  Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
>  } NBDConnectThread;
>  
> @@ -147,17 +130,17 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>  NBDConnectThread *thr = s->connect_thread;
> -bool thr_running;
> +bool do_free;
>  
>  qemu_mutex_lock(>mutex);
> -thr_running = thr->state == CONNECT_THREAD_RUNNING;
> -if (thr_running) {
> -thr->state = CONNECT_THREAD_RUNNING_DETACHED;
> +if (thr->running) {
> +thr->detached = true;
>  }
> +do_free = !thr->running && !thr->detached;

This is redundant.  You can unconditionally set ->detached and only
depend on ->running for the rest of this function.

>  qemu_mutex_unlock(>mutex);
>  
>  /* the runaway thread will clean it up itself */
> -if (!thr_running) {
> +if (do_free) {
>  nbd_free_connect_thread(thr);
>  }
>  
> @@ -373,7 +356,6 @@ static void nbd_init_connect_thread(BDRVNBDState *s)
>  
>  *s->connect_thread = (NBDConnectThread) {
>  .saddr = QAPI_CLONE(SocketAddress, s->saddr),
> -.state = CONNECT_THREAD_NONE,
>  };
>  
>  qemu_mutex_init(>connect_thread->mutex);
> @@ -408,20 +390,13 @@ static void *connect_thread_func(void *opaque)
>  
>  qemu_mutex_lock(>mutex);
>  
> -switch (thr->state) {
> -case CONNECT_THREAD_RUNNING:
> -thr->state = ret < 0 ? CONNECT_THREAD_FAIL : CONNECT_THREAD_SUCCESS;
> -if (thr->wait_co) {
> -aio_co_schedule(NULL, thr->wait_co);
> -thr->wait_co = NULL;
> -}
> -break;
> -case CONNECT_THREAD_RUNNING_DETACHED:
> -do_free = true;
> -break;
> -default:
> -abort();
> +assert(thr->running);
> +thr->running = false;
> +if (thr->wait_co) {
> +aio_co_schedule(NULL, thr->wait_co);
> +thr->wait_co = NULL;
>  }
> +do_free = thr->detached;
>  
>  qemu_mutex_unlock(>mutex);
>  
> @@ -435,36 +410,24 @@ static void *connect_thread_func(void *opaque)
>  static int coroutine_fn
>  nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
>  {
> -int ret;
>  QemuThread thread;
>  BDRVNBDState *s = bs->opaque;
>  NBDConnectThread *thr = s->connect_thread;
>  
> +assert(!s->sioc);
> +
>  qemu_mutex_lock(>mutex);
>  
> -switch (thr->state) {
> -case CONNECT_THREAD_FAIL:
> -case CONNECT_THREAD_NONE:
> +if (!thr->running) {
> +if (thr->sioc) {
> +/* Previous attempt finally succeeded in background */
> +goto out;
> +}
> +thr->running = true;
>  error_free(thr->err);
>  thr->err = NULL;
> -thr->state = CONNECT_THREAD_RUNNING;
>  qemu_thread_create(, "nbd-connect",
> connect_thread_func, thr, QEMU_THREAD_DETACHED);
> -break;
> -case CONNECT_THREAD_SUCCESS:
> -/* Previous attempt finally succeeded in background */
> -thr->state = CONNECT_THREAD_NONE;
> -s->sioc = thr->sioc;
> -thr->sioc = NULL;
> -yank_register_function(BLOCKDEV_YANK_INSTANCE(bs->node_name),
> -   nbd_yank, bs);
> -qemu_mutex_unlock(>mutex);
> -return 0;
> -case CONNECT_THREAD_RUNNING:
> -/* Already running, will 

Re: [PATCH v3 07/33] block/nbd: simplify waking of nbd_co_establish_connection()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:45AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Instead of connect_bh, bh_ctx and wait_connect fields we can live with
> only one link to waiting coroutine, protected by mutex.
> 
> So new logic is:
> 
> nbd_co_establish_connection() sets wait_co under mutex, release the
> mutex and do yield(). Note, that wait_co may be scheduled by thread
> immediately after unlocking the mutex. Still, in main thread (or
> iothread) we'll not reach the code for entering the coroutine until the
> yield() so we are safe.
> 
> Both connect_thread_func() and nbd_co_establish_connection_cancel() do
> the following to handle wait_co:
> 
> Under mutex, if thr->wait_co is not NULL, call aio_co_wake() (which
> never tries to acquire aio context since previous commit, so we are
> safe to do it under thr->mutex) and set thr->wait_co to NULL.
> This way we protect ourselves of scheduling it twice.
> 
> Also this commit make nbd_co_establish_connection() less connected to
> bs (we have generic pointer to the coroutine, not use s->connection_co
> directly). So, we are on the way of splitting connection API out of
> nbd.c (which is overcomplicated now).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 49 +
>  1 file changed, 9 insertions(+), 40 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index d67556c7ee..e1f39eda6c 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -87,12 +87,6 @@ typedef enum NBDConnectThreadState {
>  typedef struct NBDConnectThread {
>  /* Initialization constants */
>  SocketAddress *saddr; /* address to connect to */
> -/*
> - * Bottom half to schedule on completion. Scheduled only if bh_ctx is not
> - * NULL
> - */
> -QEMUBHFunc *bh_func;
> -void *bh_opaque;
>  
>  /*
>   * Result of last attempt. Valid in FAIL and SUCCESS states.
> @@ -101,10 +95,10 @@ typedef struct NBDConnectThread {
>  QIOChannelSocket *sioc;
>  Error *err;
>  
> -/* state and bh_ctx are protected by mutex */
>  QemuMutex mutex;
> +/* All further fields are protected by mutex */
>  NBDConnectThreadState state; /* current state of the thread */
> -AioContext *bh_ctx; /* where to schedule bh (NULL means don't schedule) 
> */
> +Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
>  } NBDConnectThread;
>  
>  typedef struct BDRVNBDState {
> @@ -138,7 +132,6 @@ typedef struct BDRVNBDState {
>  char *x_dirty_bitmap;
>  bool alloc_depth;
>  
> -bool wait_connect;
>  NBDConnectThread *connect_thread;
>  } BDRVNBDState;
>  
> @@ -374,15 +367,6 @@ static bool nbd_client_connecting_wait(BDRVNBDState *s)
>  return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
>  }
>  
> -static void connect_bh(void *opaque)
> -{
> -BDRVNBDState *state = opaque;
> -
> -assert(state->wait_connect);
> -state->wait_connect = false;
> -aio_co_wake(state->connection_co);
> -}
> -
>  static void nbd_init_connect_thread(BDRVNBDState *s)
>  {
>  s->connect_thread = g_new(NBDConnectThread, 1);
> @@ -390,8 +374,6 @@ static void nbd_init_connect_thread(BDRVNBDState *s)
>  *s->connect_thread = (NBDConnectThread) {
>  .saddr = QAPI_CLONE(SocketAddress, s->saddr),
>  .state = CONNECT_THREAD_NONE,
> -.bh_func = connect_bh,
> -.bh_opaque = s,
>  };
>  
>  qemu_mutex_init(>connect_thread->mutex);
> @@ -429,11 +411,9 @@ static void *connect_thread_func(void *opaque)
>  switch (thr->state) {
>  case CONNECT_THREAD_RUNNING:
>  thr->state = ret < 0 ? CONNECT_THREAD_FAIL : CONNECT_THREAD_SUCCESS;
> -if (thr->bh_ctx) {
> -aio_bh_schedule_oneshot(thr->bh_ctx, thr->bh_func, 
> thr->bh_opaque);
> -
> -/* play safe, don't reuse bh_ctx on further connection attempts 
> */
> -thr->bh_ctx = NULL;
> +if (thr->wait_co) {
> +aio_co_schedule(NULL, thr->wait_co);
> +thr->wait_co = NULL;
>  }
>  break;
>  case CONNECT_THREAD_RUNNING_DETACHED:
> @@ -487,20 +467,14 @@ nbd_co_establish_connection(BlockDriverState *bs, Error 
> **errp)
>  abort();
>  }
>  
> -thr->bh_ctx = qemu_get_current_aio_context();
> +thr->wait_co = qemu_coroutine_self();
>  
>  qemu_mutex_unlock(>mutex);
>  
> -
>  /*
>   * We are going to wait for connect-thread finish, but
>   * nbd_client_co_drain_begin() can interrupt.
> - *
> - * Note that wait_connect variable is not visible for connect-thread. It
> - * doesn't need mutex protection, it used only inside home aio context of
> - * bs.
>   */
> -s->wait_connect = true;
>  qemu_coroutine_yield();
>  
>  qemu_mutex_lock(>mutex);
> @@ -555,24 +529,19 @@ static void 
> nbd_co_establish_connection_cancel(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = bs->opaque;
>  NBDConnectThread *thr = 

Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread John Snow

On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
   <20200902181831.2570048-1-ebl...@redhat.com>
   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
   https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.



Hm, possibly... I did not follow the discussion of coroutine QMP 
commands closely to know what the qualifying criteria to use them are.


(Any wisdom for me here, Markus?)


2. Transaction support is not here now. Will add in future version, if
general approach is OK.



That should be alright, I think. It means that the operation needs to 
succeed before the transaction returns success, though.


Depending on what else is in the transaction, do we run the risk of a 
deadlock if we need to wait for a coroutine to finish?



3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how much is it
correct.



See above concern!


4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to _active_ bitmap. So all concurrent
operations will just increase dirtyness of the bitmap and it is OK.



That sounds fine for individual usage, but I can't convince myself it's 
safe for transactions.



5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.



I'm less sure that it's OK to cause temporary graph changes during the 
course of a blocking QMP function... but maybe that's OK?


Peter Krempa is the expert to consult on that one.


6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Probably patch 01 is better be split into 2-3 patches.

Vladimir Sementsov-Ogievskiy (2):
   qapi: block-dirty-bitmap-merge: support allocation maps
   iotests: add allocation-map-to-bitmap

  qapi/block-core.json  | 31 -
  include/block/block_int.h |  4 ++
  block/dirty-bitmap.c  | 42 
  block/monitor/bitmap-qmp-cmds.c   | 55 +---
  .../tests/allocation-map-to-bitmap| 64 +++
  .../tests/allocation-map-to-bitmap.out|  9 +++
  6 files changed, 195 insertions(+), 10 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
  create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out






Re: [PATCH] floppy: remove unused function fdctrl_format_sector

2021-04-27 Thread John Snow

On 3/14/21 3:53 AM, Hervé Poussineau wrote:

Le 12/03/2021 à 07:45, John Snow a écrit :

On 1/8/21 6:01 PM, Alexander Bulekov wrote:

fdctrl_format_sector was added in
baca51faff ("updated floppy driver: formatting code, disk geometry 
auto detect (Jocelyn Mayer)")


The single callsite is guarded by a check:
fdctrl->data_state & FD_STATE_FORMAT

However, the only place where the FD_STATE_FORMAT flag is set (in
fdctrl_handle_format_track) is closely followed by the same flag being
unset, with no possibility to call fdctrl_format_sector in between.



Hm, was this code *ever* used? It's hard to tell when we go back into 
the old SVN history.


Does this mean that fdctrl_handle_format_track is also basically an 
incomplete stub method?


I'm in favor of deleting bitrotted code, but I wonder if we should 
take a bigger bite.


--js


The fdctrl_format_sector has been added in SVN revision 671 
(baca51faff03df59386c95d9478ede18b5be5ec8), along with 
FD_STATE_FORMAT/FD_FORMAT_CMD.
As with current code, the only place where the FD_STATE_FORMAT flag was 
set (in fdctrl_handle_format_track) is closely followed by the same flag 
being unset, with no possibility to call fdctrl_format_sector in between.


I can however see the following comment:
    /* Bochs BIOS is buggy and don't send format informations
     * for each sector. So, pretend all's done right now...
     */
    fdctrl->data_state &= ~FD_STATE_FORMAT;

which was changed in SVN revision 2295 
(b92090309e5ff7154e4c131438ee2d540e233955) to:

    /* TODO: implement format using DMA expected by the Bochs BIOS
     * and Linux fdformat (read 3 bytes per sector via DMA and fill
     * the sector with the specified fill byte
     */

This probably means that code may have worked without DMA (to be 
confirmed), but was disabled since its introduction due to a problem 
with Bochs BIOS.

Later, fdformat was also tested and not working.

Since then, lots of work has also been done in DMA handling. I 
especially think at bb8f32c0318cb6c6e13e09ec0f35e21eff246413, which 
fixed a similar problem with floppy drives on IBM 40p machine.

What happens when this flag unsetting is removed? Does fdformat now works?

I think that we should either fix the code, or remove more code 
(everything related to fdctrl_format_sector, FD_STATE_FORMAT, 
FD_FORMAT_CMD, maybe even fdctrl_handle_format_track).


Regards,

Hervé



Alex, do you want to respin this following Hervé's suggestion for 
additional deletions?


I doubt anyone has the time or interest to actually FIX this code, so we 
may as well remove misleading code.


--js




[PATCH v2] fdc: fix floppy boot for Red Hat Linux 5.2

2021-04-27 Thread John Snow
The image size indicates it's an 81 track floppy disk image, which we
don't have a listing for in the geometry table. When you force the drive
type to 1.44MB, it guesses the reasonably close 18/80. When the drive
type is allowed to auto-detect or set to 2.88, it guesses a very
incorrect geometry.

auto, 144 and 288 drive types get the right geometry with the new entry
in the table.

Reported-by: Michael Tokarev 
Signed-off-by: John Snow 
Reviewed-by: Thomas Huth 

---

V2: I didn't actually stage this, so this is just a re-send to get a
fresh Message-ID to reference in the PR. Added Thomas's R-B.

 hw/block/fdc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acbae..0f0c716d878 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -122,6 +122,7 @@ static const FDFormat fd_formats[] = {
 /* First entry is default format */
 /* 1.44 MB 3"1/2 floppy disks */
 { FLOPPY_DRIVE_TYPE_144, 18, 80, 1, FDRIVE_RATE_500K, }, /* 3.5" 2880 */
+{ FLOPPY_DRIVE_TYPE_144, 18, 81, 1, FDRIVE_RATE_500K, },
 { FLOPPY_DRIVE_TYPE_144, 20, 80, 1, FDRIVE_RATE_500K, }, /* 3.5" 3200 */
 { FLOPPY_DRIVE_TYPE_144, 21, 80, 1, FDRIVE_RATE_500K, },
 { FLOPPY_DRIVE_TYPE_144, 21, 82, 1, FDRIVE_RATE_500K, },
-- 
2.30.2




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread John Snow

On 4/27/21 1:54 PM, Philippe Mathieu-Daudé wrote:

On 4/27/21 7:16 PM, John Snow wrote:

On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:

I suggest fixing this at the qdev level. Make piix3-ide have a
sub-device that inherits from ISA_DEVICE so it can only be instantiated
when there's an ISA bus.

Stefan


My qdev knowledge is shaky. Does this imply that you agree with the
direction of Thomas's patch, or do you just mean to disagree with Phil
on his preferred course of action?


My understanding is a disagreement to both, with a 3rd direction :)

I agree with Stefan direction but I'm not sure (yet) that a sub-device
is the best (long-term) solution. I guess there is a design issue with
this device, and would like to understanding it first.

IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
only allow a single parent. Multiple QOM inheritance is resolved as
interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
So he suggests to embed an IDE device within the PCI piix3-ide device.

My view is the PIIX is a chipset that share stuffs between components,
and the IDE bus belongs to the chipset PCI root (or eventually the
PCI-ISA bridge, function #0). The IDE function would use the IDE bus
from its root parent as a linked property.
My problem is currently this device is user-creatable as a Frankenstein
single PCI function, out of its chipset. I'm not sure yet this is a
dead end or I could work something out.

Regards,

Phil.



It sounds complicated. In the meantime, I think I am favor of taking 
Thomas's patch because it merely adds some error routing to allow us to 
avoid a crash. The core organizational issues of the IDE device(s) will 
remain and can be solved later as needed.


Do you agree?

--js




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread John Snow

On 4/16/21 8:52 AM, Thomas Huth wrote:

QEMU currently crashes when the user tries to do something like:

  qemu-system-x86_64 -M x-remote -device piix3-ide

This happens because the "isabus" variable is not initialized with
the x-remote machine yet. Add a proper check for this condition
and propagate the error to the caller, so we can fail there gracefully.

Signed-off-by: Thomas Huth 


Seems OK to me for now. I will trust that this won't get in the way of 
any deeper refactors Phil has planned, since this just adds error 
pathways to avoid something already broken, and doesn't change anything 
else.


I'm OK with that.

Reviewed-by: John Snow 

Tentatively staged, pending further discussion.


---
  hw/ide/ioport.c   | 16 ++--
  hw/ide/piix.c | 22 +-
  hw/isa/isa-bus.c  | 14 ++
  include/hw/ide/internal.h |  2 +-
  include/hw/isa/isa.h  | 13 -
  5 files changed, 46 insertions(+), 21 deletions(-)





diff --git a/hw/ide/ioport.c b/hw/ide/ioport.c
index b613ff3bba..e6caa537fa 100644
--- a/hw/ide/ioport.c
+++ b/hw/ide/ioport.c
@@ -50,15 +50,19 @@ static const MemoryRegionPortio ide_portio2_list[] = {
  PORTIO_END_OF_LIST(),
  };
  
-void ide_init_ioport(IDEBus *bus, ISADevice *dev, int iobase, int iobase2)

+int ide_init_ioport(IDEBus *bus, ISADevice *dev, int iobase, int iobase2)
  {
+int ret;
+
  /* ??? Assume only ISA and PCI configurations, and that the PCI-ISA
 bridge has been setup properly to always register with ISA.  */
-isa_register_portio_list(dev, >portio_list,
- iobase, ide_portio_list, bus, "ide");
+ret = isa_register_portio_list(dev, >portio_list,
+   iobase, ide_portio_list, bus, "ide");
  
-if (iobase2) {

-isa_register_portio_list(dev, >portio2_list,
- iobase2, ide_portio2_list, bus, "ide");
+if (ret == 0 && iobase2) {
+ret = isa_register_portio_list(dev, >portio2_list,
+   iobase2, ide_portio2_list, bus, "ide");
  }
+
+return ret;
  }



Little funky instead of just checking and returning after the first 
portio list registration, you could leave a little more git blame intact 
by not doing this, but...


...Then again, I just acked a ton of stuff Phil moved around, so, 
whatever O:-)





Re: [PATCH 2/4] hw/block/fdc: Declare shared prototypes in fdc-internal.h

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/27/21 6:53 PM, John Snow wrote:
> On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:
>> We want to extract ISA/SysBus code from the generic fdc.c file.
>> First, declare the prototypes we will access from the new units
>> into a new local header: "fdc-internal.h".
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>   hw/block/fdc-internal.h | 155 
>>   hw/block/fdc.c  | 126 +++-
>>   MAINTAINERS |   1 +
>>   3 files changed, 164 insertions(+), 118 deletions(-)
>>   create mode 100644 hw/block/fdc-internal.h
>>
> 
> With our policy of not including osdep.h in headers, it's hard to verify
> that this header is otherwise self-sufficient.
> 
> 
> I think the only thing it needs (not in osdep.h) happens to be MAX_FD. I
> added osdep.h just to test:
> 
> jsnow@scv ~/s/q/h/block (review)> gcc -I../../include/ -I../../bin/git
> -I/usr/lib64/glib-2.0/include -I/usr/include/glib-2.0 -c -o
> test_header.bin fdc-internal.h
> fdc-internal.h:134:19: error: ‘MAX_FD’ undeclared here (not in a function)
>   134 | FDrive drives[MAX_FD];
>   |   ^~
> 
> 
> Should we include the fdc header from the internal one?

Yes, good catch, will do.




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/27/21 7:16 PM, John Snow wrote:
> On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
>> I suggest fixing this at the qdev level. Make piix3-ide have a
>> sub-device that inherits from ISA_DEVICE so it can only be instantiated
>> when there's an ISA bus.
>>
>> Stefan
> 
> My qdev knowledge is shaky. Does this imply that you agree with the
> direction of Thomas's patch, or do you just mean to disagree with Phil
> on his preferred course of action?

My understanding is a disagreement to both, with a 3rd direction :)

I agree with Stefan direction but I'm not sure (yet) that a sub-device
is the best (long-term) solution. I guess there is a design issue with
this device, and would like to understanding it first.

IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
only allow a single parent. Multiple QOM inheritance is resolved as
interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
So he suggests to embed an IDE device within the PCI piix3-ide device.

My view is the PIIX is a chipset that share stuffs between components,
and the IDE bus belongs to the chipset PCI root (or eventually the
PCI-ISA bridge, function #0). The IDE function would use the IDE bus
from its root parent as a linked property.
My problem is currently this device is user-creatable as a Frankenstein
single PCI function, out of its chipset. I'm not sure yet this is a
dead end or I could work something out.

Regards,

Phil.




Re: [PATCH 4/4] hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the SysBus code.
Extract the SysBus specific code to a new unit: fdc-sysbus.c,
and add a new Kconfig symbol: "FDC_SYSBUS".

Signed-off-by: Philippe Mathieu-Daudé 


LGTM, but again you'll want someone to review the Kconfig changes. It 
looks reasonable to me at a glance, but I just don't know what I don't 
know there.


I'm trusting you somewhat that you've audited the proper dependencies 
for each subsystem and that these have been accurately described via 
Kconfig -- my knowledge of non-x86 platforms is a bit meager, so I am 
relying on CI to tell me if this breaks anything I care about.


Would love to get an ACK from Mark Cave-Ayland and Hervé Poussineau if 
possible, but if they're not available to take a quick peek, we'll try 
to get this in closer to the beginning of the dev window to maximize 
problem-finding time.


Reviewed-by: John Snow 


---
  hw/block/fdc-sysbus.c | 252 ++
  hw/block/fdc.c| 220 
  MAINTAINERS   |   1 +
  hw/block/Kconfig  |   4 +
  hw/block/meson.build  |   1 +
  hw/block/trace-events |   2 +
  hw/mips/Kconfig   |   2 +-
  hw/sparc/Kconfig  |   2 +-
  8 files changed, 262 insertions(+), 222 deletions(-)
  create mode 100644 hw/block/fdc-sysbus.c

diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
new file mode 100644
index 000..71755fd6ae4
--- /dev/null
+++ b/hw/block/fdc-sysbus.c
@@ -0,0 +1,252 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qom/object.h"
+#include "hw/sysbus.h"
+#include "hw/block/fdc.h"
+#include "migration/vmstate.h"
+#include "fdc-internal.h"
+#include "trace.h"
+
+#define TYPE_SYSBUS_FDC "base-sysbus-fdc"
+typedef struct FDCtrlSysBusClass FDCtrlSysBusClass;
+typedef struct FDCtrlSysBus FDCtrlSysBus;
+DECLARE_OBJ_CHECKERS(FDCtrlSysBus, FDCtrlSysBusClass,
+ SYSBUS_FDC, TYPE_SYSBUS_FDC)
+
+struct FDCtrlSysBusClass {
+/*< private >*/
+SysBusDeviceClass parent_class;
+/*< public >*/
+
+bool use_strict_io;
+};
+
+struct FDCtrlSysBus {
+/*< private >*/
+SysBusDevice parent_obj;
+/*< public >*/
+
+struct FDCtrl state;
+};
+
+static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
+{
+return fdctrl_read(opaque, (uint32_t)reg);
+}
+
+static void fdctrl_write_mem(void *opaque, hwaddr reg,
+ uint64_t value, unsigned size)
+{
+fdctrl_write(opaque, (uint32_t)reg, value);
+}
+
+static const MemoryRegionOps fdctrl_mem_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static const MemoryRegionOps fdctrl_mem_strict_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
+static void fdctrl_external_reset_sysbus(DeviceState *d)
+{
+FDCtrlSysBus *sys = SYSBUS_FDC(d);
+FDCtrl *s = >state;
+
+fdctrl_reset(s, 0);
+}
+
+static void fdctrl_handle_tc(void *opaque, int irq, int level)
+{
+trace_fdctrl_tc_pulse(level);
+}
+
+void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
+hwaddr mmio_base, DriveInfo **fds)
+{
+FDCtrl *fdctrl;
+DeviceState *dev;
+SysBusDevice *sbd;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sysbus-fdc");
+sys = SYSBUS_FDC(dev);
+fdctrl = >state;
+fdctrl->dma_chann = dma_chann; /* FIXME */
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, _fatal);
+

Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Hi,

The floppy disc controllers pulls in irrelevant devices (sysbus in
an ISA-only machine, ISA bus + isa devices on a sysbus-only machine).

This series clean that by extracting each device in its own file,
adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.

Regards,

Phil.



Lightly reviewed and I'm fine with it overall; but want a quick ~5min 
up-or-down by Hervé and/or Mark (To make sure this doesn't break any 
non-x86 system they may care about), and a quick nod from Paolo for 
KConfig changes would be nice.


I'll be waiting on a reply to my question on patch 01 before staging.

--js


Philippe Mathieu-Daudé (4):
   hw/block/fdc: Replace disabled fprintf() by trace event
   hw/block/fdc: Declare shared prototypes in fdc-internal.h
   hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
   hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

  hw/block/fdc-internal.h | 155 ++
  hw/block/fdc-isa.c  | 313 +
  hw/block/fdc-sysbus.c   | 252 +
  hw/block/fdc.c  | 608 +---
  MAINTAINERS |   3 +
  hw/block/Kconfig|   8 +
  hw/block/meson.build|   2 +
  hw/block/trace-events   |   3 +
  hw/i386/Kconfig |   2 +-
  hw/isa/Kconfig  |   6 +-
  hw/mips/Kconfig |   2 +-
  hw/sparc/Kconfig|   2 +-
  hw/sparc64/Kconfig  |   2 +-
  13 files changed, 751 insertions(+), 607 deletions(-)
  create mode 100644 hw/block/fdc-internal.h
  create mode 100644 hw/block/fdc-isa.c
  create mode 100644 hw/block/fdc-sysbus.c






Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread John Snow

On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:

I suggest fixing this at the qdev level. Make piix3-ide have a
sub-device that inherits from ISA_DEVICE so it can only be instantiated
when there's an ISA bus.

Stefan


My qdev knowledge is shaky. Does this imply that you agree with the 
direction of Thomas's patch, or do you just mean to disagree with Phil 
on his preferred course of action?


--js




Re: [PATCH 3/4] hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the ISA code.
Extract the ISA specific code to a new unit: fdc-isa.c, and
add a new Kconfig symbol: "FDC_ISA".

Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/block/fdc-isa.c   | 313 +++
  hw/block/fdc.c   | 257 ---
  MAINTAINERS  |   1 +
  hw/block/Kconfig |   4 +
  hw/block/meson.build |   1 +
  hw/i386/Kconfig  |   2 +-
  hw/isa/Kconfig   |   6 +-
  hw/sparc64/Kconfig   |   2 +-
  8 files changed, 324 insertions(+), 262 deletions(-)
  create mode 100644 hw/block/fdc-isa.c



LGTM but you'll want someone else to review the Kconfig changes.

Reviewed-by: John Snow 




Re: [PATCH 2/4] hw/block/fdc: Declare shared prototypes in fdc-internal.h

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

We want to extract ISA/SysBus code from the generic fdc.c file.
First, declare the prototypes we will access from the new units
into a new local header: "fdc-internal.h".

Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/block/fdc-internal.h | 155 
  hw/block/fdc.c  | 126 +++-
  MAINTAINERS |   1 +
  3 files changed, 164 insertions(+), 118 deletions(-)
  create mode 100644 hw/block/fdc-internal.h



With our policy of not including osdep.h in headers, it's hard to verify 
that this header is otherwise self-sufficient.



I think the only thing it needs (not in osdep.h) happens to be MAX_FD. I 
added osdep.h just to test:


jsnow@scv ~/s/q/h/block (review)> gcc -I../../include/ -I../../bin/git 
-I/usr/lib64/glib-2.0/include -I/usr/include/glib-2.0 -c -o 
test_header.bin fdc-internal.h

fdc-internal.h:134:19: error: ‘MAX_FD’ undeclared here (not in a function)
  134 | FDrive drives[MAX_FD];
  |   ^~


Should we include the fdc header from the internal one?


diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
new file mode 100644
index 000..ddd41461ff3
--- /dev/null
+++ b/hw/block/fdc-internal.h
@@ -0,0 +1,155 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#ifndef HW_BLOCK_FDC_INTERNAL_H
+#define HW_BLOCK_FDC_INTERNAL_H
+
+#include "exec/memory.h"
+#include "exec/ioport.h"
+#include "hw/block/block.h"
+#include "qapi/qapi-types-block.h"
+
+typedef struct FDCtrl FDCtrl;
+
+/* Floppy bus emulation */
+
+typedef struct FloppyBus {
+BusState bus;
+FDCtrl *fdc;
+} FloppyBus;
+
+/* Floppy disk drive emulation */
+
+typedef enum FDriveRate {
+FDRIVE_RATE_500K = 0x00,  /* 500 Kbps */
+FDRIVE_RATE_300K = 0x01,  /* 300 Kbps */
+FDRIVE_RATE_250K = 0x02,  /* 250 Kbps */
+FDRIVE_RATE_1M   = 0x03,  /*   1 Mbps */
+} FDriveRate;
+
+typedef enum FDriveSize {
+FDRIVE_SIZE_UNKNOWN,
+FDRIVE_SIZE_350,
+FDRIVE_SIZE_525,
+} FDriveSize;
+
+typedef struct FDFormat {
+FloppyDriveType drive;
+uint8_t last_sect;
+uint8_t max_track;
+uint8_t max_head;
+FDriveRate rate;
+} FDFormat;
+
+typedef enum FDiskFlags {
+FDISK_DBL_SIDES  = 0x01,
+} FDiskFlags;
+
+typedef struct FDrive {
+FDCtrl *fdctrl;
+BlockBackend *blk;
+BlockConf *conf;
+/* Drive status */
+FloppyDriveType drive;/* CMOS drive type*/
+uint8_t perpendicular;/* 2.88 MB access mode*/
+/* Position */
+uint8_t head;
+uint8_t track;
+uint8_t sect;
+/* Media */
+FloppyDriveType disk; /* Current disk type  */
+FDiskFlags flags;
+uint8_t last_sect;/* Nb sector per track*/
+uint8_t max_track;/* Nb of tracks   */
+uint16_t bps; /* Bytes per sector   */
+uint8_t ro;   /* Is read-only   */
+uint8_t media_changed;/* Is media changed   */
+uint8_t media_rate;   /* Data rate of medium*/
+
+bool media_validated; /* Have we validated the media? */
+} FDrive;
+
+struct FDCtrl {
+MemoryRegion iomem;
+qemu_irq irq;
+/* Controller state */
+QEMUTimer *result_timer;
+int dma_chann;
+uint8_t phase;
+IsaDma *dma;
+/* Controller's identification */
+uint8_t version;
+/* HW */
+uint8_t sra;
+uint8_t srb;
+uint8_t dor;
+uint8_t dor_vmstate; /* only used as temp during vmstate */
+uint8_t tdr;
+uint8_t dsr;
+uint8_t msr;
+uint8_t cur_drv;
+uint8_t status0;
+uint8_t status1;
+uint8_t status2;
+/* Command FIFO */
+uint8_t *fifo;
+int32_t fifo_size;
+uint32_t data_pos;
+uint32_t data_len;
+uint8_t 

Re: [PATCH 1/4] hw/block/fdc: Replace disabled fprintf() by trace event

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/block/fdc.c| 7 +--
  hw/block/trace-events | 1 +
  2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acbae..1d3a0473678 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -1242,12 +1242,7 @@ static void fdctrl_external_reset_isa(DeviceState *d)
  
  static void fdctrl_handle_tc(void *opaque, int irq, int level)

  {
-//FDCtrl *s = opaque;
-
-if (level) {
-// XXX
-FLOPPY_DPRINTF("TC pulsed\n");
-}
+trace_fdctrl_tc_pulse(level);
  }
  


Do we need this function to fulfill some specific callback signature? 
... Ah, yeah, I guess for qdev_init_gpio_in. OK.



  /* Change IRQ state */
diff --git a/hw/block/trace-events b/hw/block/trace-events
index fa12e3a67a7..306989c193c 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -3,6 +3,7 @@
  # fdc.c
  fdc_ioport_read(uint8_t reg, uint8_t value) "read reg 0x%02x val 0x%02x"
  fdc_ioport_write(uint8_t reg, uint8_t value) "write reg 0x%02x val 0x%02x"
+fdctrl_tc_pulse(int level) "TC pulse: %u"
  
  # pflash_cfi01.c

  # pflash_cfi02.c



Reviewed-by: John Snow 




Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands

2021-04-27 Thread Cédric Le Goater
Hello,

On 4/27/21 10:54 AM, Francisco Iglesias wrote:
> On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote:
>> On Fri, Apr 23, 2021 at 4:46 PM Bin Meng  wrote:
>>>
>>> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng  wrote:

 On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
  wrote:
>
> Hi Bin,
>
> On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
>> Hi Francisco,
>>
>> On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
>>  wrote:
>>>
>>> Dear Bin,
>>>
>>> On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
 Hi Francisco,

 On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
  wrote:
>
> Hi Bin,
>
> On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
>> Hi Francisco,
>>
>> On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
>>  wrote:
>>>
>>> Hi Bin,
>>>
>>> On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
 Hi Francisco,

 On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
  wrote:
>
> Hi Bin,
>
> On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
>> Hi Francisco,
>>
>> On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
>>  wrote:
>>>
>>> Hi Bin,
>>>
>>> On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
 From: Bin Meng 

 The m25p80 model uses s->needed_bytes to indicate how many 
 follow-up
 bytes are expected to be received after it receives a command. 
 For
 example, depending on the address mode, either 3-byte address 
 or
 4-byte address is needed.

 For fast read family commands, some dummy cycles are required 
 after
 sending the address bytes, and the dummy cycles need to be 
 counted
 in s->needed_bytes. This is where the mess began.

 As the variable name (needed_bytes) indicates, the unit is in 
 byte.
 It is not in bit, or cycle. However for some reason the model 
 has
 been using the number of dummy cycles for s->needed_bytes. The 
 right
 approach is to convert the number of dummy cycles to bytes 
 based on
 the SPI protocol, for example, 6 dummy cycles for the Fast 
 Read Quad
 I/O (EBh) should be converted to 3 bytes per the formula (6 * 
 4 / 8).
>>>
>>> While not being the original implementor I must assume that 
>>> above solution was
>>> considered but not chosen by the developers due to it is 
>>> inaccuracy (it
>>> wouldn't be possible to model exacly 6 dummy cycles, only a 
>>> multiple of 8,
>>> meaning that if the controller is wrongly programmed to 
>>> generate 7 the error
>>> wouldn't be caught and the controller will still be considered 
>>> "correct"). Now
>>> that we have this detail in the implementation I'm in favor of 
>>> keeping it, this
>>> also because the detail is already in use for catching exactly 
>>> above error.
>>>
>>
>> I found no clue from the commit message that my proposed 
>> solution here
>> was ever considered, otherwise all SPI controller models 
>> supporting
>> software generation should have been found out seriously broken 
>> long
>> time ago!
>
>
> The controllers you are referring to might lack support for 
> commands requiring
> dummy clock cycles but I really hope they work with the other 
> commands? If so I

 I am not sure why you view dummy clock cycles as something special
 that needs some special support from the SPI controller. For the 
 case
 1 controller, it's nothing special from the controller perspective,
 just like sending out a command, or address bytes, or data. The
 controller just shifts data bit by bit from its tx fifo and that's 
 it.
 In the Xilinx GQSPI controller case, the dummy cycles can either be
 sent via a regular data (the case 1 controller) in the tx fifo, or
 automatically generated (case 2 controller) by the hardware.
>>>
>>> Ok, I'll try to explain my view point a little differently. For 
>>> that we also
>>> 

Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread Stefan Hajnoczi
On Tue, Apr 27, 2021 at 03:27:40PM +0200, Philippe Mathieu-Daudé wrote:
> 2/ There is no Kconfig dependency IDE_PIIX -> ISA_BUS, apparently
>we need it.

Adding an ISA_BUS Kconfig dependency won't solve the problem since the
qemu-system-x86_64 binary is built with many machine types. Most of then
include ISA_BUS so the IDE_PIIX device will be linked in and available
when -M x-remote is used. The crash will still occur.

> Anyhow I think it would be easier to manage the ISA code if the bus
> were created explicitly, not under our feet. I have a WIP series
> doing that, but it isn't easy (as with all very old QEMU code/devices).

I suggest fixing this at the qdev level. Make piix3-ide have a
sub-device that inherits from ISA_DEVICE so it can only be instantiated
when there's an ISA bus.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread Philippe Mathieu-Daudé
Hi Thomas,

On 4/16/21 2:52 PM, Thomas Huth wrote:
> QEMU currently crashes when the user tries to do something like:
> 
>  qemu-system-x86_64 -M x-remote -device piix3-ide
> 
> This happens because the "isabus" variable is not initialized with
> the x-remote machine yet.

The qdev contract is everything must be ready at realize() time,
so your safe check in pci_piix_ide_realize() seems alright.

But something doesn't sound right... If no ISA bus is present,
we shouldn't have executed so many code, rather have bailed out
earlier.

How does it work with the x-remote machine? The bus will become
available later upon remote connection?

piix3-ide is a PCI function device. Personally I consider it part
of the PIIX3 chipset, but we prefer to instantiate it separately.
So per Kconfig, piix3-ide is user-creatable by any machine providing
a PCI bus.

See hw/ide/Kconfig:

config IDE_CORE
bool

config IDE_PCI
bool
depends on PCI
select IDE_CORE

config IDE_ISA
bool
depends on ISA_BUS
select IDE_QDEV

config IDE_PIIX
bool
select IDE_PCI
select IDE_QDEV

and x-remote machine:

config MULTIPROCESS
bool
depends on PCI && PCI_EXPRESS && KVM
select REMOTE_PCIHOST

1/ This KVM check is dubious, and isn't enforced at runtime
   in the machine.

2/ There is no Kconfig dependency IDE_PIIX -> ISA_BUS, apparently
   we need it.


Anyhow I think it would be easier to manage the ISA code if the bus
were created explicitly, not under our feet. I have a WIP series
doing that, but it isn't easy (as with all very old QEMU code/devices).

> Add a proper check for this condition
> and propagate the error to the caller, so we can fail there gracefully.
> 
> Signed-off-by: Thomas Huth 
> ---
>  hw/ide/ioport.c   | 16 ++--
>  hw/ide/piix.c | 22 +-
>  hw/isa/isa-bus.c  | 14 ++
>  include/hw/ide/internal.h |  2 +-
>  include/hw/isa/isa.h  | 13 -
>  5 files changed, 46 insertions(+), 21 deletions(-)

>  static void pci_piix_ide_realize(PCIDevice *dev, Error **errp)
>  {
>  PCIIDEState *d = PCI_IDE(dev);
>  uint8_t *pci_conf = dev->config;
> +int rc;
>  
>  pci_conf[PCI_CLASS_PROG] = 0x80; // legacy ATA mode
>  
> @@ -158,7 +166,11 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error 
> **errp)
>  
>  vmstate_register(VMSTATE_IF(dev), 0, _ide_pci, d);
>  
> -pci_piix_init_ports(d);
> +rc = pci_piix_init_ports(d);
> +if (rc) {
> +error_setg_errno(errp, -rc, "Failed to realize %s",
> + object_get_typename(OBJECT(dev)));
> +}
>  }




Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-27 Thread Philippe Mathieu-Daudé
ping?

On 4/15/21 12:23 PM, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> The floppy disc controllers pulls in irrelevant devices (sysbus in
> an ISA-only machine, ISA bus + isa devices on a sysbus-only machine).
> 
> This series clean that by extracting each device in its own file,
> adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.
> 
> Regards,
> 
> Phil.
> 
> Philippe Mathieu-Daudé (4):
>   hw/block/fdc: Replace disabled fprintf() by trace event
>   hw/block/fdc: Declare shared prototypes in fdc-internal.h
>   hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
>   hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c
> 
>  hw/block/fdc-internal.h | 155 ++
>  hw/block/fdc-isa.c  | 313 +
>  hw/block/fdc-sysbus.c   | 252 +
>  hw/block/fdc.c  | 608 +---
>  MAINTAINERS |   3 +
>  hw/block/Kconfig|   8 +
>  hw/block/meson.build|   2 +
>  hw/block/trace-events   |   3 +
>  hw/i386/Kconfig |   2 +-
>  hw/isa/Kconfig  |   6 +-
>  hw/mips/Kconfig |   2 +-
>  hw/sparc/Kconfig|   2 +-
>  hw/sparc64/Kconfig  |   2 +-
>  13 files changed, 751 insertions(+), 607 deletions(-)
>  create mode 100644 hw/block/fdc-internal.h
>  create mode 100644 hw/block/fdc-isa.c
>  create mode 100644 hw/block/fdc-sysbus.c
> 



Re: [PATCH] qapi: deprecate drive-backup

2021-04-27 Thread Kashyap Chamarthy
On Mon, Apr 26, 2021 at 01:34:17PM -0400, John Snow wrote:
> On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > Modern way is using blockdev-add + blockdev-backup, which provides a
> > lot more control on how target is opened.

[...]

> 1) Let's add a sphinx reference to
> https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> 
> 
> 2) Just a thought, not a request: We also may wish to update
> https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the
> new, preferred method. However, this doc is a bit old and is in need
> of an overhaul anyway (Especially to add the NBD pull workflow.) Since
> the doc is in need of an overhaul anyway, can we ask Kashyap to help
> us here, if he has time?

Yes, I should be able to make time and help here; been putting it off on
the back burner.  Thanks for the reminder.  :) I'd like to update both
these:

https://qemu-project.gitlab.io/qemu/interop/bitmaps.html
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html

Both of them, as you know, refer to 'drive-backup'.  They also need
other adjustments / additions.  Also perhaps break the larger doc into a
couple of smaller ones.

I'll start working on it the end of this week.  First I need to tinker
with some of the recent improvements to refresh my memory and get a
sense of the modifications involved.  So please bear with me.

> 3) Let's add a small explanation here that outlines the differences in
> using these two commands. Here's a suggestion:
> 
> This change primarily separates the creation/opening process of the
> backup target with explicit, separate steps. BlockdevBackup uses
> mostly the same arguments as DriveBackup, except the "format" and
> "mode" options are removed in favor of using explicit
> "blockdev-create" and "blockdev-add" calls.
> 
> The "target" argument changes semantics. It no longer accepts
> filenames, and will now additionally accept arbitrary node names in
> addition to device names.

Yeah; this is something I had figure out by some trial-and-error when I
was playing with it in the past.

> 4) Also not a request: If we want to go above and beyond, it might be nice
> to spell out the exact steps required to transition from the old interface
> to the new one. Here's a (hasty) suggestion for how that might look:
> 
> - The MODE argument is deprecated.
>   - "existing" is replaced by using "blockdev-add" commands.
>   - "absolute-paths" is replaced by using "blockdev-add" and
> "blockdev-create" commands.
> 
> - The FORMAT argument is deprecated.
>   - Format information is given to "blockdev-add"/"blockdev-create".
> 
> - The TARGET argument has new semantics:
>   - Filenames are no longer supported, use blockdev-add/blockdev-create
> as necessary instead.
>   - Device targets remain supported.

Agreed; docs like these can be useful for management tools that rely on
the old behaviour and are not as up-to-date as libvirt in keeping track
of QEMU developments.

[...]

-- 
/kashyap




Re: [PATCH 1/2] qapi: block-dirty-bitmap-merge: support allocation maps

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

27.04.2021 14:11, Vladimir Sementsov-Ogievskiy wrote:

Add possibility to merge allocation map of specified node into target
bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
  qapi/block-core.json| 31 +--
  include/block/block_int.h   |  4 +++
  block/dirty-bitmap.c| 42 +
  block/monitor/bitmap-qmp-cmds.c | 55 -
  4 files changed, 122 insertions(+), 10 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6d227924d0..0fafb043bc 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2006,6 +2006,32 @@
'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
  '*persistent': 'bool', '*disabled': 'bool' } }
  
+##

+# @AllocationMapMode:
+#
+# An enumeration of possible allocation maps that could be merged into target
+# bitmap.
+#
+# @top: The allocation status of the top layer of the attached storage node.
+#
+# Since: 6.1
+##
+{ 'enum': 'AllocationMapMode',
+  'data': ['top'] }
+
+##
+# @BlockDirtyBitmapMergeExternalSource:
+#
+# @node: name of device/node which the bitmap is tracking
+#
+# @name: name of the dirty bitmap
+#
+# Since: 6.1
+##
+{ 'struct': 'BlockDirtyBitmapMergeExternalSource',
+  'data': { 'node': 'str', '*name': 'str',
+'*allocation-map': 'AllocationMapMode' } }
+
  ##
  # @BlockDirtyBitmapMergeSource:
  #
@@ -2017,7 +2043,7 @@
  ##
  { 'alternate': 'BlockDirtyBitmapMergeSource',
'data': { 'local': 'str',
-'external': 'BlockDirtyBitmap' } }
+'external': 'BlockDirtyBitmapMergeExternalSource' } }
  
  ##

  # @BlockDirtyBitmapMerge:
@@ -2176,7 +2202,8 @@
  #
  ##
  { 'command': 'block-dirty-bitmap-merge',
-  'data': 'BlockDirtyBitmapMerge' }
+  'data': 'BlockDirtyBitmapMerge',
+  'coroutine': true }
  
  ##



So, what I propose makes possible to issue the following command:

block-dirty-bitmap-merge
  bitmaps: [{"allocation-map": "top", "node": "drive0"}]
  node: target-node-name
  target: target-bitmap-name


I've discussed it with Nikolay, and he requested a possibility of querying 
allocation status of base..top sub-chain (assume several snapshots was done 
without creating bitmaps, and we want to restore the bitmap for backup).

So, we actually want something like

block-dirty-bitmap-merge
  bitmaps: [{top: "tpp-node-name", bottom: "bottom-node-name"}]
  node: target-node-name
  target: target-bitmap-name


--
Best regards,
Vladimir



Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

27.04.2021 14:11, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
   <20200902181831.2570048-1-ebl...@redhat.com>
   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
   https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.

2. Transaction support is not here now. Will add in future version, if
general approach is OK.


Ha actually, I think it should work, as block-dirty-bitmap-merge is already 
transactional, and we don't break it by the commit.



3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how much is it
correct.

4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to_active_  bitmap. So all concurrent
operations will just increase dirtyness of the bitmap and it is OK.

5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.

6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Probably patch 01 is better be split into 2-3 patches.


8. Name "bitmaps" for list of sources in block-dirty-bitmap-merge becomes misleading. Probably we should add 
similar "sources" list as alternative to "bitmaps", and deprecate "bitmaps" field.


--
Best regards,
Vladimir



Re: [PATCH v3 22/36] block: add bdrv_remove_filter_or_cow transaction action

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

27.04.2021 14:09, Kevin Wolf wrote:

Am 26.04.2021 um 19:18 hat Vladimir Sementsov-Ogievskiy geschrieben:

26.04.2021 19:26, Kevin Wolf wrote:

Am 17.03.2021 um 15:35 hat Vladimir Sementsov-Ogievskiy geschrieben:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   block.c | 78 +++--
   1 file changed, 76 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 11f7ad0818..2fca1f2ad5 100644
--- a/block.c
+++ b/block.c
@@ -2929,12 +2929,19 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
   }
   }
+static void bdrv_child_free(void *opaque)
+{
+BdrvChild *c = opaque;
+
+g_free(c->name);
+g_free(c);
+}
+
   static void bdrv_remove_empty_child(BdrvChild *child)
   {
   assert(!child->bs);
   QLIST_SAFE_REMOVE(child, next);
-g_free(child->name);
-g_free(child);
+bdrv_child_free(child);
   }
   typedef struct BdrvAttachChildCommonState {
@@ -4956,6 +4963,73 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
   return ret;
   }
+typedef struct BdrvRemoveFilterOrCowChild {
+BdrvChild *child;
+bool is_backing;
+} BdrvRemoveFilterOrCowChild;
+
+/* this doesn't restore original child bs, only the child itself */


Hm, this comment tells me that it's intentional, but why is it correct?


that's because bdrv_remove_filter_or_cow_child_abort() aborts only
part of  bdrv_remove_filter_or_cow_child().


I see that it aborts only part of it, but why?

Normally, a function getting a Transaction object suggests to me that on
failure, all changes the function made will be reverted, not just parts
of the changes.


Look: bdrv_remove_filter_or_cow_child() firstly do
bdrv_replace_child_safe(child, NULL, tran);, so bs would be restored
by .abort() of bdrv_replace_child_safe() action.


Ah! So the reason is not that we don't want to restore child->bs, but
that bdrv_replace_child_safe() is already transactionable and will
automatically do this.


So, improved comment may look like:

This doesn't restore original child bs, only the child itself. The bs
would be restored by .abort() bdrv_replace_child_safe() subation of
bdrv_remove_filter_or_cow_child() action.


"subation" is a typo for "subaction"?

Maybe something like this:

 We don't have to restore child->bs here to undo bdrv_replace_child()
 because that function is already transactionable and will do so in
 its own .abort() callback.


Sounds good, thanks




Probably it would be more correct to rename

BdrvRemoveFilterOrCowChild -> BdrvRemoveFilterOrCowChildNoBs
bdrv_remove_filter_or_cow_child_abort -> 
bdrv_remove_filter_or_cow_child_no_bs_abort
bdrv_remove_filter_or_cow_child_commit -> 
bdrv_remove_filter_or_cow_child_no_bs_commit

and assert on .abort() and .commit() that s->child->bs is NULL.


I wouldn't bother with that. It was just that the comment confused me
because it seemed to suggest that we actually don't want to restore
child->bs, when its real intention was to say that it's already restored
somewhere else.



OK


--
Best regards,
Vladimir



[PATCH 1/2] qapi: block-dirty-bitmap-merge: support allocation maps

2021-04-27 Thread Vladimir Sementsov-Ogievskiy
Add possibility to merge allocation map of specified node into target
bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json| 31 +--
 include/block/block_int.h   |  4 +++
 block/dirty-bitmap.c| 42 +
 block/monitor/bitmap-qmp-cmds.c | 55 -
 4 files changed, 122 insertions(+), 10 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6d227924d0..0fafb043bc 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2006,6 +2006,32 @@
   'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
 '*persistent': 'bool', '*disabled': 'bool' } }
 
+##
+# @AllocationMapMode:
+#
+# An enumeration of possible allocation maps that could be merged into target
+# bitmap.
+#
+# @top: The allocation status of the top layer of the attached storage node.
+#
+# Since: 6.1
+##
+{ 'enum': 'AllocationMapMode',
+  'data': ['top'] }
+
+##
+# @BlockDirtyBitmapMergeExternalSource:
+#
+# @node: name of device/node which the bitmap is tracking
+#
+# @name: name of the dirty bitmap
+#
+# Since: 6.1
+##
+{ 'struct': 'BlockDirtyBitmapMergeExternalSource',
+  'data': { 'node': 'str', '*name': 'str',
+'*allocation-map': 'AllocationMapMode' } }
+
 ##
 # @BlockDirtyBitmapMergeSource:
 #
@@ -2017,7 +2043,7 @@
 ##
 { 'alternate': 'BlockDirtyBitmapMergeSource',
   'data': { 'local': 'str',
-'external': 'BlockDirtyBitmap' } }
+'external': 'BlockDirtyBitmapMergeExternalSource' } }
 
 ##
 # @BlockDirtyBitmapMerge:
@@ -2176,7 +2202,8 @@
 #
 ##
 { 'command': 'block-dirty-bitmap-merge',
-  'data': 'BlockDirtyBitmapMerge' }
+  'data': 'BlockDirtyBitmapMerge',
+  'coroutine': true }
 
 ##
 # @BlockDirtyBitmapSha256:
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 88e4111939..b5aeaef425 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1361,6 +1361,10 @@ bool bdrv_dirty_bitmap_merge_internal(BdrvDirtyBitmap 
*dest,
   const BdrvDirtyBitmap *src,
   HBitmap **backup, bool lock);
 
+int bdrv_merge_allocation_top_to_dirty_bitmap(BdrvDirtyBitmap *dest,
+  BlockDriverState *bs,
+  Error **errp);
+
 void bdrv_inc_in_flight(BlockDriverState *bs);
 void bdrv_dec_in_flight(BlockDriverState *bs);
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 68d295d6e3..78097e30c5 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -914,6 +914,48 @@ out:
 }
 }
 
+int bdrv_merge_allocation_top_to_dirty_bitmap(BdrvDirtyBitmap *dest,
+  BlockDriverState *bs,
+  Error **errp)
+{
+int ret;
+int64_t offset = 0;
+int64_t bytes = bdrv_getlength(bs);
+
+if (bytes < 0) {
+error_setg(errp, "Failed to get length of node '%s'",
+   bdrv_get_node_name(bs));
+return bytes;
+}
+
+if (bdrv_dirty_bitmap_size(dest) < bytes) {
+error_setg(errp, "Bitmap is smaller than node '%s'",
+   bdrv_get_node_name(bs));
+return -EINVAL;
+}
+
+while (bytes) {
+int64_t cur_bytes = bytes;
+
+ret = bdrv_is_allocated(bs, offset, cur_bytes, _bytes);
+if (ret < 0) {
+error_setg(errp,
+   "Failed to get block allocation status of node '%s'",
+   bdrv_get_node_name(bs));
+return ret;
+}
+
+if (ret) {
+bdrv_set_dirty_bitmap(dest, offset, cur_bytes);
+}
+
+bytes -= cur_bytes;
+offset += cur_bytes;
+}
+
+return 0;
+}
+
 /**
  * bdrv_dirty_bitmap_merge_internal: merge src into dest.
  * Does NOT check bitmap permissions; not suitable for use as public API.
diff --git a/block/monitor/bitmap-qmp-cmds.c b/block/monitor/bitmap-qmp-cmds.c
index 9f11deec64..19845a22c4 100644
--- a/block/monitor/bitmap-qmp-cmds.c
+++ b/block/monitor/bitmap-qmp-cmds.c
@@ -34,6 +34,7 @@
 
 #include "block/block_int.h"
 #include "qapi/qapi-commands-block.h"
+#include "qapi/qapi-types-block-core.h"
 #include "qapi/error.h"
 
 /**
@@ -273,8 +274,11 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char 
*node, const char *target,
 }
 
 for (lst = bms; lst; lst = lst->next) {
+src = NULL;
+
 switch (lst->value->type) {
 const char *name, *node;
+bool has_alloc, has_name;
 case QTYPE_QSTRING:
 name = lst->value->u.local;
 src = bdrv_find_dirty_bitmap(bs, name);
@@ -286,22 +290,57 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char 
*node, const char *target,
 break;
 case QTYPE_QDICT:
 node = lst->value->u.external.node;
-name = 

[PATCH 2/2] iotests: add allocation-map-to-bitmap

2021-04-27 Thread Vladimir Sementsov-Ogievskiy
Add test to check new possibility of block-dirty-bitmap-merge command
to merge allocation map of some node to target dirty bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 .../tests/allocation-map-to-bitmap| 64 +++
 .../tests/allocation-map-to-bitmap.out|  9 +++
 2 files changed, 73 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
 create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out

diff --git a/tests/qemu-iotests/tests/allocation-map-to-bitmap 
b/tests/qemu-iotests/tests/allocation-map-to-bitmap
new file mode 100755
index 00..bd67eed884
--- /dev/null
+++ b/tests/qemu-iotests/tests/allocation-map-to-bitmap
@@ -0,0 +1,64 @@
+#!/usr/bin/env python3
+#
+# Test parallels load bitmap
+#
+# Copyright (c) 2021 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import json
+import iotests
+from iotests import qemu_nbd_popen, qemu_img_pipe, log, file_path, 
qemu_img_create, qemu_io
+
+iotests.script_initialize(supported_fmts=['qcow2'])
+
+nbd_sock = file_path('nbd-sock', base_dir=iotests.sock_dir)
+disk = iotests.file_path('disk')
+bitmap = 'bitmap0'
+nbd_opts = f'driver=nbd,server.type=unix,server.path={nbd_sock}' \
+f',x-dirty-bitmap=qemu:dirty-bitmap:{bitmap}'
+
+qemu_img_create('-f', 'qcow2', disk, '1M')
+
+qemu_io('-c', 'write 0 512', disk)  # first 64K qcow2 cluster becomes allocated
+qemu_io('-c', 'write 150K 100K', disk)  # 3rd and 4th clusters become allocated
+
+vm = iotests.VM().add_drive(disk)
+vm.launch()
+vm.qmp_log('block-dirty-bitmap-add', node='drive0', name=bitmap,
+   persistent=True)
+vm.qmp_log('block-dirty-bitmap-merge', node='drive0', target=bitmap,
+   bitmaps=[{'node': 'drive0', 'allocation-map': 'top'}])
+vm.shutdown()
+
+with qemu_nbd_popen('--read-only', f'--socket={nbd_sock}',
+f'--bitmap={bitmap}', '-f', iotests.imgfmt, disk):
+out = qemu_img_pipe('map', '--output=json', '--image-opts', nbd_opts)
+chunks = json.loads(out)
+cluster = 64 * 1024
+
+log('dirty clusters (cluster size is 64K):')
+for c in chunks:
+assert c['start'] % cluster == 0
+assert c['length'] % cluster == 0
+if c['data']:
+continue
+
+a = c['start'] // cluster
+b = (c['start'] + c['length']) // cluster
+if b - a > 1:
+log(f'{a}-{b-1}')
+else:
+log(a)
diff --git a/tests/qemu-iotests/tests/allocation-map-to-bitmap.out 
b/tests/qemu-iotests/tests/allocation-map-to-bitmap.out
new file mode 100644
index 00..6cfc42aa4e
--- /dev/null
+++ b/tests/qemu-iotests/tests/allocation-map-to-bitmap.out
@@ -0,0 +1,9 @@
+{"execute": "block-dirty-bitmap-add", "arguments": {"name": "bitmap0", "node": 
"drive0", "persistent": true}}
+{"return": {}}
+{"execute": "block-dirty-bitmap-merge", "arguments": {"bitmaps": 
[{"allocation-map": "top", "node": "drive0"}], "node": "drive0", "target": 
"bitmap0"}}
+{"return": {}}
+Start NBD server
+dirty clusters (cluster size is 64K):
+0
+2-3
+Kill NBD server
-- 
2.29.2




[PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Vladimir Sementsov-Ogievskiy
Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
  <20200902181831.2570048-1-ebl...@redhat.com>
  https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
  https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.

2. Transaction support is not here now. Will add in future version, if
general approach is OK.

3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how much is it
correct.

4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to _active_ bitmap. So all concurrent
operations will just increase dirtyness of the bitmap and it is OK.

5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.

6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Probably patch 01 is better be split into 2-3 patches.

Vladimir Sementsov-Ogievskiy (2):
  qapi: block-dirty-bitmap-merge: support allocation maps
  iotests: add allocation-map-to-bitmap

 qapi/block-core.json  | 31 -
 include/block/block_int.h |  4 ++
 block/dirty-bitmap.c  | 42 
 block/monitor/bitmap-qmp-cmds.c   | 55 +---
 .../tests/allocation-map-to-bitmap| 64 +++
 .../tests/allocation-map-to-bitmap.out|  9 +++
 6 files changed, 195 insertions(+), 10 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
 create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out

-- 
2.29.2




Re: [PATCH v3 22/36] block: add bdrv_remove_filter_or_cow transaction action

2021-04-27 Thread Kevin Wolf
Am 26.04.2021 um 19:18 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 26.04.2021 19:26, Kevin Wolf wrote:
> > Am 17.03.2021 um 15:35 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > ---
> > >   block.c | 78 +++--
> > >   1 file changed, 76 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index 11f7ad0818..2fca1f2ad5 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -2929,12 +2929,19 @@ static void bdrv_replace_child(BdrvChild *child, 
> > > BlockDriverState *new_bs)
> > >   }
> > >   }
> > > +static void bdrv_child_free(void *opaque)
> > > +{
> > > +BdrvChild *c = opaque;
> > > +
> > > +g_free(c->name);
> > > +g_free(c);
> > > +}
> > > +
> > >   static void bdrv_remove_empty_child(BdrvChild *child)
> > >   {
> > >   assert(!child->bs);
> > >   QLIST_SAFE_REMOVE(child, next);
> > > -g_free(child->name);
> > > -g_free(child);
> > > +bdrv_child_free(child);
> > >   }
> > >   typedef struct BdrvAttachChildCommonState {
> > > @@ -4956,6 +4963,73 @@ static bool should_update_child(BdrvChild *c, 
> > > BlockDriverState *to)
> > >   return ret;
> > >   }
> > > +typedef struct BdrvRemoveFilterOrCowChild {
> > > +BdrvChild *child;
> > > +bool is_backing;
> > > +} BdrvRemoveFilterOrCowChild;
> > > +
> > > +/* this doesn't restore original child bs, only the child itself */
> > 
> > Hm, this comment tells me that it's intentional, but why is it correct?
> 
> that's because bdrv_remove_filter_or_cow_child_abort() aborts only
> part of  bdrv_remove_filter_or_cow_child().

I see that it aborts only part of it, but why?

Normally, a function getting a Transaction object suggests to me that on
failure, all changes the function made will be reverted, not just parts
of the changes.

> Look: bdrv_remove_filter_or_cow_child() firstly do
> bdrv_replace_child_safe(child, NULL, tran);, so bs would be restored
> by .abort() of bdrv_replace_child_safe() action.

Ah! So the reason is not that we don't want to restore child->bs, but
that bdrv_replace_child_safe() is already transactionable and will
automatically do this.

> So, improved comment may look like:
> 
> This doesn't restore original child bs, only the child itself. The bs
> would be restored by .abort() bdrv_replace_child_safe() subation of
> bdrv_remove_filter_or_cow_child() action.

"subation" is a typo for "subaction"?

Maybe something like this:

We don't have to restore child->bs here to undo bdrv_replace_child()
because that function is already transactionable and will do so in
its own .abort() callback.

> Probably it would be more correct to rename
> 
> BdrvRemoveFilterOrCowChild -> BdrvRemoveFilterOrCowChildNoBs
> bdrv_remove_filter_or_cow_child_abort -> 
> bdrv_remove_filter_or_cow_child_no_bs_abort
> bdrv_remove_filter_or_cow_child_commit -> 
> bdrv_remove_filter_or_cow_child_no_bs_commit
> 
> and assert on .abort() and .commit() that s->child->bs is NULL.

I wouldn't bother with that. It was just that the comment confused me
because it seemed to suggest that we actually don't want to restore
child->bs, when its real intention was to say that it's already restored
somewhere else.

Kevin




Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler

2021-04-27 Thread Greg Kurz
On Tue, 27 Apr 2021 11:20:07 +0200
Philippe Mathieu-Daudé  wrote:

> On 4/27/21 3:45 AM, David Gibson wrote:
> > On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
> >> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
> >> automatically.  Register a reset handler to get reset with the
> >> machine.
> >>
> >> It doesn't seem to be an issue because it is that way since the
> >> device QDev'ifycation 8 years ago, in commit a83000f5e3f
> >> ("spapr-tce: make sPAPRTCETable a proper device").
> >> Still, correct to have a proper API usage.
> > 
> > So, the reason this works now is that we explicitly call
> > device_reset() on the TCE table from the TCE tables "owner", either a
> > PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).
> > 
> > I think we want either that, or the register_reset(), not both.
> 
> rtas_quiesce() seems to call a DeviceClass::reset() on the
> children of TYPE_SPAPR_VIO_BUS:
> 
> Abstract TYPE_VIO_SPAPR_DEVICE has the TYPE_SPAPR_VIO_BUS bus_type,
> and registers the spapr_vio_busdev_reset() handler, which calls
> spapr_vio_quiesce_one()...
> 
> So either we already have 2 resets, or the bus is never reset?
> 

rtas_quiesce() is called when the guests definitively transition
from the SLOF FW to the OS. It isn't a true reset path actually,
even if it needs to reset a few devices.

On the other hand, your patch would _really_ cause the TCE table
device to be reset twice at machine reset AFAICT.

> The bus is created in spapr_machine_init():
> 
> /* Set up VIO bus */
> spapr->vio_bus = spapr_vio_bus_init();
> 
> TYPE_SPAPR_MACHINE class registers spapr_machine_reset(), which
> manually calls qemu_devices_reset() and spapr_drc_reset_all(),
> but I can't understand if a callee resets vio_bus...

The vio_bus *is* reset:

#0  0x000100629a98 in spapr_vio_busdev_reset (qdev=0x10165c400) at 
/home/greg/Work/qemu/qemu-virtiofs/include/hw/ppc/spapr_vio.h:31
#1  0x0001009fd32c in device_transitional_reset (obj=0x10165c400) at 
/home/greg/Work/qemu/qemu-virtiofs/include/hw/qdev-core.h:17
#2  0x000100a00e24 in resettable_phase_hold (obj=0x10165c400, 
opaque=, type=) at ../../hw/core/resettable.c:182
#3  0x0001009f9108 in bus_reset_child_foreach (obj=, 
cb=0x100a00cc0 , opaque=0x0, type=) at 
../../hw/core/bus.c:97
#4  0x000100a00db8 in resettable_child_foreach (rc=0x1014f5400, 
type=RESET_TYPE_COLD, opaque=0x0, cb=0x100a00cc0 , 
obj=0x10156e600) at ../../hw/core/resettable.c:96
#5  0x000100a00db8 in resettable_phase_hold (obj=0x10156e600, 
opaque=, type=) at ../../hw/core/resettable.c:173
#6  0x0001009fcaa8 in device_reset_child_foreach (obj=, 
cb=0x100a00cc0 , opaque=0x0, type=) at 
../../hw/core/qdev.c:366
#7  0x000100a00db8 in resettable_child_foreach (rc=0x1013eef90, 
type=RESET_TYPE_COLD, opaque=0x0, cb=0x100a00cc0 , 
obj=0x10164a0e0) at ../../hw/core/resettable.c:96
#8  0x000100a00db8 in resettable_phase_hold (obj=0x10164a0e0, 
opaque=, type=) at ../../hw/core/resettable.c:173
#9  0x0001009f9108 in bus_reset_child_foreach (obj=, 
cb=0x100a00cc0 , opaque=0x0, type=) at 
../../hw/core/bus.c:97
#10 0x000100a00db8 in resettable_child_foreach (rc=0x1012b1a00, 
type=RESET_TYPE_COLD, opaque=0x0, cb=0x100a00cc0 , 
obj=0x10154d4b0) at ../../hw/core/resettable.c:96
#11 0x000100a00db8 in resettable_phase_hold (obj=obj@entry=0x10154d4b0, 
opaque=opaque@entry=0x0, type=type@entry=RESET_TYPE_COLD) at 
../../hw/core/resettable.c:173
#12 0x000100a01794 in resettable_assert_reset (obj=0x10154d4b0, 
type=) at ../../hw/core/resettable.c:60
#13 0x000100a01c60 in resettable_reset (obj=0x10154d4b0, type=) at ../../hw/core/resettable.c:45
#14 0x000100a020ec in resettable_cold_reset_fn (opaque=) at 
../../hw/core/resettable.c:269
#15 0x000100a00718 in qemu_devices_reset () at ../../hw/core/reset.c:69
#16 0x000100624024 in spapr_machine_reset (machine=0x101545480) at 
../../hw/ppc/spapr.c:1587
#17 0x0001007b8128 in qemu_system_reset (reason=) at 
../../softmmu/runstate.c:442
#18 0x0001007b8fa8 in main_loop_should_exit () at 
../../softmmu/runstate.c:687
#19 0x0001007b8fa8 in qemu_main_loop () at ../../softmmu/runstate.c:721
#20 0x0001002f5150 in main (argc=, argv=, 
envp=) at ../../softmmu/main.c:50

And it seems rtas_quiesce() could just do bus_cold_reset(>bus)
rather than open-coding the walk of vio_bus children.



Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/27/21 3:45 AM, David Gibson wrote:
> On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
>> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
>> automatically.  Register a reset handler to get reset with the
>> machine.
>>
>> It doesn't seem to be an issue because it is that way since the
>> device QDev'ifycation 8 years ago, in commit a83000f5e3f
>> ("spapr-tce: make sPAPRTCETable a proper device").
>> Still, correct to have a proper API usage.
> 
> So, the reason this works now is that we explicitly call
> device_reset() on the TCE table from the TCE tables "owner", either a
> PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).
> 
> I think we want either that, or the register_reset(), not both.

rtas_quiesce() seems to call a DeviceClass::reset() on the
children of TYPE_SPAPR_VIO_BUS:

Abstract TYPE_VIO_SPAPR_DEVICE has the TYPE_SPAPR_VIO_BUS bus_type,
and registers the spapr_vio_busdev_reset() handler, which calls
spapr_vio_quiesce_one()...

So either we already have 2 resets, or the bus is never reset?

The bus is created in spapr_machine_init():

/* Set up VIO bus */
spapr->vio_bus = spapr_vio_bus_init();

TYPE_SPAPR_MACHINE class registers spapr_machine_reset(), which
manually calls qemu_devices_reset() and spapr_drc_reset_all(),
but I can't understand if a callee resets vio_bus...



Re: [PATCH v3 0/9] simplebench improvements

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

kindly ping.

I probably can pull it as is when 6.0 finally released and new development 
phase opened..

Still, John, if you have some time for it, could you look through the changes 
and may be give missed r-bs?

changes v2->v3:
 
 https://patchew.org/QEMU/20210304101738.20248-1-vsement...@virtuozzo.com/diff/20210323134734.72930-1-vsement...@virtuozzo.com/


23.03.2021 16:47, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

Here are some improvements to simplebench lib, to support my
"qcow2: compressed write cache" series.

v3:
01: use simpler logic
02,04-06: add John's r-b
07: use BooleanOptionalAction and
 initial_run=args.initial_run
08: rewrite so that we have a new --drop-caches option

Vladimir Sementsov-Ogievskiy (9):
   simplebench: bench_one(): add slow_limit argument
   simplebench: bench_one(): support count=1
   simplebench/bench-backup: add --compressed option
   simplebench/bench-backup: add target-cache argument
   simplebench/bench_block_job: handle error in BLOCK_JOB_COMPLETED
   simplebench/bench-backup: support qcow2 source files
   simplebench/bench-backup: add --count and --no-initial-run
   simplebench/bench-backup: add --drop-caches argument
   MAINTAINERS: update Benchmark util: add git tree

  MAINTAINERS|  1 +
  scripts/simplebench/bench-backup.py| 95 +-
  scripts/simplebench/bench_block_job.py | 42 +++-
  scripts/simplebench/simplebench.py | 28 +++-
  4 files changed, 144 insertions(+), 22 deletions(-)




--
Best regards,
Vladimir



Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands

2021-04-27 Thread Francisco Iglesias
On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote:
> On Fri, Apr 23, 2021 at 4:46 PM Bin Meng  wrote:
> >
> > On Mon, Feb 8, 2021 at 10:41 PM Bin Meng  wrote:
> > >
> > > On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
> > >  wrote:
> > > >
> > > > Hi Bin,
> > > >
> > > > On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
> > > > >  wrote:
> > > > > >
> > > > > > Dear Bin,
> > > > > >
> > > > > > On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hi Bin,
> > > > > > > >
> > > > > > > > On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Bin,
> > > > > > > > > >
> > > > > > > > > > On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > >
> > > > > > > > > > > > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
> > > > > > > > > > > > > > > From: Bin Meng 
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The m25p80 model uses s->needed_bytes to indicate 
> > > > > > > > > > > > > > > how many follow-up
> > > > > > > > > > > > > > > bytes are expected to be received after it 
> > > > > > > > > > > > > > > receives a command. For
> > > > > > > > > > > > > > > example, depending on the address mode, either 
> > > > > > > > > > > > > > > 3-byte address or
> > > > > > > > > > > > > > > 4-byte address is needed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > For fast read family commands, some dummy cycles 
> > > > > > > > > > > > > > > are required after
> > > > > > > > > > > > > > > sending the address bytes, and the dummy cycles 
> > > > > > > > > > > > > > > need to be counted
> > > > > > > > > > > > > > > in s->needed_bytes. This is where the mess began.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As the variable name (needed_bytes) indicates, 
> > > > > > > > > > > > > > > the unit is in byte.
> > > > > > > > > > > > > > > It is not in bit, or cycle. However for some 
> > > > > > > > > > > > > > > reason the model has
> > > > > > > > > > > > > > > been using the number of dummy cycles for 
> > > > > > > > > > > > > > > s->needed_bytes. The right
> > > > > > > > > > > > > > > approach is to convert the number of dummy cycles 
> > > > > > > > > > > > > > > to bytes based on
> > > > > > > > > > > > > > > the SPI protocol, for example, 6 dummy cycles for 
> > > > > > > > > > > > > > > the Fast Read Quad
> > > > > > > > > > > > > > > I/O (EBh) should be converted to 3 bytes per the 
> > > > > > > > > > > > > > > formula (6 * 4 / 8).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > While not being the original implementor I must 
> > > > > > > > > > > > > > assume that above solution was
> > > > > > > > > > > > > > considered but not chosen by the developers due to 
> > > > > > > > > > > > > > it is inaccuracy (it
> > > > > > > > > > > > > > wouldn't be possible to model exacly 6 dummy 
> > > > > > > > > > > > > > cycles, only a multiple of 8,
> > > > > > > > > > > > > > meaning that if the controller is wrongly 
> > > > > > > > > > > > > > programmed to generate 7 the error
> > > > > > > > > > > > > > wouldn't be caught and the controller will still be 
> > > > > > > > > > > > > > considered "correct"). Now
> > > > > > > > > > > > > > that we have this detail in the implementation I'm 
> > > > > > > > > > > > > > in favor of keeping it, this
> > > > > > > > > > > > > > also because the detail is already in use for 
> > > > > > > > > > > > > > catching exactly above error.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I found no clue from the commit message that my 
> > > > > > > > > > > > > proposed solution here
> > > > > > > > > > > > > was ever considered, otherwise all SPI controller 
> > > > > > > > > > > > > models supporting
> > > > > > > > > > > > > software generation should have been found out 
> > > > > > > > > > > > > seriously broken long
> > > > > > > > > > > > > time ago!
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The controllers you are referring to might lack support 
> > > > > > > > > > > > for 

[PATCH v2] hw/nvme/ctrl: fix csi field for cns 0x00 and 0x11

2021-04-27 Thread Gollu Appalanaidu
As per the TP 4056d Namespace types CNS 0x00 and CNS 0x11
CSI field shouldn't use but it is being used for these two
Identify command CNS values, fix that.

Remove 'nvme_csi_has_nvm_support()' helper as suggested by
Klaus we can safely assume NVM command set support for all
namespaces.

Suggested-by: Klaus Jensen 
Signed-off-by: Gollu Appalanaidu 
---
-v2: add sugggestions from Klaus
We can Remove 'nvme_csi_has_nvm_support()' helper, we can
assume NVM command set support for all namespaces.

 hw/nvme/ctrl.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 2e7498a73e..7fcd699235 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4178,16 +4178,6 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, 
NvmeRequest *req)
 return nvme_c2h(n, id, sizeof(id), req);
 }
 
-static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns)
-{
-switch (ns->csi) {
-case NVME_CSI_NVM:
-case NVME_CSI_ZONED:
-return true;
-}
-return false;
-}
-
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
 trace_pci_nvme_identify_ctrl();
@@ -4244,7 +4234,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest 
*req, bool active)
 }
 }
 
-if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+if (active || ns->csi == NVME_CSI_NVM) {
 return nvme_c2h(n, (uint8_t *)>id_ns, sizeof(NvmeIdNs), req);
 }
 
@@ -4315,7 +4305,7 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, 
NvmeRequest *req,
 }
 }
 
-if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+if (c->csi == NVME_CSI_NVM) {
 return nvme_rpt_empty_id_struct(n, req);
 } else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) {
 return nvme_c2h(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned),
-- 
2.17.1




Re: [PATCH] hw/block/nvme: fix csi field for cns 0x00 and 0x11

2021-04-27 Thread Gollu Appalanaidu

On Mon, Apr 26, 2021 at 01:03:04PM +0200, Klaus Jensen wrote:

On Apr 26 13:16, Gollu Appalanaidu wrote:

As per the TP 4056d Namespace types CNS 0x00 and CNS 0x11
CSI field shouldn't use but it is being used for these two
Identify command CNS values, fix that.

Signed-off-by: Gollu Appalanaidu 
---
hw/nvme/ctrl.c | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 2e7498a73e..1657b1d04a 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4244,11 +4244,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, 
NvmeRequest *req, bool active)
   }
   }

-if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
-return nvme_c2h(n, (uint8_t *)>id_ns, sizeof(NvmeIdNs), req);
+if (active && nvme_csi_has_nvm_support(ns)) {
+goto out;
+} else if (!active && ns->csi == NVME_CSI_NVM) {
+goto out;
+} else {
+return NVME_INVALID_CMD_SET | NVME_DNR;
   }

-return NVME_INVALID_CMD_SET | NVME_DNR;
+out:
+return nvme_c2h(n, (uint8_t *)>id_ns, sizeof(NvmeIdNs), req);
}

static uint16_t nvme_identify_ns_attached_list(NvmeCtrl *n, NvmeRequest *req)
--
2.17.1




Looking closer at this, since we only support the NVM and Zoned 
command sets, we can get rid of the `nvme_csi_has_nvm_support()` 
helper and just assume NVM command set support for all namespaces. The 
way different command sets are handled doesn't scale anyway, so we 
might as well simplify the logic a bit.


Something like this (compile-tested only) patch maybe?

diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c
index 2e7498a73e70..7fcd6992358d 100644
--- i/hw/nvme/ctrl.c
+++ w/hw/nvme/ctrl.c
@@ -4178,16 +4178,6 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, 
NvmeRequest *req)
return nvme_c2h(n, id, sizeof(id), req);
}

-static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns)
-{
-switch (ns->csi) {
-case NVME_CSI_NVM:
-case NVME_CSI_ZONED:
-return true;
-}
-return false;
-}
-
static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
{
trace_pci_nvme_identify_ctrl();
@@ -4244,7 +4234,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest 
*req, bool active)
}
}

-if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+if (active || ns->csi == NVME_CSI_NVM) {
return nvme_c2h(n, (uint8_t *)>id_ns, sizeof(NvmeIdNs), req);
}

@@ -4315,7 +4305,7 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, 
NvmeRequest *req,
}
}

-if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+if (c->csi == NVME_CSI_NVM) {
return nvme_rpt_empty_id_struct(n, req);
} else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) {
return nvme_c2h(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned),



Sure, will make changes and submit v2