On Wed, Mar 04, 2026 at 10:43:53AM +0100, Alexander Mikhalitsyn wrote:
> Am Mo., 2. März 2026 um 22:47 Uhr schrieb Peter Xu <[email protected]>:
> >
> > On Tue, Feb 17, 2026 at 04:25:16PM +0100, Alexander Mikhalitsyn wrote:
> > > From: Alexander Mikhalitsyn <[email protected]>
> > >
> > > Add VMSTATE_VARRAY_OF_POINTER_TO_STRUCT_ALLOC, which
> > > helps to save/restore a dynamic array of pointers to
> > > structures.
> 
> Dear Peter,
> 
> >
> > Sorry for the late response.
> 
> No problem at all, thanks for reviewing ;)
> 
> >
> > >
> > > Signed-off-by: Alexander Mikhalitsyn <[email protected]>
> > > ---
> > >  include/migration/vmstate.h | 21 +++++++++
> > >  migration/vmstate-types.c   | 88 +++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 109 insertions(+)
> > >
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index 89f9f49d20a..bc6495a7f67 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -265,6 +265,7 @@ extern const VMStateInfo vmstate_info_bitmap;
> > >  extern const VMStateInfo vmstate_info_qtailq;
> > >  extern const VMStateInfo vmstate_info_gtree;
> > >  extern const VMStateInfo vmstate_info_qlist;
> > > +extern const VMStateInfo vmstate_info_ptrs_array_entry;
> > >
> > >  #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0)
> > >  /*
> > > @@ -537,6 +538,26 @@ extern const VMStateInfo vmstate_info_qlist;
> > >      .offset     = vmstate_offset_array(_s, _f, _type*, _n),          \
> > >  }
> > >
> > > +/*
> > > + * For migrating a dynamically allocated uint32-indexed array
> > > + * of pointers to structures (with NULL entries and with auto memory allocation).
> > > + *
> > > + * _type: type of structure pointed to
> > > + * _vmsd: VMSD for structure
> > > + * start: size of structure pointed to (for auto memory allocation)
> > > + */
> > > +#define VMSTATE_VARRAY_OF_POINTER_TO_STRUCT_ALLOC(_field, _state, _field_num, _version, _vmsd, _type) { \
> > > +    .name       = (stringify(_field)),                                \
> > > +    .version_id = (_version),                                         \
> > > +    .num_offset = vmstate_offset_value(_state, _field_num, uint32_t), \
> > > +    .info       = &vmstate_info_ptrs_array_entry,                     \
> >
> > I wonder if we need this hard-coded new info structure for this, more
> > below.
> >
> > > +    .vmsd       = &(_vmsd),                                           \
> > > +    .start      = sizeof(_type),                                      \
> > > +    .size       = sizeof(_type *),                                    \
> > > +    .flags      = VMS_VARRAY_UINT32|VMS_POINTER,                      \
> > > +    .offset     = vmstate_offset_pointer(_state, _field, _type *),    \
> > > +}
> > > +
> > >  #define VMSTATE_VARRAY_OF_POINTER_UINT32(_field, _state, _field_num, _version, _info, _type) { \
> > >      .name       = (stringify(_field)),                                \
> > >      .version_id = (_version),                                         \
> > > diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
> > > index 89cb2114721..3335377cd07 100644
> > > --- a/migration/vmstate-types.c
> > > +++ b/migration/vmstate-types.c
> > > @@ -942,3 +942,91 @@ const VMStateInfo vmstate_info_qlist = {
> > >      .get  = get_qlist,
> > >      .put  = put_qlist,
> > >  };
> > > +
> > > +static int put_ptrs_array_entry(QEMUFile *f, void *ppv, size_t unused_size,
> > > +                                const VMStateField *field, JSONWriter *vmdesc)
> > > +{
> > > +    const VMStateDescription *vmsd = field->vmsd;
> > > +    int ret;
> > > +    Error *local_err = NULL;
> > > +    void *pv;
> > > +
> > > +    /*
> > > +     * (ppv) is the address of the i-th element of a dynamic array.
> > > +     *
> > > +     * (ppv) cannot be NULL unless we have some regression/bug in
> > > +     * vmstate_save_state_v(), because it is the result of pointer
> > > +     * arithmetic like: first_elem + size * i.
> > > +     */
> > > +    if (ppv == NULL) {
> > > +        error_report("vmstate: put_ptrs_array_entry must be called with ppv != NULL");
> > > +        return -EINVAL;
> > > +    }
> > > +
> > > +    /* get a pointer to a structure */
> > > +    pv = *(void **)ppv;
> >
> > Here IIUC it's almost what VMS_ARRAY_OF_POINTER should do.  Could we
> > declare the vmsd with VMS_ARRAY_OF_POINTER so as to save this open-coded
> > dereference of pointer?
> 
> Yes, this is something I tried to do in the first place, but this
> didn't work for my use-case.
> 
> Let me explain what I'm doing in detail. I have the following structure:
> 
> typedef struct NvmeCtrl {
>      ...
>      uint32_t num_queues; // < MAX number of elements in sq
>      NvmeSQueue **sq;
>      ...
> } NvmeCtrl;
> 
> Suppose we have variable n with type NvmeCtrl *.
> 
> // somewhere early during initialization of QEMU device:
> n->sq = g_new0(NvmeSQueue *, n->num_queues);
> 
> // then at runtime (depending on what the *guest* wants) we might
> // create some sq[i], e.g.:
> NvmeSQueue *sq;
> sq = g_malloc0(sizeof(*sq));
> n->sq[3] = sq;
> 
> Please note that n->sq[0], n->sq[1], n->sq[2] can still be NULL; we
> allow holes in that array.
> 
> What I found is that if I raise the VMS_ARRAY_OF_POINTER flag on a
> VMStateField, then the migration code in vmstate_load_state() will expect
> to see a so-called "fake nullptr field" (see the
> vmsd_create_fake_nullptr_field() function).
> 
> But this is a problem, because our array at the time of
> vmstate_load_state() looks like (N == NULL):
> 0    1    2    3    4    5
> N    N    N    N    N    N

Yes, I actually just noticed that the old nullptr trick only works if the
dest QEMU already knows whether the pointer is NULL... that makes the
interface pretty much useless in most cases.. :(

So it's not the source telling the dest "there's a null pointer in the
array"; it's just that both sides already know that.  Checking the original
patch, it indeed mentions it's used for ChannelSubSys.css in hw/s390x/css.c:

https://lore.kernel.org/all/[email protected]/

So I guess the qemu cmdline for that s390 guest makes sure the pointers
are pre-initialized already.  So yes, it won't work for you.  You're
looking for a fully dynamic array of pointers.

> 
> while after vmstate_load_state() I expect it to be:
> 0    1    2    3    4    5
> N    N    N   SQ    N    N
> 
> So, somebody has to allocate memory for sq[3] and load the data from the
> image into that memory.
> 
> Probably, it makes sense to somehow implement a combination of
> VMS_ARRAY_OF_POINTER and VMS_ALLOC, but I tried to be as non-invasive
> as I could and decided to introduce a new kind of VMStateField instead.

Yes.  We already use VMS_ALLOC for top-level allocation.  We may need
another flag for per-array-element allocation.
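
To make that concrete, here is a minimal self-contained sketch (plain C,
completely outside of QEMU; all names are made up for illustration, not
QEMU API) of the semantics being discussed: one presence byte per element
on save, and per-element allocation on load only where the marker says the
element is present, so holes survive migration:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative element type, standing in for NvmeSQueue. */
typedef struct SQueue {
    uint32_t head;
    uint32_t tail;
} SQueue;

/* Save side: one presence byte per element, payload only when non-NULL. */
size_t save_array(uint8_t *buf, SQueue **arr, uint32_t n)
{
    size_t off = 0;

    for (uint32_t i = 0; i < n; i++) {
        buf[off++] = (arr[i] != NULL);          /* presence marker */
        if (arr[i]) {
            memcpy(buf + off, arr[i], sizeof(SQueue));
            off += sizeof(SQueue);
        }
    }
    return off;
}

/* Load side: allocate element memory only where the marker is set. */
void load_array(const uint8_t *buf, SQueue **arr, uint32_t n)
{
    size_t off = 0;

    for (uint32_t i = 0; i < n; i++) {
        if (buf[off++]) {
            arr[i] = calloc(1, sizeof(SQueue)); /* per-element allocation */
            memcpy(arr[i], buf + off, sizeof(SQueue));
            off += sizeof(SQueue);
        } else {
            arr[i] = NULL;                      /* keep the hole */
        }
    }
}
```

With only src[3] populated on the source, loading into a freshly zeroed
dest array reproduces the same single non-NULL element and the same holes,
which is exactly what the n->sq case above needs.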

> 
> >
> > > +
> > > +    if (pv == NULL) {
> > > +        /* write a mark telling that there was a NULL pointer */
> > > +        qemu_put_byte(f, false);
> > > +        return 0;
> > > +    }
> > > +
> > > +    /* if pointer is not NULL, dump the structure contents with help of vmsd */
> > > +    qemu_put_byte(f, true);
> > > +    ret = vmstate_save_state(f, vmsd, pv, vmdesc, &local_err);
> >
> > This looks really like VMS_STRUCT, except that it also tries to detect
> > whether the pointer is NULL before dumping.
> 
> Precisely, I'm mimicking behavior of VMS_STRUCT | VMS_ALLOC and
> VMS_ARRAY_OF_POINTER here.
> 
> >
> > Is it feasible to make both queues always available so as to reuse
> > VMS_STRUCT?  Or is it intentional here to avoid dumping queues where the
> > pointer is NULL?
> 
> By "available" you mean to always allocate memory for all n->sq[i] elements?
> 
> >
> > Even if it's the latter, I wonder if we can find a better way to do it.
> > For example, if we can integrate this directly into
> > vmstate_save/load_state() to support some new flag,
> > VMS_ARRAY_OF_POINTER_DYNAMIC, then we can provide a more "official" way
> > to say that some pointer of a dynamic array is NULL.
> >
> 
> Yeah, I thought about this too, and I believe this is a great idea.  But
> taking into account that this is my first contribution to QEMU and I'm
> still learning the code, I tried to be very conservative and not change
> any generic code ;-)
> 
> I'm happy to do this work in -v3 if you would like me to and can guide
> me a bit (I have just sent a v2 series, but there were no changes in the
> migration API part, only NVMe specifics).

Yes, it would be nice if we can build on a good base for this part of the
work.  The problem with your proposal is that even though it doesn't touch
the vmstate core, we will then need to maintain that special .info struct,
which is also a burden.  So maybe it's better to do the best we can come
up with now.

I changed the name of the new flag slightly; DYNAMIC might be fine, but I
wonder if we can choose something clearer about what it does.

Maybe VMS_ARRAY_OF_POINTER_ALLOW_NULL? It should at least imply two things:

(1) The array being migrated can contain NULL pointers anywhere in the
    elements.  We will need to change the nullptr trick a bit for this: we
    must send a byte (even if it's non-NULL) beforehand saying if this
    element is NULL.  We could reuse 0x30 for NULL, but we need e.g. 0x31
    to say it's non-NULL.  Then this will work even if dest QEMU doesn't
    know which pointer is NULL (unlike s390's css).

(2) If it's non-NULL, dest QEMU allocates the array element's memory while
    loading the vmsd.  It plays a role similar to VMS_ALLOC, but for the
    array elements.  We could use a separate flag, but I don't see a case
    where this flag could be used without also being able to migrate NULL
    elements, so maybe we can stick with one flag until we see another
    use case.

The new flag must only be used together with VMS_ARRAY_OF_POINTER, for
sure.  It should likely also verify that each pointer on dest is NULL
before allocating that array element, in case we leak things.
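
As a hedged illustration of points (1) and (2) plus that leak check (again
plain C outside of QEMU; the marker values follow the 0x30/0x31 suggestion
above, everything else is made up for the sketch):

```c
#include <stdlib.h>

/* Marker bytes: reuse 0x30 for NULL, 0x31 for "payload follows". */
#define NULLPTR_MARKER  0x30
#define NONNULL_MARKER  0x31

/*
 * Load one array element given its marker byte.  The dest slot must be
 * NULL on entry: refusing to overwrite a populated slot is the leak
 * check mentioned above.
 */
int load_elem(unsigned char marker, void **slot, size_t elem_size)
{
    if (*slot != NULL) {
        return -1;                  /* would leak the old element */
    }
    switch (marker) {
    case NULLPTR_MARKER:
        return 0;                   /* keep the hole: *slot stays NULL */
    case NONNULL_MARKER:
        *slot = calloc(1, elem_size);
        return *slot ? 0 : -1;      /* payload would then be loaded here */
    default:
        return -1;                  /* corrupt stream */
    }
}
```

Since a byte is sent for every element (not just NULL ones), the dest no
longer has to know in advance which pointers are NULL, unlike the s390 css
case.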

We will also need some coverage in test-vmstate.c for this.

It would be great if you would volunteer for this piece of work; I will
try to help however I can with reviews.  You can also consider doing it
separately, so that migration can collect it in advance when it's ready
on its own (given that it'll be 100% useful here for nvme very soon).

But I know it might be too much to ask, so if you want I can also take a
look at this problem.  Let me know.

Thanks,

-- 
Peter Xu
