On Thu, Jul 18, 2024 at 07:32:05PM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
> 
> > On Thu, Jul 18, 2024 at 06:27:32PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <pet...@redhat.com> writes:
> >> 
> >> > On Thu, Jul 18, 2024 at 04:39:00PM -0300, Fabiano Rosas wrote:
> >> >> v2 is ready, but unfortunately this approach doesn't work. When client A
> >> >> takes the payload, it fills it with its data, which may include
> >> >> allocating memory. MultiFDPages_t does that for the offset. This means
> >> >> we need a round of free/malloc at every packet sent. For every client
> >> >> and every allocation they decide to do.
> >> >
> >> > Shouldn't be a blocker?  E.g. one option is:
> >> >
> >> >     /* Allocate both the pages + offset[] */
> >> >     MultiFDPages_t *pages = g_malloc0(sizeof(MultiFDPages_t) +
> >> >                                       sizeof(ram_addr_t) * n);
> >> >     pages->allocated = n;
> >> >     pages->offset = (ram_addr_t *)&pages[1];
> >> >
> >> > Or.. we can also make offset[] dynamic size, if that looks less tricky:
> >> >
> >> > typedef struct {
> >> >     /* number of used pages */
> >> >     uint32_t num;
> >> >     /* number of normal pages */
> >> >     uint32_t normal_num;
> >> >     /* number of allocated pages */
> >> >     uint32_t allocated;
> >> >     RAMBlock *block;
> >> >     /* offset of each page */
> >> >     ram_addr_t offset[0];
> >> > } MultiFDPages_t;
> >> 
> >> I think you missed the point. If we hold a pointer inside the payload,
> >> we lose the reference when the other client takes the structure and puts
> >> its own data there. So we'll need to alloc/free every time we send a
> >> packet.
> >
> > For option 1: when the buffer switch happens, MultiFDPages_t will switch as
> > a whole, including its offset[], because its offset[] always belongs to this
> > MultiFDPages_t.  So yes, we want to lose that *offset reference together
> > with MultiFDPages_t here, so the offset[] always belongs to one single
> > MultiFDPages_t object for its lifetime.
> 
> MultiFDPages_t is part of MultiFDSendData; it doesn't get allocated
> individually:
> 
> struct MultiFDSendData {
>     MultiFDPayloadType type;
>     union {
>         MultiFDPages_t ram_payload;
>     } u;
> };
> 
> (and even if it did, then we'd lose the pointer to ram_payload anyway -
> or require multiple free/alloc)

IMHO it's the same.

The core idea is that we allocate one buffer to hold a MultiFDSendData,
which may contain either Pages_t or DeviceState_t, and the size of the
buffer should be MAX(A, B).
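
Roughly something like below (only a sketch; MultiFDDeviceState_t is a
placeholder name for whatever the device payload ends up being called).
The union already makes the compiler pick MAX(A, B) for us, so one
allocation fits whichever payload the client fills in:

  struct MultiFDSendData {
      MultiFDPayloadType type;
      union {
          MultiFDPages_t ram_payload;
          MultiFDDeviceState_t device_payload;  /* placeholder name */
      } u;
  };

  /*
   * sizeof(data->u) == MAX(sizeof(MultiFDPages_t),
   *                        sizeof(MultiFDDeviceState_t)),
   * so a single g_malloc0() covers both cases.
   */
  struct MultiFDSendData *data = g_malloc0(sizeof(*data));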

> 
> >
> > For option 2: I meant MultiFDPages_t will have no offset[] pointer anymore,
> > but make it part of the struct (MultiFDPages_t.offset[]).  Logically it's
> > the same as option 1 but maybe slightly cleaner.  We just need to make it
> > sized 0 so as to be dynamic in size.
> 
> Seems like an undefined behavior magnet. If I sent this as the first
> version, you'd NACK me right away.
> 
> Besides, it's an unnecessary restriction to impose on the client
> code. And like above, we don't allocate the struct directly, it's part
> of MultiFDSendData, that's an advantage of using the union.
> 
> I think we've reached the point where I'd like to hear more concrete
> reasons for not going with the current proposal, except for the
> simplicity argument you already made. I like the union idea, but OTOH we
> already have a working solution right here.

I think the issue with the current proposal is that each client will need
to allocate (N+1)*buffer, so the more users we have, the more buffers we'll
need (M users means M*(N+1)*buffer).  Currently it seems to me we will have
at least 3 users: RAM, VFIO, and some other VMSD devices TBD in the mid to
long term; the latter two will share the same DeviceState_t.  Maybe vDPA as
well at some point?  Then 4.

I'd agree with this approach only if multifd were flexible enough to not
even know what the buffers are, but that's not the case, and we only seem
to care about two:

  if (type == RAM) {
      ...
  } else {
      assert(type == DEVICE);
      ...
  }

In this case I think it's easier if we have multifd manage all the buffers
(after all, it knows them well...).  Then the consumption is not
M*(N+1)*buffer, but (M+N)*buffer.
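
For example, assuming N=8 channels and M=3 clients, that's 3*(8+1)=27
buffers with the current proposal versus 3+8=11 if multifd owns them.
Very roughly, something like below (a sketch only; the names are made up,
not actual multifd code): enqueueing becomes a pointer swap instead of a
free/malloc per packet:

  /*
   * Each channel keeps one idle buffer; the client hands over a filled
   * one and takes the idle one back, so nothing gets reallocated on the
   * hot path.
   */
  static struct MultiFDSendData *
  multifd_enqueue(MultiFDSendChannel *ch, struct MultiFDSendData *filled)
  {
      struct MultiFDSendData *idle = ch->data;  /* channel's spare buffer */

      ch->data = filled;        /* channel now owns the filled payload */
      return idle;              /* caller reuses the spare one */
  }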

Perhaps push your tree somewhere so we can have a quick look?  I'm totally
lost on why you said I'd NACK it... maybe I didn't really get what you
meant.  The code may clarify that.

-- 
Peter Xu

