On Tue, 2022-10-25 at 11:51 +0200, Juan Quintela wrote:
> Leonardo Bras <leob...@redhat.com> wrote:
> > When zero-copy-send is enabled, each loop iteration of the
> > multifd_send_thread will call qio_channel_write_*() twice: the first
> > one for sending the header without the zero-copy flag and the second
> > one for sending the memory pages, with the zero-copy flag enabled.
> >
> > This ends up calling two syscalls per loop iteration, where one
> > should be enough.
> >
> > Also, since the behavior for non-zero-copy write is synchronous, and
> > the behavior for zero-copy write is asynchronous, it ends up
> > interleaving synchronous and asynchronous writes, hurting performance
> > that could otherwise be improved.
> >
> > The choice of sending the header without the zero-copy flag in a
> > separate write happened because the header memory area would be
> > reused in the next write, so it was almost certain to have changed
> > before the kernel could send the packet.
> >
> > To send the packet with zero-copy, create an array of header areas
> > instead of a single one, and use a different header area after each
> > write. Also, flush the sending queue after all the headers have been
> > used.
> >
> > To avoid adding a zero-copy conditional in multifd_send_fill_packet(),
> > add a packet parameter (the packet that should be filled). This way
> > it's simpler to pick which element of the array will be used as a
> > header.
> >
> > Suggested-by: Juan Quintela <quint...@redhat.com>
> > Signed-off-by: Leonardo Bras <leob...@redhat.com>
>
> > +        if (use_zero_copy_send) {
> > +            p->packet_idx = (p->packet_idx + 1) % HEADER_ARR_SZ;
> > +
> > +            if (!p->packet_idx && (multifd_zero_copy_flush(p->c) < 0)) {
> > +                break;
> > +            }
> > +            header = (void *)p->packet + p->packet_idx * p->packet_len;
>
> Isn't this equivalent to?
>
> header = &(p->packet[p->packet_idx]);
So, that's the thing: MultiFDPacket_t does contain a flexible array member
(offset[]) at the end of the struct, which is used together with dynamic
allocation to create the perception of a variable-size struct. This is used
to make sure the packet will be able to carry the offset for each memory
page getting sent, independent of its size:

multifd_save_setup()
{
    uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
    [...]
    p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count;
    p->packet = g_malloc0(p->packet_len);
    [...]
}

So the size of the header is actually p->packet_len, and not
sizeof(MultiFDPacket_t). This means default pointer arithmetic does not work
for this case: header = &(p->packet[p->packet_idx]) would point to the
wrong place in memory, since it does not take the actual size of the
offset[] array into account. (There is a rough standalone sketch of this
layout at the end of this mail.)

> >      for (i = 0; i < thread_count; i++) {
> >          MultiFDSendParams *p = &multifd_send_state->params[i];
> > +        int j;
>
> For new code you can:
>
> >          qemu_mutex_init(&p->mutex);
> >          qemu_sem_init(&p->sem, 0);
> > @@ -940,9 +940,13 @@ int multifd_save_setup(Error **errp)
> >          p->pages = multifd_pages_init(page_count);
> >          p->packet_len = sizeof(MultiFDPacket_t)
> >                        + sizeof(uint64_t) * page_count;
> > -        p->packet = g_malloc0(p->packet_len);
> > -        p->packet->magic = cpu_to_be32(MULTIFD_MAGIC);
> > -        p->packet->version = cpu_to_be32(MULTIFD_VERSION);
> > +        p->packet = g_malloc0_n(HEADER_ARR_SZ, p->packet_len);
> > +        for (j = 0; j < HEADER_ARR_SZ ; j++) {
>
> for (int j = 0; j < HEADER_ARR_SZ ; j++) {

Oh, sweet. I am very fond of C99, just not used to it getting accepted
upstream. Thanks for the tip :)

> > +            MultiFDPacket_t *packet = (void *)p->packet + j * p->packet_len;
> > +            packet->magic = cpu_to_be32(MULTIFD_MAGIC);
> > +            packet->version = cpu_to_be32(MULTIFD_VERSION);
>
> Can't you use here:
>
> packet[j].magic = cpu_to_be32(MULTIFD_MAGIC);
> packet[j].version = cpu_to_be32(MULTIFD_VERSION);
>
> And call it a day?

Not really: the actual size of each element is p->packet_len, not
sizeof(MultiFDPacket_t), so indexing with packet[j] does not land on the
right header. See above.

> The rest is fine for me. Thanks for the effort.

Thanks for reviewing, Juan!

Leo

> Later, Juan.
>
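
P.S.: In case it helps to visualize the layout, below is a rough standalone
sketch (not QEMU code). HEADER_ARR_SZ, the flexible array member and the
byte-offset indexing follow the patch, but the trimmed-down struct, the page
count, the placeholder magic values and calloc() standing in for
g_malloc0_n() are only illustrative assumptions:

/* Standalone sketch of the header-array layout discussed above. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define HEADER_ARR_SZ 4                /* headers kept in flight */

typedef struct {
    uint32_t magic;
    uint32_t version;
    uint64_t offset[];                 /* flexible array member */
} MultiFDPacket_t;                     /* trimmed-down stand-in */

int main(void)
{
    uint32_t page_count = 128;         /* stand-in for MULTIFD_PACKET_SIZE / page size */

    /* Real size of one header: fixed part plus one offset per page. */
    size_t packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count;

    /* One contiguous, zeroed allocation holding HEADER_ARR_SZ headers. */
    MultiFDPacket_t *packet = calloc(HEADER_ARR_SZ, packet_len);

    for (int j = 0; j < HEADER_ARR_SZ; j++) {
        /*
         * &packet[j] would only advance by sizeof(MultiFDPacket_t) and land
         * inside the previous header's offset[] array, so step by
         * packet_len bytes instead.
         */
        MultiFDPacket_t *hdr = (MultiFDPacket_t *)((char *)packet + j * packet_len);
        hdr->magic = 0x11223344;       /* placeholder for cpu_to_be32(MULTIFD_MAGIC) */
        hdr->version = 1;              /* placeholder for cpu_to_be32(MULTIFD_VERSION) */
    }

    printf("sizeof(MultiFDPacket_t) = %zu, packet_len = %zu\n",
           sizeof(MultiFDPacket_t), packet_len);

    free(packet);
    return 0;
}

The point of rotating through HEADER_ARR_SZ headers is that a given header
area is only reused after p->packet_idx wraps back to 0, which is exactly
when the patch calls multifd_zero_copy_flush(), so the kernel is done
reading it before it gets overwritten.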