On Tue, Mar 1, 2022 at 12:57 AM Peter Xu <pet...@redhat.com> wrote:
>
> On Fri, Feb 18, 2022 at 05:57:13PM +0100, Juan Quintela wrote:
> > I did a change on:
> >
> > commit d48c3a044537689866fe44e65d24c7d39a68868a
> > Author: Juan Quintela <quint...@redhat.com>
> > Date:   Fri Nov 19 15:35:58 2021 +0100
> >
> >     multifd: Use a single writev on the send side
> >
> >     Until now, we wrote the packet header with write(), and the rest of the
> >     pages with writev().  Just increase the size of the iovec and do a
> >     single writev().
> >
> >     Signed-off-by: Juan Quintela <quint...@redhat.com>
> >     Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >
> > And now we need to "preserve" this header until we do the sync,
> > otherwise we are overwriting it with other things.
> >
> > What testing have you done after this commit?
> >
> > Notice that it is not _complicated_ to fix it, I will try to come up with
> > some idea on Monday, but basically it means having an array of buffers for
> > each thread, and storing them until we call a sync().
>
> Or can we conditionally merge the two write()s?  IMHO the array of buffers
> idea sounds too complicated, and I'm not extremely sure whether it'll pay
> off in the end.  We could keep the two write()s with ZEROCOPY enabled, and use
> the merged version otherwise.

I think that's a great idea!
It would optimize the non-zerocopy version while letting us have a
simpler zerocopy implementation.
The array-of-buffers implementation would require us either to keep a
'large' amount of memory around for the headers, or to flush too
often.
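
To make sure we mean the same thing, here is a rough sketch of the
conditional merge. It is not the actual multifd code: send_packet(),
use_zerocopy and MAX_PAGES are hypothetical names, short writes are not
retried, and a plain writev() stands in for the real zerocopy send
(which would presumably be a sendmsg() with MSG_ZEROCOPY).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_PAGES 128 /* hypothetical per-packet page limit */

/*
 * Hypothetical helper, not QEMU's real multifd send path.
 * Returns the total number of bytes written, or -1 on error.
 */
static ssize_t send_packet(int fd, const void *hdr, size_t hdr_len,
                           const struct iovec *pages, int npages,
                           bool use_zerocopy)
{
    struct iovec iov[MAX_PAGES + 1];
    ssize_t n_hdr, n_pages;

    assert(npages >= 0 && npages <= MAX_PAGES);

    if (!use_zerocopy) {
        /* Merged path: header and pages go out in a single writev(). */
        iov[0].iov_base = (void *)hdr;
        iov[0].iov_len = hdr_len;
        memcpy(&iov[1], pages, npages * sizeof(*pages));
        return writev(fd, iov, npages + 1);
    }

    /*
     * Zerocopy path: the header is sent with a copying write(), so its
     * buffer can be reused right away without waiting for a sync.
     */
    n_hdr = write(fd, hdr, hdr_len);
    if (n_hdr < 0) {
        return -1;
    }

    /* Stand-in for the zerocopy send of the page data. */
    n_pages = writev(fd, pages, npages);
    if (n_pages < 0) {
        return -1;
    }

    return n_hdr + n_pages;
}
```

Both paths put the same bytes on the wire in the same order; only the
number of syscalls and the lifetime requirement on the header buffer
differ.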

>
> Btw, are there any performance measurements for the above commit d48c3a044537?
> I had a feeling that the single write() may not help that much, because for
> multifd the bottleneck should be on the nic not on the processor.

I am quite curious about those numbers too.

>
> IOW, we could find that most of the time spent does not fall into the
> user<->kernel switches (which is where the extra overhead of the write()
> syscall comes from, iiuc), but we simply blocked on any of the write()s because the
> socket write buffer is full...  So we could have saved some cpu cycles by
> merging the calls, but performance-wise we may not get much.
>
> Thanks,
>
> --
> Peter Xu
>

Thanks Peter!

