On Tue, Mar 1, 2022 at 12:57 AM Peter Xu <pet...@redhat.com> wrote:
>
> On Fri, Feb 18, 2022 at 05:57:13PM +0100, Juan Quintela wrote:
> > I did a change on:
> >
> > commit d48c3a044537689866fe44e65d24c7d39a68868a
> > Author: Juan Quintela <quint...@redhat.com>
> > Date: Fri Nov 19 15:35:58 2021 +0100
> >
> > multifd: Use a single writev on the send side
> >
> > Until now, we wrote the packet header with write(), and the rest of the
> > pages with writev(). Just increase the size of the iovec and do a
> > single writev().
> >
> > Signed-off-by: Juan Quintela <quint...@redhat.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >
> > And now we need to "perserve" this header until we do the sync,
> > otherwise we are overwritting it with other things.
> >
> > What testing have you done after this commit?
> >
> > Notice that it is not _complicated_ to fix it, I will try to come with
> > some idea on monday, but basically is having an array of buffers for
> > each thread, and store them until we call a sync().
>
> Or can we conditionally merge the two write()s? IMHO the array of buffers
> idea sounds too complicated, and I'm not extremely sure whether it'll pay
> off at last. We could keep the two write()s with ZEROCOPY enabled, and use
> the merged version otherwise.

I think that's a great idea!
It would optimize the non-zerocopy version while letting us have a simpler
zerocopy implementation.
The array-of-buffers implementation would require us either to keep a
'large' amount of memory for the headers or to flush too often.

>
> Btw, is there any performance measurements for above commit d48c3a044537?
> I had a feeling that the single write() may not help that much, because for
> multifd the bottleneck should be on the nic not on the processor.

I am quite curious about those numbers too.

>
> IOW, we could find that the major time used does not fall into the
> user<->kernel switches (which is where the extra overhead of write()
> syscall, iiuc), but we simply blocked on any of the write()s because the
> socket write buffer is full... So we could have saved some cpu cycles by
> merging the calls, but performance-wise we may not get much.
>
> Thanks,
>
> --
> Peter Xu
>

Thanks Peter!
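
To make the conditional merge concrete, here is a rough sketch of what I
understand the suggestion to be. It is not the actual multifd code:
send_packet(), MAX_PAGES and the zerocopy argument are names made up for
illustration, and the plain writev() in the zerocopy branch is only a
stand-in for whatever zerocopy send (e.g. sendmsg() with MSG_ZEROCOPY) the
real path would use. Short writes are not retried, to keep it small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_PAGES 16 /* hypothetical cap, for illustration only */

/*
 * Hypothetical helper (not QEMU's API): send one packet header plus pages.
 * Returns the total number of bytes written, or -1 on error.
 */
static ssize_t send_packet(int fd, const void *hdr, size_t hdr_len,
                           const struct iovec *pages, int npages,
                           bool zerocopy)
{
    assert(hdr != NULL);
    if (npages < 0 || npages > MAX_PAGES) {
        return -1;
    }

    if (zerocopy) {
        /*
         * Two calls: the header goes through a normal, copying write(),
         * so the header buffer can be reused as soon as write() returns.
         */
        ssize_t h = write(fd, hdr, hdr_len);
        if (h < 0) {
            return -1;
        }
        /* Stand-in for a zerocopy send of the pages only (assumption). */
        ssize_t p = npages > 0 ? writev(fd, pages, npages) : 0;
        if (p < 0) {
            return -1;
        }
        return h + p;
    }

    /* Non-zerocopy: header and pages merged into a single writev(). */
    struct iovec iov[MAX_PAGES + 1];
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = hdr_len;
    if (npages > 0) {
        memcpy(&iov[1], pages, (size_t)npages * sizeof(*pages));
    }
    return writev(fd, iov, npages + 1);
}
```

Both branches put the same bytes on the wire; the only difference is that
the zerocopy branch never hands the header buffer to the zerocopy send, so
there is nothing to preserve until the sync.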