On Fri, Feb 02, 2024 at 06:08:22PM -0300, Fabiano Rosas wrote: > pet...@redhat.com writes: > > > From: Peter Xu <pet...@redhat.com> > > > > As reported correctly by Fabiano [1], MultiFDSendParams.packet_num is buggy > > to be assigned and stored. Consider two consequent operations of: (1) > > queue a job into multifd send thread X, then (2) queue another sync request > > to the same send thread X. Then the MultiFDSendParams.packet_num will be > > assigned twice, and the first assignment can get lost already. > > > > To avoid that, we move the packet_num assignment from p->packet_num into > > where the thread will fill in the packet. Use atomic operations to protect > > the field, making sure there's no race. > > > > Note that atomic fetch_add() may not be good for scaling purposes, however > > multifd should be fine as number of threads should normally not go beyond > > 16 threads. Let's leave that concern for later but fix the issue first. > > > > There's also a trick on how to make it always work even on 32 bit hosts for > > uint64_t packet number. Switching to uintptr_t as of now to simply the > > case. It will cause packet number to overflow easier on 32 bit, but that > > shouldn't be a major concern for now as 32 bit systems is not the major > > audience for any performance concerns like what multifd wants to address. > > > > We also need to move multifd_send_state definition upper, so that > > multifd_send_fill_packet() can reference it. > > > > [1] https://lore.kernel.org/r/87o7d1jlu5....@suse.de > > > > Reported-by: Fabiano Rosas <faro...@suse.de> > > Signed-off-by: Peter Xu <pet...@redhat.com> > > Elena had reported this in October already. > > Reported-by: Elena Ufimtseva <elena.ufimts...@oracle.com>
Ah, I'll do the replacement. > Reviewed-by: Fabiano Rosas <faro...@suse.de> Thanks, -- Peter Xu