On 17.09.2024 19:50, Peter Xu wrote:
On Tue, Sep 17, 2024 at 07:07:10PM +0200, Cédric Le Goater wrote:
[ ... ]
As a patch writer, I always like to do that when it's essential. Normally
the case is that I don't have enough reviewer resources to help me reach a
better design, or to discuss it.
Right, but we can't keep providing a moving target. See the thread pool
discussion for an example; it's hard to work that way. The discussion
here is similar: we introduced the union, now we're moving to the
struct. And you're right that the changes here are small, so let's not
get caught up in that.
What's your suggestion on the thread pool? Should we merge the change
where vfio creates the threads on its own (assuming vfio maintainers are ok
with it)?
I would say no, that's what I suggested. My suggestion was to start by
reusing ThreadPool; then we found an issue when Stefan raised a concern
about abusing the API. All these discussions seem sensible to me so far.
We can't rush these until we figure things out step by step; I don't see a
way around that.
I saw Cedric suggesting not to even create a thread on the recv side. I am
not sure whether that's easy, but I'd agree with Cedric if it is possible.
I think Maciej may have a point that it can block multifd threads, aka IO
threads, which might be unwanted.
Sorry if I am adding noise on this topic. I made this suggestion
because I spotted some asymmetry in the proposal.
The send and recv implementations in VFIO rely on different
interfaces with different levels of complexity. The send part uses
a set of multifd callbacks called from the multifd threads,
if I am correct, whereas the recv part is implemented directly
in VFIO, with local thread(s?) doing their own state-receive cookery.
Yeah, the send/recv sides are indeed not fully symmetrical in the case of
multifd. The recv side is more IO-driven: QEMU reacts based on what
it receives (which is encoded in the headers of the received packets).
The src side is more of a generic producer/consumer model where threads can
enqueue tasks/data to the different iochannels.
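To make that asymmetry concrete, here is a rough sketch (hypothetical,
simplified types and names, not the actual multifd code): the recv side
dispatches on whatever the packet header announces, while the send side is
just producers enqueuing chunks for whichever idle channel thread writes
them out.

#include <stdint.h>
#include <stdio.h>

/* Assumed wire header: a flags field tells the receiver what follows. */
typedef struct {
    uint32_t flags;            /* e.g. RAM pages vs. device state */
    uint32_t size;             /* payload size in bytes */
} PacketHdr;

enum {
    PKT_FLAG_RAM          = 1u << 0,
    PKT_FLAG_DEVICE_STATE = 1u << 1,
};

/* Recv side: IO-driven - react to whatever shows up on the channel. */
static int recv_handle_packet(const PacketHdr *hdr, const void *payload)
{
    (void)payload;             /* payload handling elided in this sketch */

    if (hdr->flags & PKT_FLAG_RAM) {
        /* apply RAM pages to guest memory */
        return 0;
    }
    if (hdr->flags & PKT_FLAG_DEVICE_STATE) {
        /* hand the chunk over to the device load path */
        return 0;
    }
    fprintf(stderr, "unexpected packet flags 0x%x\n",
            (unsigned)hdr->flags);
    return -1;
}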
Currently, the best case happens when both sides are I/O bound with respect
to the VFIO devices: reading device state from the source device as
fast as it can produce it, and loading device state into the target device
as fast as it can consume it.
These devices aren't normally seriously constrained by network bandwidth
here.
I was expecting a common interface to minimize assumptions on both
ends. It doesn't have to be callback based. It could be a set of
services a subsystem could use to transfer state in parallel.
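Something along these lines, purely as an illustration; the names and
signatures below are invented for this sketch, they are not an existing
QEMU API:

#include <stddef.h>
#include <stdint.h>

/* Send side: queue a chunk of device state; any free transfer channel
 * picks it up, and 'notify' is called once the chunk has been sent. */
int mig_parallel_send(const char *idstr, uint32_t instance_id,
                      const void *buf, size_t len,
                      void (*notify)(void *opaque, int ret), void *opaque);

/* Recv side: register a handler invoked (possibly from a transfer
 * thread) whenever a chunk for this device arrives. */
int mig_parallel_register_load(const char *idstr, uint32_t instance_id,
                               int (*load)(void *opaque, const void *buf,
                                           size_t len),
                               void *opaque);

/* Block until every chunk queued for this device has been transferred. */
int mig_parallel_flush(const char *idstr, uint32_t instance_id);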
<side note>
VFIO migration is driven by numerous callbacks and it is
difficult to understand the context in which these are called.
Adding more callbacks might not be the best approach.
</side note>
The other comment was on optimisation. If this is an optimisation,
then I would expect, first, a non-optimised version that doesn't use
threads (on the recv side).
As commented in a previous email, I had the feeling that Maciej wanted to
avoid blocking the multifd threads when applying VFIO data chunks to the
kernel driver; but Maciej, please correct me, I could be wrong.
Yes, we don't want loading device state into one VFIO device to block
loading such state into another VFIO device, which would leave the second
VFIO device partially idle during that time.
I think I'm fine even if it blocks multifd threads, as that will only
happen with VFIO (we may then want to base n_multifd_threads on the
number of VFIO devices, so we still always have some idle
threads taking IOs out of the NIC buffers).
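Something along the lines of this hypothetical rule of thumb (not an
existing knob, just to illustrate the idea):

/* Keep enough multifd threads that even if every VFIO device keeps one
 * busy while loading its state, some remain free to drain the NIC. */
static unsigned int multifd_threads_wanted(unsigned int base_threads,
                                           unsigned int nr_vfio_devices)
{
    return base_threads + nr_vfio_devices;
}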
The current design uses exactly one loading thread per VFIO device.
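For illustration, here is a minimal sketch of what such a per-device
loading thread could look like; hypothetical types and plain pthreads,
not the actual patchset code:

#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Chunk {
    struct Chunk *next;
    size_t len;
    unsigned char data[];
} Chunk;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    Chunk *head, *tail;
    bool eos;                  /* set by the recv path at end of stream */
} ChunkQueue;

/* Called from the multifd recv path: enqueue and return immediately. */
static void queue_push(ChunkQueue *q, Chunk *c)
{
    pthread_mutex_lock(&q->lock);
    c->next = NULL;
    if (q->tail) {
        q->tail->next = c;
    } else {
        q->head = c;
    }
    q->tail = c;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Per-device loading thread: blocks only on its own queue and its own
 * device fd, so a slow device cannot stall loading into another one. */
static void *device_load_thread(void *opaque)
{
    ChunkQueue *q = opaque;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (!q->head && !q->eos) {
            pthread_cond_wait(&q->cond, &q->lock);
        }
        Chunk *c = q->head;
        if (!c) {              /* end of stream and queue drained */
            pthread_mutex_unlock(&q->lock);
            break;
        }
        q->head = c->next;
        if (!q->head) {
            q->tail = NULL;
        }
        pthread_mutex_unlock(&q->lock);

        /* write c->data to this device's migration fd here */
        free(c);
    }
    return NULL;
}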
So I agree with Cedric that if we can provide a functional working version,
we can at least go with the simpler approach first.
VFIO migration is a "new" feature which needs some more running-in.
That said, it is stable, MLX5 VF devices have good support, and you
can rely on me to evaluate the future respins.
Thanks,
C.
Thanks,
Maciej