Peter Xu <pet...@redhat.com> writes: > On Tue, Jul 16, 2024 at 10:10:12PM +0200, Maciej S. Szmigiero wrote: >> On 27.06.2024 16:56, Peter Xu wrote: >> > On Thu, Jun 27, 2024 at 11:14:28AM +0200, Maciej S. Szmigiero wrote: >> > > On 26.06.2024 18:23, Peter Xu wrote: >> > > > On Wed, Jun 26, 2024 at 05:47:34PM +0200, Maciej S. Szmigiero wrote: >> > > > > On 26.06.2024 03:51, Peter Xu wrote: >> > > > > > On Wed, Jun 26, 2024 at 12:44:29AM +0200, Maciej S. Szmigiero >> > > > > > wrote: >> > > > > > > On 25.06.2024 19:25, Peter Xu wrote: >> > > > > > > > On Mon, Jun 24, 2024 at 09:51:18PM +0200, Maciej S. Szmigiero >> > > > > > > > wrote: >> > > > > > > > > Hi Peter, >> > > > > > > > >> > > > > > > > Hi, Maciej, >> > > > > > > > >> > > > > > > > > >> > > > > > > > > On 23.06.2024 22:27, Peter Xu wrote: >> > > > > > > > > > On Tue, Jun 18, 2024 at 06:12:18PM +0200, Maciej S. >> > > > > > > > > > Szmigiero wrote: >> > > > > > > > > > > From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com> >> > > > > > > > > > > >> > > > > > > > > > > This is an updated v1 patch series of the RFC (v0) >> > > > > > > > > > > series located here: >> > > > > > > > > > > https://lore.kernel.org/qemu-devel/cover.1713269378.git.maciej.szmigi...@oracle.com/ >> > > > > > > > > > >> > > > > > > > > > OK I took some hours thinking about this today, and here's >> > > > > > > > > > some high level >> > > > > > > > > > comments for this series. I'll start with which are more >> > > > > > > > > > relevant to what >> > > > > > > > > > Fabiano has already suggested in the other thread, then >> > > > > > > > > > I'll add some more. >> > > > > > > > > > >> > > > > > > > > > https://lore.kernel.org/r/20240620212111.29319-1-faro...@suse.de >> > > > > > > > > >> > > > > > > > > That's a long list, thanks for these comments. >> > > > > > > > > >> > > > > > > > > I have responded to them inline below. >> > > > > > > > > (..) >> > > > > > > >> > > > > > > 2) Submit this operation to the thread pool and wait for it to >> > > > > > > complete, >> > > > > > >> > > > > > VFIO doesn't need to have its own code waiting. If this pool is >> > > > > > for >> > > > > > migration purpose in general, qemu migration framework will need >> > > > > > to wait at >> > > > > > some point for all jobs to finish before moving on. Perhaps it >> > > > > > should be >> > > > > > at the end of the non-iterative session. >> > > > > >> > > > > So essentially, instead of calling save_live_complete_precopy_end >> > > > > handlers >> > > > > from the migration code you would like to hard-code its current VFIO >> > > > > implementation of calling >> > > > > vfio_save_complete_precopy_async_thread_thread_terminate(). >> > > > > >> > > > > Only it wouldn't be then called VFIO precopy async thread terminate >> > > > > but some >> > > > > generic device state async precopy thread terminate function. >> > > > >> > > > I don't understand what did you mean by "hard code". >> > > >> > > "Hard code" wasn't maybe the best expression here. >> > > >> > > I meant the move of the functionality that's provided by >> > > vfio_save_complete_precopy_async_thread_thread_terminate() in this patch >> > > set >> > > to the common migration code. >> > >> > I see. That function only does a thread_join() so far. >> > >> > So can I understand it as below [1] should work for us, and it'll be clean >> > too (with nothing to hard-code)? >> >> It will need some signal to the worker threads pool to terminate before >> waiting for them to finish (as the code in [1] just waits). >> >> In the case of current vfio_save_complete_precopy_async_thread() >> implementation, >> this signal isn't necessary as this thread simply terminates when it has read >> all the date it needs from the device. >> >> In a worker threads pool case there will be some threads waiting for >> jobs to be queued to them and so they will need to be somehow signaled >> to exit. > > Right. We may need something like multifd_send_should_exit() + > MultiFDSendParams.sem. It'll be nicer if we can generalize that part so > multifd threads can also rebase to that thread model, but maybe I'm asking > too much. > >> >> > The time to join() the worker threads can be even later, until >> > migrate_fd_cleanup() on sender side. You may have a better idea on when >> > would be the best place to do it when start working on it. >> > >> > > >> > > > What I was saying is if we target the worker thread pool to be used for >> > > > "concurrently dump vmstates", then it'll make sense to make sure all >> > > > the >> > > > jobs there were flushed after qemu dumps all non-iterables (because >> > > > this >> > > > should be the last step of the switchover). >> > > > >> > > > I expect it looks like this: >> > > > >> > > > while (pool->active_threads) { >> > > > qemu_sem_wait(&pool->job_done); >> > > > } >> > >> > [1] >> > >> (..) >> > > I think that with this thread pool introduction we'll unfortunately >> > > almost certainly >> > > need to target this patch set at 9.2, since these overall changes (and >> > > Fabiano >> > > patches too) will need good testing, might uncover some performance >> > > regressions >> > > (for example related to the number of buffers limit or Fabiano multifd >> > > changes), >> > > bring some review comments from other people, etc. >> > > >> > > In addition to that, we are in the middle of holiday season and a lot of >> > > people >> > > aren't available - like Fabiano said he will be available only in a few >> > > weeks. >> > >> > Right, that's unfortunate. Let's see, but still I really hope we can also >> > get some feedback from Fabiano before it lands, even with that we have >> > chance for 9.1 but it's just challenging, it's the same condition I >> > mentioned since the 1st email. And before Fabiano's back (he's the active >> > maintainer for this release), I'm personally happy if you can propose >> > something that can land earlier in this release partly. E.g., if you want >> > we can at least upstream Fabiano's idea first, or some more on top. >> > >> > For that, also feel to have a look at my comment today: >> > >> > https://lore.kernel.org/r/Zn15y693g0AkDbYD@x1n >> > >> > Feel free to comment there too. There's a tiny uncertainty there so far on >> > specifying "max size for a device state" if do what I suggested, as multifd >> > setup will need to allocate an enum buffer suitable for both ram + device. >> > But I think that's not an issue and you'll tackle that properly when >> > working on it. It's more about whether you agree on what I said as a >> > general concept. >> > >> >> Since it seems that the discussion on Fabiano's patch set has subsided I >> think >> I will start by basing my updated patch set on top of his RFC and then if >> Fabiano wants to submit v1/v2 of his patch set then I will rebase mine on top >> of it. >> >> Otherwise, you can wait until I have a v2 ready and then we can work with >> that. > > Oh I thought you already started modifying his patchset. > > In this case, AFAIR Fabiano has plan to rework that RFC series, so maybe > you want to double check with him, and can also wait for his new version if > that's easier, because I do expect there'll be major changes. > > Fabiano?
Don't wait on me. I think I can make the changes Peter suggested without affecting too much the interfaces used by this series. If it comes to it, I can rebase this series "under" Maciej's.