Daniel P. Berrangé <berra...@redhat.com> wrote:
> On Thu, May 18, 2023 at 09:13:58AM +0000, Wang, Wei W wrote:
>> On Thursday, May 18, 2023 4:52 PM, Wang, Lei4 wrote:
>> > When destination VM is launched, the "backlog" parameter for listen() is set
>> > to 1 as default in socket_start_incoming_migration_internal(), which will
>> > lead to socket connection error (the queue of pending connections is full)
>> > when "multifd" and "multifd-channels" are set later on and a high number of
>> > channels are used. Set it to a hard-coded higher default value 512 to fix this
>> > issue.
>> >
>> > Reported-by: Wei Wang <wei.w.w...@intel.com>
>> > Signed-off-by: Lei Wang <lei4.w...@intel.com>
>> > ---
>> >  migration/socket.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/migration/socket.c b/migration/socket.c
>> > index 1b6f5baefb..b43a66ef7e 100644
>> > --- a/migration/socket.c
>> > +++ b/migration/socket.c
>> > @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>> >      QIONetListener *listener = qio_net_listener_new();
>> >      MigrationIncomingState *mis = migration_incoming_get_current();
>> >      size_t i;
>> > -    int num = 1;
>> > +    int num = 512;
>> >
>>
>> Probably we need a macro for it, e.g.
>> #define MIGRATION_CHANNEL_MAX 512
>>
>> Also, I think below lines could be removed, as using a larger value of num
>> (i.e. 512) doesn't seem to consume more resources anywhere:
>> -    if (migrate_use_multifd()) {
>> -        num = migrate_multifd_channels();
>> -    } else if (migrate_postcopy_preempt()) {
>> -        num = RAM_CHANNEL_MAX;
>> -    }
>
> Given that this code already exists, why is it not already sufficient ?

Ah, I "think" I remember now.

> The commit description is saying we're setting backlog == 1 with
> multifd, but this later code is setting it to match the multifd
> channels. Why isn't that enough ?

Are you using -incoming defer?

No? Right.

With multifd, you should use -incoming defer. What is more, you should
always use -incoming defer. The problem is that, the way qemu starts, when
we do the initial listen, the "multifd" and "multifd-channels" parameters
haven't been parsed yet.

Can you confirm that if you start with -incoming defer, everything
works as expected?

Later, Juan.
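
To illustrate the ordering problem described above, here is a minimal sketch in C. It is not QEMU code: `struct mig_config` and `listen_backlog()` are hypothetical stand-ins that only mirror the shape of the quoted `num` logic (the postcopy-preempt branch is left out for brevity).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the migration settings; not QEMU's real types. */
struct mig_config {
    bool multifd;          /* models the "multifd" capability */
    int multifd_channels;  /* models the "multifd-channels" parameter */
};

/*
 * Mirrors the shape of the quoted logic: the backlog defaults to 1 and is
 * raised to the channel count only if multifd is already enabled at the
 * moment the listen is set up.
 */
static int listen_backlog(const struct mig_config *cfg)
{
    int num = 1;

    if (cfg->multifd) {
        num = cfg->multifd_channels;
    }
    return num;
}
```

If the listen is set up while the settings are still at their defaults (as assumed here for a start without -incoming defer), `listen_backlog()` returns 1. If multifd is enabled with, say, 16 channels before the listen is set up, which is what -incoming defer makes possible, it returns 16.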