Daniel P. Berrangé <berra...@redhat.com> wrote:
> On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
>> When we have lots of channels, sometimes multifd migration fails
>> with the following error:
>> after some time, sending side decides to send another packet through
>> that channel, and it is now when we get the above error.
>>
>> Any good ideas?
>
> In inet_listen_saddr() we call
>
>     if (!listen(slisten, 1)) {
>
> note the second parameter sets the socket backlog, which is the max
> number of pending socket connections we allow. My guess is that the
> target QEMU is not accepting incoming connections quickly enough and
> thus you hit the limit & the kernel starts dropping the incoming
> connections.
>
> As a quick test, just hack this code to pass a value of 100 and see
> if it makes your test reliable. If it does, then we'll need to figure
> out a nice way to handle backlog instead of hardcoding it at 1.

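For reference, the quick test suggested above amounts to making the backlog a parameter rather than the literal 1. A minimal standalone sketch of that idea follows. It is not QEMU code: listen_loopback() and its signature are hypothetical names chosen for illustration, and it binds to an ephemeral loopback port so it can be tried in isolation. Note that on Linux the effective backlog is additionally capped by net.core.somaxconn.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of QEMU): bind a TCP socket to an
 * ephemeral loopback port and listen with a caller-chosen backlog.
 * Returns the listening fd and stores the chosen port in *port_out,
 * or returns -1 on error.
 */
int listen_loopback(int backlog, uint16_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    /* The second argument of listen() is the backlog under discussion. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

Calling listen_loopback(100, &port) instead of listen_loopback(1, &port) is the standalone equivalent of the "pass a value of 100" experiment.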
I will test. But notice that qemu_connect() on the source side says that
things went right. It is the destination that is *not* calling the
callback. Or at least that is what I think is happening.

Later, Juan.
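A side note on why a successful connect on the source does not prove the destination has seen the connection: connect() returns as soon as the kernel completes the TCP handshake, whether or not the listening process ever calls accept(). The sketch below illustrates this with plain sockets on loopback. It is not QEMU code, and connect_without_accept() is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical demo (not QEMU code): listen with a backlog of 1, never
 * call accept(), and connect a client to the listener anyway.
 * Returns the result of connect(): 0 on success, -1 on any failure.
 */
int connect_without_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd, ret = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* ephemeral port chosen by the kernel */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }
    assert(len == sizeof(addr)); /* an IPv4 address is expected back */

    /* The listener deliberately never calls accept(). */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd >= 0) {
        /* addr now holds the loopback address and the assigned port. */
        ret = connect(cfd, (struct sockaddr *)&addr, sizeof(addr));
        close(cfd);
    }

    close(lfd);
    return ret;
}
```

So the source seeing success while the destination callback never fires is at least consistent with connections sitting in, or overflowing, the listen queue on the destination.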