On 05/31/2016 07:29 PM, Dietmar Maurer wrote:
>> Further another problem would still be open if we tried to patch the
>> SSH Forward method we currently use - which we solve for free with
>> the approach of this patch - namely the problem that the method
>> to get an available port (next_migration_port) has a serious race
>> condition 
> Why is there a race condition exactly? If so, we have to fix that.

It's not directly in next_unused_port, as that is flock'ed. But if the
program which requests a port needs too long to open it, the reservation
may be seen as timed out in next_unused_port and another program gets
assigned the same port. Then both may try to open/connect to it.
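
To illustrate the window I mean, here is a toy model in Python. It is
not the real next_unused_port code; the class name, the port range and
the 5 second timeout are all made up for the example. It only assumes
that a reservation is a timestamp which expires after some time, and
that the requester binds the port some time after it was handed out.

```python
class PortReservations:
    """Toy model of a time-limited port reservation table (hypothetical)."""

    def __init__(self, first, last, ttl):
        self.ports = range(first, last + 1)
        self.ttl = ttl            # seconds until a reservation expires
        self.reserved_at = {}     # port -> time the port was handed out

    def next_unused_port(self, now, bound=()):
        """Hand out the first port that is neither bound nor reserved."""
        for port in self.ports:
            if port in bound:
                continue          # somebody is really listening on it
            stamp = self.reserved_at.get(port)
            if stamp is not None and now - stamp < self.ttl:
                continue          # reservation still valid
            self.reserved_at[port] = now
            return port
        raise RuntimeError("no free port in range")


def demo_slow_requester():
    table = PortReservations(60000, 60010, ttl=5.0)
    a = table.next_unused_port(now=0.0)
    # Requester A is slow: 6 seconds later it still has not bound the
    # port, so the reservation has expired and B gets the same port.
    b = table.next_unused_port(now=6.0)
    return a, b


def demo_fast_requester():
    table = PortReservations(60000, 60010, ttl=5.0)
    a = table.next_unused_port(now=0.0)
    # Second request arrives inside the reservation window.
    b = table.next_unused_port(now=1.0)
    return a, b
```

In the slow case both callers end up with the same port even though
each single call is serialized; in the fast case they get different
ports.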

As we did not have the SSH option ExitOnForwardFailure enabled, the
second migration's ssh tunnel, which tried to bind to the local port,
did not fail when it couldn't bind. qemu then also writes to a port
where it gets a connection refused (as the other migration is running
on it).
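
For reference, a small sketch of how such a tunnel command line could
be assembled with ExitOnForwardFailure set, so that ssh exits instead
of continuing without the forward. The helper name, host and ports are
only placeholders and this is not the command PVE actually builds;
-o, -N and -L are standard OpenSSH client options.

```python
def build_tunnel_cmd(host, local_port, remote_port):
    """Return an ssh argv for a local port forward (hypothetical helper)."""
    return [
        "ssh",
        "-o", "ExitOnForwardFailure=yes",  # exit if the forward cannot be set up
        "-N",                              # no remote command, forwarding only
        "-L", f"{local_port}:localhost:{remote_port}",
        host,
    ]
```

With that option a second tunnel that loses the race for the local
port would terminate with an error, instead of staying up without a
working forward.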

This may also cause trouble for other programs using (indirectly)
next_unused_port, but at least the race condition should trigger only
very rarely. I'll look into a way to fix that after the v3 of this
patch series.

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
