On 02.06.2016 10:34, Dietmar Maurer wrote:
I do not really understand this loop.

* Why do you call kill -9 multiple times?


"Just to be sure". Normally the -9 kills it instantly and the next loop
iteration then picks it up, so the probability that another SIGKILL gets sent is
quite low.
(But yeah, the code as it stands is bad style/confusing, I guess.)

* Why do you iterate 20 times (instead of 30)?


The migration is at an end here, whether it succeeded or not,
but if the tunnel is still there at this point we want to quit it.
Waiting 30 seconds seems long for that, as the tunnel has no use now, because either:

* all data was carried over to the destination, or
* the migration failed, the VM stays on the source and no more data gets sent
over the tunnel.

I'd maybe actually go for 5 seconds, then a SIGTERM, and after ten seconds a
SIGKILL if it's still there (which has a really low probability, and it has no
effect on our migration anyway).

But as it also does not really hinder us, we can keep the old timeouts and send
a SIGTERM at 15 seconds and a SIGKILL after 30, if preferred.
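
For illustration, a minimal sketch of such a one-shot escalation, where each
signal is sent at most once. The helper name reap_child, its parameters and the
extra 5 second grace period after the SIGKILL are assumptions made for this
example and are not part of the patch:

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper, not part of the patch: poll for the child once per
# second, send SIGTERM after $term_after seconds and SIGKILL after
# $kill_after seconds. Each signal is sent at most once.
sub reap_child {
    my ($cpid, $term_after, $kill_after) = @_;

    # Assumption: give up 5 seconds after the SIGKILL instead of looping forever.
    my $give_up = $kill_after + 5;

    for (my $i = 0; $i <= $give_up; $i++) {
        my $res = waitpid($cpid, WNOHANG);
        return $? if $res == $cpid;    # child collected, return its wait status
        return if $res == -1;          # no such child (already reaped elsewhere)

        if ($i == $term_after) {
            kill('TERM', $cpid);
        } elsif ($i == $kill_after) {
            kill('KILL', $cpid);
        }
        sleep(1);
    }
    return;    # child could not be collected in time
}
```

With the old timeouts this would be called as reap_child($cpid, 15, 30). The
return value is the wait status of the child, or undef if it could not be
collected.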

I'll resend the whole thing (mainly this patch and patch 2), addressing the
issue mentioned here and also making sure that old versions of qemu-server can
still live migrate (should not be too much code overhead).

+    # collect child process
+    for (my $i = 1; $i < 20; $i++) {
+       my $waitpid = waitpid($cpid, WNOHANG);
+       last if (defined($waitpid) && ($waitpid == $cpid));
+
+       if ($i == 10) {
+           $self->log('info', "ssh tunnel still running - terminating now with SIGTERM");
+           kill(15, $cpid);
+       } elsif ($i >= 15) {
+           $self->log('info', "ssh tunnel still running - terminating now with SIGKILL");
+           kill(9, $cpid);
        }
+       sleep (1);
     }

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
