Hi,
after a good discussion a few days ago in
  https://www.redhat.com/archives/libvir-list/2018-August/msg00122.html
and a short-lived, at the time untested, v2 in
  https://www.redhat.com/archives/libvir-list/2018-August/msg00199.html
I finally got access to the right hardware again and completed the series.
Now that it has been retested and works, I feel safe submitting it without an RFC prefix.

I think this would be a great addition for better handling of guests with many host devices passed through. With the new code in place I can shut down systems that have 12, 16 or even more hostdevs attached without getting into the "zombie" mode, where libvirt forever considers the guest to be "in shutdown" because it gave up waiting too early while signal zero was still able to reach the process.

Scaling examples (extracted with gdb):
16 Devices: virProcessKillPainfullyDelay (pid=67096, force=true, extradelay=32)
12 Devices: virProcessKillPainfullyDelay (pid=68251, force=true, extradelay=24)

*Updates in v4*
- virDebug now reports the extradelay as requested (in seconds) and
  thereby mostly matches the gdb output seen above
- header function prototype defines the variable name
- clarify the usage of delay units
  - seconds (API call)
  - fifths of a second (internal poll loop)
- explain the request for 2*nhostdevs from the qemu shutdown code

*Updates in v3*
- fixup some issues found in testing and code checks

*Updates in v2*
- removed the "accept the lack of /proc/<pid> as valid process removal"
  approach due to valid concerns about reusing resources.
- added a dynamic extra wait scaling with the amount of hostdevs

Christian Ehrhardt (2):
  process: wait longer on kill per assigned Hostdev
  process: wait longer 5->30s on hard shutdown

 src/libvirt_private.syms |  1 +
 src/qemu/qemu_process.c  |  7 +++++--
 src/util/virprocess.c    | 22 ++++++++++++++++++----
 src/util/virprocess.h    |  3 +++
 4 files changed, 27 insertions(+), 6 deletions(-)

-- 
2.17.1
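
For readers who want the delay units from the v4 notes spelled out, here is a minimal sketch of the arithmetic. It is not the code from the patches: the helper names and the POLLS_PER_SECOND constant are hypothetical. Only the factor of 2 seconds per hostdev and the 1/5 second poll unit are taken from the notes above.

```c
#include <stddef.h>

/* Assumption: the internal poll loop ticks five times per second,
 * as stated in the v4 notes ("fifths of a second"). */
enum { POLLS_PER_SECOND = 5 };

/* Hypothetical helper: extra delay in seconds that the qemu shutdown
 * code would request, 2 seconds per assigned host device. */
static unsigned int
extra_delay_seconds(size_t nhostdevs)
{
    return (unsigned int)(2 * nhostdevs);
}

/* Hypothetical helper: convert the API-level delay (seconds) into the
 * number of additional poll loop iterations. */
static unsigned int
extra_poll_iterations(unsigned int extradelay)
{
    return extradelay * POLLS_PER_SECOND;
}
```

With 16 hostdevs this gives extradelay=32 and with 12 hostdevs extradelay=24, matching the gdb output quoted above.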