Hi,

recently we hit an issue in Neutron with tests getting stuck [1]. As a side effect, we discovered that logs are not collected properly, which makes it hard to find the root cause. The logs are missing because we send SIGKILL to whatever gate hook is running when we hit the global timeout for the gate job [2]. This gives the running process no time to perform any post-processing. In Neutron's post_gate_hook function, we collect logs from the /tmp directory, compress them, and move them to /opt/stack/logs so that they are exposed.

I have two solutions in mind and would like feedback on them before sending patches.

1) In Neutron, we execute tests in post_gate_hook (I don't know why). But even if we moved test execution into gate_hook, post_gate_hook would not be triggered if the tests got stuck [3]. So the solution I propose here is to terminate gate_hook N minutes before the global timeout and still execute post_gate_hook (with a timeout) as a post-processing routine.

2) The second proposal is to let timeout-wrapped commands know they are about to be killed. We can send, say, SIGTERM instead of SIGKILL and, after a certain amount of time, send SIGKILL. Example: we send SIGTERM 3 minutes before the global timeout, giving 'command' those 3 minutes to handle the signal (note the "m" suffix on -k; without it, timeout reads the value as seconds):

    timeout -s 15 -k 3m $((REMAINING_TIME-3))m bash -c "command"

With the second approach we can trap the signal that kills the running test suite and collect logs with the same functions we currently have.

I would personally go with the second option, but I want to hear if anybody has a better idea about post-processing in gate jobs, or if there is already a tool we can use to collect logs.

Thanks,
Kuba

[1] https://bugs.launchpad.net/bugs/1567668
[2] https://github.com/openstack-infra/devstack-gate/blob/master/functions.sh#L1151
[3] https://github.com/openstack-infra/devstack-gate/blob/master/devstack-vm-gate-wrap.sh#L581
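
P.S. To make the second proposal more concrete, here is a minimal sketch of how a wrapped command could trap SIGTERM and still collect logs. It is only an illustration under assumptions: collect_logs, on_term and run_stuck_tests are hypothetical names, the temporary directories stand in for /tmp and /opt/stack/logs, and durations are in seconds so the demo finishes quickly. It is not the actual devstack-gate or Neutron code.

```shell
#!/usr/bin/env bash
# Sketch only: all function names and directories are hypothetical.
SRC_DIR=$(mktemp -d)    # plays the role of /tmp
DEST_DIR=$(mktemp -d)   # plays the role of /opt/stack/logs
echo "partial test output" > "$SRC_DIR/testrun.log"
export SRC_DIR DEST_DIR

collect_logs() {
    # Compress whatever logs exist so far and publish them.
    tar -czf "$DEST_DIR/logs.tar.gz" -C "$SRC_DIR" .
}

on_term() {
    # Ignore further SIGTERMs so the cleanup itself is not interrupted.
    trap '' TERM
    collect_logs
    exit 143    # conventional 128 + 15 (SIGTERM)
}

run_stuck_tests() {
    trap on_term TERM
    # Simulate a hung test suite. Running it in the background and
    # calling "wait" lets bash run the trap as soon as the signal arrives.
    sleep 600 &
    wait $!
}
export -f collect_logs on_term run_stuck_tests

# SIGTERM after 2 seconds, SIGKILL 3 seconds later if still running.
timeout -s TERM -k 3 2 bash -c run_stuck_tests
status=$?
echo "timeout exit status: $status"
ls -l "$DEST_DIR"
```

GNU timeout reports exit status 124 when the command timed out, so the caller can still tell that the tests got stuck even though the logs were saved. If the handler itself hangs, the -k grace period guarantees that SIGKILL follows.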