Reviewed:  https://review.openstack.org/287426
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=32d576427179965def0a1757dc86a492e1c3756d
Submitter: Jenkins
Branch:    master

commit 32d576427179965def0a1757dc86a492e1c3756d
Author: Tim Pownall <tim.pown...@rackspace.com>
Date:   Wed Mar 2 14:38:24 2016 -0600

    xenapi: fix when tar exits early during download

    When downloading a streamed chunk from swift through glance gives tar
    invalid input, tar will exit, but leave the plugin still trying to
    write data to the dead process.

    To stop this we can spot when the tar process exits early, and be
    sure to stop trying to write more data to the dead process.

    Change-Id: Ic2bc89fa6d08db505b044a9498c1bfa5b884a056
    Closes-Bug: 1552293

** Changed in: nova
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1552293

Title:
  glance plugin not exiting when tar exits from bad data being passed.

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  This bug only applies to OpenStack users running xenapi in
  combination with the glance plugin to fetch images from swift.

  There is logic in utils.py to kill the tar process when Python hits
  an exception from subprocess during tar extraction in the xenapi
  glance plugin. This logic works when tar exits cleanly with an EOF on
  a truncated file. However, if bad data (a malformed HTTP response) is
  appended to that truncated file, tar will die and respawn outside the
  child/parent process tree of the xen fork executioner daemon, and
  continue to read from that stdin pipe.

  The most minimal code change is to poll for the process in the
  extraction function. With this change tar no longer hangs like this:
  even though the respawned tar is now its own process, when the glance
  plugin reaps the defunct tar it closes the pipe, which kills the new
  tar process under ppid 1 (init), because that tar process is trying
  to read from that pipe.

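  As a rough illustration of the polling idea (this is not the code
  from the merged change), the sketch below checks whether the child is
  still alive before every write and stops as soon as it has exited.
  The names stream_to_process and ProcessExitedEarly are hypothetical,
  and the sketch is written for Python 3 for brevity; the real plugin
  may target a different interpreter. Relying on a broken-pipe error
  alone is not enough in the case described here, because another
  process can still hold the read end of the pipe.

```python
import subprocess


class ProcessExitedEarly(Exception):
    """Hypothetical error: the consumer exited before all input was written."""


def stream_to_process(chunks, argv):
    """Write byte chunks to a child's stdin, polling before every write.

    Returns the child's exit status once all chunks have been written.
    """
    # bufsize=0 sends each write straight to the pipe (no Python buffering).
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, bufsize=0)
    try:
        for chunk in chunks:
            # poll() returns None while the child runs, else its exit status.
            status = proc.poll()
            if status is not None:
                raise ProcessExitedEarly(
                    "child exited early with status %s" % status)
            try:
                # Unbuffered writes may be short, so loop until done.
                view = memoryview(chunk)
                while len(view) > 0:
                    written = proc.stdin.write(view)
                    view = view[written:]
            except BrokenPipeError:
                # Nobody holds the read end of the pipe any more.
                raise ProcessExitedEarly("child closed its stdin")
    finally:
        # Closing our end lets any remaining reader see EOF; then reap
        # the child so it does not linger as a defunct process.
        proc.stdin.close()
        proc.wait()
    return proc.returncode
```

  A caller would pass the downloaded chunks and the tar command line,
  and treat ProcessExitedEarly as a failed download.
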
  When looking at the code, it appears we don't poll for the process
  and instead expect tar to exit in a way that makes Python raise an
  exception. However, this is not always the case.

  Below is what it looks like when this issue occurs:

  root  8750 10.6  0.7  9752  5836 ?  Ss  23:44  0:06  \_ python /etc/xapi.d/plugins/glance <methodCall><methodName>download_vhd2</methodName><params><param><value>OpaqueRef:a9b81f62-0281-2301-1eec-754fd2f1a057</value></
  root  8820  5.4  0.0     0     0 ?  Z   23:45  0:03      \_ [tar] <defunct>
  root  8829  1.7  0.0  2552   464 ?  S   23:45  0:01  tar -zx --directory=/var/run/sr-mount/637c4bf0-3cf6-b283-66f0-7087dec0439e/tmpIbRPA1
  root  8830 18.9  0.0     0     0 ?  Z   23:45  0:11  \_ [gzip] <defunct>

  [root@]# ls -la fd/*
  l-wx------ 1 root root 64 Mar  1 23:45 fd/0 -> /dev/null
  l-wx------ 1 root root 64 Mar  1 23:45 fd/1 -> /tmp/execute_command_get_outc14016.log
  l-wx------ 1 root root 64 Mar  1 23:45 fd/2 -> /tmp/execute_command_get_errf62557.log
  lrwx------ 1 root root 64 Mar  1 23:45 fd/3 -> socket:[718064936]
  lrwx------ 1 root root 64 Mar  1 23:45 fd/4 -> socket:[718064938]
  l-wx------ 1 root root 64 Mar  1 23:45 fd/6 -> pipe:[718065596]
  lr-x------ 1 root root 64 Mar  1 23:45 fd/7 -> pipe:[718065597]

  [root@]# ls -la ../8829/fd/*
  lr-x------ 1 root root 64 Mar  1 23:46 ../8829/fd/0 -> pipe:[718065596]
  l-wx------ 1 root root 64 Mar  1 23:46 ../8829/fd/1 -> pipe:[718065612]
  l-wx------ 1 root root 64 Mar  1 23:46 ../8829/fd/2 -> pipe:[718065597]