Public bug reported:

The following commit to oslo-incubator [1], which was meant to optimize waiting for child processes to exit, will break neutron-server behavior (i.e. signal handling).

1. In neutron-server, eventlet monkey-patching (including patching of the os module) is done in the parent process. As a result, the os.waitpid(0, 0) call in the _wait_child method is also monkey-patched, and eventlet misbehaves. Attaching strace to the parent process shows that os.waitpid(0, os.WNOHANG) is being called, but it is hard to tell what is really happening because the process does not react to termination signals (SIGTERM, SIGHUP, SIGINT).

2. Because neutron-server initializes two instances of ProcessLauncher in one parent process, calling eventlet.greenthread.sleep(self.wait_interval) seems to be the only way for the process to switch contexts and let the other ProcessLauncher instance call _wait_child. Note that ProcessLauncher is not supposed to be used this way (two instances in one parent process) at all.

This bug is intended to track fixing the outlined problems on the Neutron side.

[1] https://github.com/openstack/oslo-incubator/commit/bf92010cc9d4c2876eaf6092713aafa94dcc8b35

** Affects: neutron
     Importance: Undecided
     Assignee: Elena Ezhova (eezhova)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => Elena Ezhova (eezhova)

https://bugs.launchpad.net/bugs/1438321

Title:
  Fix process management for neutron-server

Status in OpenStack Neutron (virtual network service):
  New
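
For reference, the polling pattern mentioned in point 2 (a non-blocking waitpid followed by a sleep that yields control) could look roughly like the sketch below. This is not the oslo-incubator code: wait_for_children and spawn_child are hypothetical names made up for illustration, it uses only the standard library, and the sleep argument stands in for eventlet.greenthread.sleep, which is what would give another green thread a chance to run in neutron-server.

```python
import os
import time


def wait_for_children(expected, wait_interval=0.01, sleep=time.sleep):
    """Reap `expected` children without ever blocking inside waitpid.

    Returns a dict mapping each reaped pid to its exit code (None if the
    child did not exit normally). `sleep` is injectable; under eventlet it
    would be eventlet.greenthread.sleep (an assumption, not shown here).
    """
    reaped = {}
    while len(reaped) < expected:
        try:
            # With WNOHANG, waitpid returns (0, 0) right away when no
            # child has exited yet instead of blocking the process.
            pid, status = os.waitpid(0, os.WNOHANG)
        except ChildProcessError:
            # No children left to wait for.
            break
        if pid == 0:
            # Nothing exited yet: yield so that other work can run.
            sleep(wait_interval)
            continue
        reaped[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return reaped


def spawn_child(exit_code, delay=0.05):
    """Hypothetical helper: fork a child that sleeps, then exits."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(exit_code)
    return pid
```

Because the loop never blocks in waitpid, a second loop of the same kind in the same process only makes progress when the sleep call actually hands control over, which matches the observation about running two ProcessLauncher instances in one parent.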