Public bug reported:

* High level description

Occasionally, deleting a router will cause the VPN Agent to get stuck in a loop trying to tear down a deleted router.

* Pre-conditions

Fault seems random, only pre-conditions are: router delete

* Step-by-step reproduction steps

Delete a router. There is a chance that the VPN Agent will get stuck.

* Expected output

VPN agent removes router.

* Actual output

VPN Agent spins in a tight loop, attempting to execute

ip netns exec qrouter-e14edfa6-a3e1-4866-8a1a-ee6ecf0f4a67 find /sys/class/net -maxdepth 1 -type l -printf %f

This command fails because the namespace does not exist. The VPN agent immediately attempts the same command. Neutron stack traces accumulate and fill up disks *very* quickly.

* Version

Neutron VPN Agent 7.0.4.3
Ubuntu 14.04.3
4.2 kernel

* Perceived severity

Active Production issue, wakes Ops staff up from time to time.

* Logs are attached, but basically this stack trace happens every 100 ms.

2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent [-] Error while deleting router 69e961ca-6b64-4085-833a-7796b2fce233
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 346, in _safe_router_removed
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self._router_removed(router_id)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_removed
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     ri.delete(self)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 270, in delete
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.process_delete(agent)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.logger(e)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.force_reraise()
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/common/utils.py", line 356, in call
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 727, in process_delete
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self._process_internal_ports(agent.pd)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 444, in _process_internal_ports
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     existing_devices = self._get_existing_devices()
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 336, in _get_existing_devices
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     ip_devs = ip_wrapper.get_devices(exclude_loopback=True)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 125, in get_devices
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     log_fail_as_error=self.log_fail_as_error
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 159, in execute
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     raise RuntimeError(m)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent RuntimeError:
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-69e961ca-6b64-4085-833a-7796b2fce233', 'find', '/sys/class/net', '-maxdepth', '1', '-type', 'l', '-printf', '%f ']
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Exit code: 1
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Stdin:
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Stdout:
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Stderr: Cannot open network namespace "qrouter-69e961ca-6b64-4085-833a-7796b2fce233": No such file or directory

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: vpnaas

** Attachment added: "l3 aggent error log"
   https://bugs.launchpad.net/bugs/1605046/+attachment/4704339/+files/l3-agent-errors.log

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1605046

Title:
  Router delete causes L3 Agent hang

Status in neutron:
  New
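The failure reported above comes from running a command inside a namespace that has already been removed, and then retrying it. One possible guard, shown here only as a rough sketch and not as Neutron's actual code, is to check for the namespace first and treat a missing one as "no devices left". The names namespace_exists, list_devices and the run_in_namespace callback are hypothetical, and the sketch assumes that iproute2 keeps named namespaces as entries under /var/run/netns.

```python
import os

# Assumption: iproute2 exposes named network namespaces as entries here.
NETNS_DIR = "/var/run/netns"


def namespace_exists(name, netns_dir=NETNS_DIR):
    """Return True if a named network namespace entry is present."""
    return os.path.exists(os.path.join(netns_dir, name))


def list_devices(name, run_in_namespace, netns_dir=NETNS_DIR):
    """List devices in a namespace, or return [] if the namespace is gone.

    run_in_namespace is a hypothetical callback standing in for whatever
    actually executes "ip netns exec <name> find /sys/class/net ...".
    It is only invoked when the namespace entry still exists, so a
    deleted router cannot trigger the failing command.
    """
    if not namespace_exists(name, netns_dir):
        # Nothing to tear down: the namespace was already removed.
        return []
    return run_in_namespace(name)
```

With a guard of this kind, deleting a router whose namespace has already disappeared would fall through to an empty device list instead of raising RuntimeError on every retry. The check is inherently racy (the namespace can vanish between the check and the command), so the caller would still need to handle the error path.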