@james-page, @axino -

Just a +1 that changing the HA property requires the router to be set
administratively down beforehand and back up afterwards, which starts
the recreation of the router as HA.
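
For reference, the down / change / up sequence can be sketched as three
CLI calls. This is only a sketch: it assumes the `openstack` client
(python-openstackclient) is in use and that its `router set` flags
`--disable`, `--ha` / `--no-ha` and `--enable` are available in the
installed version. `ha_toggle_commands` is a hypothetical helper name,
not part of Neutron, and it only builds the argument lists rather than
running anything.

```python
def ha_toggle_commands(router, ha=True):
    """Build the down -> change HA -> up command sequence for one router.

    Hypothetical helper: it only returns argv lists and executes nothing.
    """
    ha_flag = "--ha" if ha else "--no-ha"
    return [
        ["openstack", "router", "set", "--disable", router],  # admin state down
        ["openstack", "router", "set", ha_flag, router],      # change the HA property
        ["openstack", "router", "set", "--enable", router],   # admin state up again
    ]


if __name__ == "__main__":
    # Router ID as it appears in the agent log quoted in this comment.
    for argv in ha_toggle_commands("43801324-72ce-469f-a628-a5c645041e30"):
        print(" ".join(argv))
```

Passing `ha=False` yields the same sequence for converting back to a
non-HA router.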

We have seen various other side effects in Neutron/OVS environments, and
specifically in the environment in question, such as:
* Missing interfaces inside qrouter namespaces (OVS taps)
* Missing iptables rules
* Missing floating IP aliases on OVS interfaces inside the qrouter namespaces

All of these are set up during bringup of HA routers. We have seen fewer of
these issues on non-HA routers. Whether or not the router is HA, rescheduling
it or converting it from HA to non-HA (or vice versa) rebuilds the router
and, as a result, repairs it.
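
A rough way to spot the missing floating IP aliases is to compare the
addresses present in the namespace against the floating IPs the router is
expected to carry. The sketch below assumes its input is the text output
of `ip -o -4 addr show` run inside the qrouter namespace.
`addresses_in_output` and `missing_addresses` are hypothetical helper
names, and the sample addresses are made up rather than taken from the
affected environment. As far as I understand, on an HA router only the
active instance is expected to hold these addresses, so a standby
reporting them as missing is not by itself a fault.

```python
def addresses_in_output(ip_addr_output):
    """Collect IPv4 addresses (without prefix length) from `ip -o -4 addr show` text."""
    found = set()
    for line in ip_addr_output.splitlines():
        tokens = line.split()
        # In one-line mode each record contains "... inet <addr>/<prefix> ...".
        if "inet" in tokens:
            idx = tokens.index("inet")
            if idx + 1 < len(tokens):
                found.add(tokens[idx + 1].split("/")[0])
    return found


def missing_addresses(expected_ips, ip_addr_output):
    """Return the expected IPs that are not configured, sorted for stable output."""
    return sorted(set(expected_ips) - addresses_in_output(ip_addr_output))


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = (
        "1: lo    inet 127.0.0.1/8 scope host lo\n"
        "12: qg-aaaa    inet 203.0.113.10/24 scope global qg-aaaa\n"
        "12: qg-aaaa    inet 203.0.113.21/32 scope global qg-aaaa\n"
    )
    print(missing_addresses(["203.0.113.21", "203.0.113.22"], sample))
```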

I should also point out that we have rarely observed high system load
at the time of these issues. I do agree, though, that the number of
routers, and therefore the workload on both Neutron and OVS to
orchestrate interface plugging and unplugging and the namespace work
(with its associated network stack plumbing), is much higher than in a
typical environment. Having three servers do this work rather than
scaling horizontally seems like it might be exposing bottlenecks in
either Neutron or OVS in the orchestration of these tasks.

I'm not sure whether you are seeing it in the logs provided, but the
traceback below has also been common when this issue crops up. It shows
an example of a task performed during the bringup of a router (the
IPTablesManager initialisation) falling over.

2018-02-14 05:04:32.101 1352665 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:158
2018-02-14 05:04:32.103 1352665 DEBUG neutron.agent.linux.iptables_manager [-] IPTablesManager.apply completed with success. 0 iptables commands were issued _apply_synchronized /usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_manager.py:576
2018-02-14 05:04:32.103 1352665 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "iptables-qrouter-43801324-72ce-469f-a628-a5c645041e30" lock /usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:228
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'remove_vip_by_ip_address'
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 253, in call
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     return func(*args, **kwargs)
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 1115, in process
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     self.process_external()
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 890, in process_external
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     self._process_external_gateway(ex_gw_port)
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 777, in _process_external_gateway
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     self.external_gateway_updated(ex_gw_port, interface_name)
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/ha_router.py", line 403, in external_gateway_updated
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     self._remove_vip(old_gateway_cidr)
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/ha_router.py", line 202, in _remove_vip
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info     instance.remove_vip_by_ip_address(ip_cidr)
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info AttributeError: 'NoneType' object has no attribute 'remove_vip_by_ip_address'
2018-02-14 05:04:32.103 1352665 ERROR neutron.agent.l3.router_info
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 43801324-72ce-469f-a628-a5c645041e30
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 517, in _process_router_update
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 454, in _process_router_if_compatible
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self._process_updated_router(router)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 469, in _process_updated_router
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     ri.process()
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/ha_router.py", line 426, in process
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     super(HaRouter, self).process()
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 256, in call
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self.logger(e)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self.force_reraise()
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 253, in call
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 1115, in process
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self.process_external()
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 890, in process_external
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self._process_external_gateway(ex_gw_port)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 777, in _process_external_gateway
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self.external_gateway_updated(ex_gw_port, interface_name)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/ha_router.py", line 403, in external_gateway_updated
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     self._remove_vip(old_gateway_cidr)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/ha_router.py", line 202, in _remove_vip
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent     instance.remove_vip_by_ip_address(ip_cidr)
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'remove_vip_by_ip_address'
2018-02-14 05:04:32.104 1352665 ERROR neutron.agent.l3.agent
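
The AttributeError at the bottom of both tracebacks is the usual pattern
of calling a method on a lookup that returned None. The sketch below
reproduces only that pattern with stand-in classes.
`FakeKeepalivedInstance`, `FakeHaRouter` and both method variants are
hypothetical and are not the Neutron code; that the missing object is the
router's keepalived instance is my assumption, inferred from the
`_remove_vip` frame.

```python
class FakeKeepalivedInstance(object):
    """Stand-in for whatever object tracks the router's VIPs."""

    def __init__(self):
        self.vips = set()

    def remove_vip_by_ip_address(self, ip_cidr):
        self.vips.discard(ip_cidr)


class FakeHaRouter(object):
    """Stand-in router whose instance may not have been set up yet."""

    def __init__(self, instance=None):
        self._instance = instance

    def remove_vip_unguarded(self, ip_cidr):
        # Same shape as the failing frame: no check before the method call.
        self._instance.remove_vip_by_ip_address(ip_cidr)

    def remove_vip_guarded(self, ip_cidr):
        # Defensive variant: skip the call and report that nothing was done.
        if self._instance is None:
            return False
        self._instance.remove_vip_by_ip_address(ip_cidr)
        return True


if __name__ == "__main__":
    router = FakeHaRouter(instance=None)
    try:
        router.remove_vip_unguarded("203.0.113.10/24")
    except AttributeError as exc:
        print("unguarded call failed: %s" % exc)
    print("guarded call did work: %s" % router.remove_vip_guarded("203.0.113.10/24"))
```

Whether silently skipping the removal would be the right behaviour in
Neutron itself is a separate question; the guard only illustrates why
the agent falls over when the instance is not there.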

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1749425

Title:
  Neutron integrated with OpenVSwitch drops packets and fails to
  plug/unplug interfaces from OVS on router interfaces at scale

