Hello,

I reviewed the code path and reproduced the upgrade. Upgrading
neutron-gateway first and neutron-api afterwards doesn't work because of a
mismatch in the migrations/RPC versions: the HA port fails to be
created/updated, so the keepalived process cannot be spawned, and finally
the state-change-monitor fails to find the PID of that keepalived process.

If I upgrade neutron-api first, run the migrations to head, and then upgrade
the gateways, everything works correctly.

I upgraded from the following versions:

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep keepalived
ii  keepalived      1:1.3.9-1ubuntu0.18.04.2   amd64  Failover and monitoring daemon for LVS clusters

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep neutron-common
ii  neutron-common  2:15.3.3-0ubuntu1~cloud0   all    Neutron is a virtual network service for Openstack - common

--> To

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep neutron-common
ii  neutron-common  2:16.3.2-0ubuntu3~cloud0   all    Neutron is a virtual network service for Openstack - common


I created a router with HA enabled as follows:

$ openstack router list
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+
| ID                                   | Name            | Status | State | Project                          | Distributed | HA   |
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+
| 09fa811f-410c-4360-8cae-687e7e73ff21 | provider-router | ACTIVE | UP    | 6f5aaf5130764305a5d37862e3ff18ce | False       | True |
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+


===> Prior to the upgrade I can list the keepalived processes linked to the
HA router:

root     22999  0.0  0.0  91816  3052 ?        Ss   19:17   0:00 keepalived -P -f /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21/keepalived.conf -p /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived -r /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived-vrrp -D

root     23001  0.0  0.1  92084  4088 ?        S    19:17   0:00 keepalived -P -f /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21/keepalived.conf -p /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived -r /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived-vrrp -D


===> After upgrading, no processes are listed; the keepalived processes
aren't spawned again once neutron-* is upgraded.

Pre-upgrade:
Jun 24 19:17:07 juju-da864d-1927868-5 Keepalived[22997]: Starting Keepalived v1.3.9 (10/21,2017)
Jun 24 19:17:07 juju-da864d-1927868-5 Keepalived[22999]: Starting VRRP child process, pid=23001

Post-upgrade (stopped and not started again):
Jun 24 19:30:41 juju-da864d-1927868-5 Keepalived[22999]: Stopping
Jun 24 19:30:42 juju-da864d-1927868-5 Keepalived_vrrp[23001]: Stopped
Jun 24 19:30:42 juju-da864d-1927868-5 Keepalived[22999]: Stopped Keepalived v1.3.9 (10/21,2017)

The reason those keepalived processes are not re-spawned is:

1) The ml2 agent starts processing the router devices by making an RPC call
for the device details. This call fails because the oslo target versions
differ.

Therefore the neutron-api migrations must be applied before the gateways
are upgraded.

9819:2021-06-24 19:31:09.935 31744 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] Starting to process devices in:{'current': {'87cfdd45-fea7-4c06-aa13-174cb71b294f', 'b8e18ba0-c65b-498e-9a8b-34c0fcc42d07', '926b7377-30f4-4b2c-9064-8aab3918a385'}, 'added': {'87cfdd45-fea7-4c06-aa13-174cb71b294f'}, 'removed': set(), 'updated': set(), 're_added': set()} rpc_loop /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:2685

9821:2021-06-24 19:31:10.028 31744 ERROR neutron.agent.rpc [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] Failed to get details for device 87cfdd45-fea7-4c06-aa13-174cb71b294f: oslo_messaging.rpc.client.RemoteError: Remote error: InvalidTargetVersion Invalid target version 1.1

9869:2021-06-24 19:31:10.510 31744 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] retrying failed devices {'87cfdd45-fea7-4c06-aa13-174cb71b294f'} _update_port_info_failed_devices_stats /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1674
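
To pull the affected devices out of a larger agent log, something like the
following could be used. This is only a minimal sketch: the pattern is
derived solely from the ERROR line quoted above, and the name
`failed_devices` is hypothetical.

```python
import re

# Pattern derived from the single ERROR line quoted above; other neutron
# releases may word the message differently (assumption).
FAILED_RE = re.compile(
    r"Failed to get details for device (?P<device>[0-9a-f-]{36}):"
    r".*?Invalid target version (?P<version>\d+(?:\.\d+)*)",
    re.DOTALL,  # tolerate log entries that were wrapped over several lines
)


def failed_devices(log_text):
    """Return {device_id: rejected_version} for every match in log_text."""
    return {m.group("device"): m.group("version")
            for m in FAILED_RE.finditer(log_text)}


sample = (
    "2021-06-24 19:31:10.028 31744 ERROR neutron.agent.rpc "
    "[req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] "
    "Failed to get details for device 87cfdd45-fea7-4c06-aa13-174cb71b294f: "
    "oslo_messaging.rpc.client.RemoteError: Remote error: "
    "InvalidTargetVersion Invalid target version 1.1"
)
print(failed_devices(sample))
```

If the returned dict contains the HA port ID of a router, that router is a
candidate for the failure described in this message.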

2) The L3 HA router creation mechanism then can't process the HA router
because the HA port 87cfdd45-fea7-4c06-aa13-174cb71b294f is DOWN, so
keepalived cannot be spawned [0] [1].

[0] https://github.com/openstack/neutron/blob/1ad9ca56b07ffdc9f7e0bc6a62af61961b9128eb/neutron/agent/l3/ha_router.py#L519
[1] https://github.com/openstack/neutron/blob/1ad9ca56b07ffdc9f7e0bc6a62af61961b9128eb/neutron/agent/linux/keepalived.py#L455

1971:2021-06-24 19:31:15.034 32459 DEBUG neutron.agent.l3.ha_router [-]
Processing HA router with HA port: {'id':
'87cfdd45-fea7-4c06-aa13-174cb71b294f', 'name': 'HA port tenant
6f5aaf5130764305a5d37862e3ff18ce', 'network_id':
'1a2e73c3-1587-4417-be96-40fde935474b', 'tenant_id': '', 'mac_address':
'fa:16:3e:e2:e0:56', 'admin_state_up': True, 'status': 'DOWN',
'device_id': '09fa811f-410c-4360-8cae-687e7e73ff21', 'device_owner':
'network:router_ha_interface', 'fixed_ips': [{'subnet_id': '6f8bfdbf-
ca04-4847-ac83-f4bd90c089b6', 'ip_address': '169.254.193.135',
'prefixlen': 18}], 'allowed_address_pairs': [], 'extra_dhcp_opts': [],
'security_groups': [], 'description': '', 'binding:vnic_type': 'normal',
'binding:profile': {}, 'binding:host_id': 'juju-da864d-1927868-5',
'binding:vif_type': 'ovs', 'binding:vif_details': {'connectivity': 'l2',
'port_filter': True, 'ovs_hybrid_plug': True, 'datapath_type': 'system',
'bridge_name': 'br-int'}, 'port_security_enabled': False, 'dns_name':
'', 'dns_assignment': [{'ip_address': '169.254.193.135', 'hostname':
'host-169-254-193-135', 'fqdn':
'host-169-254-193-135.1927868.stsstack.qa.1ss.'}], 'dns_domain': '',
'ip_allocation': 'immediate', 'tags': [], 'created_at':
'2021-06-24T19:16:35Z', 'updated_at': '2021-06-24T19:30:59Z',
'revision_number': 5, 'project_id': '', 'subnets': [{'id': '6f8bfdbf-
ca04-4847-ac83-f4bd90c089b6', 'cidr': '169.254.192.0/18', 'gateway_ip':
None, 'dns_nameservers': [], 'ipv6_ra_mode': None, 'subnetpool_id':
None}], 'extra_subnets': [], 'address_scopes': {'4': None, '6': None},
'mtu': 1500} process /usr/lib/python3/dist-
packages/neutron/agent/l3/ha_router.py:513
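
The condition can be checked against a logged port dict like the one above.
A minimal sketch, assuming the agent only starts keepalived when the HA port
reports status 'ACTIVE' (my reading of [0]); `keepalived_would_start` is a
hypothetical helper, not a neutron function.

```python
def keepalived_would_start(ha_port):
    """Return True only if an HA port is present and reports ACTIVE.

    Assumption: this mirrors the check referenced at [0]; it is an
    illustration, not the neutron code itself.
    """
    return bool(ha_port) and ha_port.get("status") == "ACTIVE"


# Fields copied from the DEBUG line above, trimmed to the relevant ones.
ha_port = {
    "id": "87cfdd45-fea7-4c06-aa13-174cb71b294f",
    "device_owner": "network:router_ha_interface",
    "admin_state_up": True,
    "status": "DOWN",
}
print(keepalived_would_start(ha_port))  # False: the port is DOWN
```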


3) Since the port is down and the keepalived process cannot be started, the
'neutron-keepalived-state-change' agent fails with:


11166:2021-06-24 20:12:53.600 8839 DEBUG neutron.agent.linux.utils [-] Running 
command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 
'neutron-keepalived-state-change', 
'--router_id=09fa811f-410c-4360-8cae-687e7e73ff21', 
'--namespace=qrouter-09fa811f-410c-4360-8cae-687e7e73ff21', 
'--conf_dir=/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21', 
'--log-file=/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21/neutron-keepalived-state-change.log',
 '--monitor_interface=ha-87cfdd45-fe', '--monitor_cidr=169.254.0.203/24', 
'--pid_file=/var/lib/neutron/external/pids/09fa811f-410c-4360-8cae-687e7e73ff21.monitor.pid.neutron-keepalived-state-change-monitor',
 '--state_path=/var/lib/neutron', '--user=113', '--group=117'] create_process 
/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py:88
11167:2021-06-24 20:12:55.379 8839 DEBUG neutron.agent.l3.ha_router [-] Router 
09fa811f-410c-4360-8cae-687e7e73ff21 neutron-keepalived-state-change-monitor 
pid 8961 spawn_state_change_monitor 
/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py:428
11182:2021-06-24 20:12:55.611 8839 DEBUG neutron.agent.linux.utils [-] Unable 
to access 
/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived; 
Error: [Errno 2] No such file or directory: 
'/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived' 
get_value_from_file 
/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py:263
11214:2021-06-24 20:12:56.172 8839 DEBUG neutron.agent.linux.utils [-] Unable 
to access 
/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived; 
Error: [Errno 2] No such file or directory: 
'/var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived' 
get_value_from_file 
/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py:263
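
To confirm this state on a gateway, one can check whether the keepalived pid
file exists and points at a live process. A minimal sketch: `keepalived_pid`
is a hypothetical helper, and the path layout is taken from the log lines
above.

```python
import os


def keepalived_pid(router_id, ha_confs="/var/lib/neutron/ha_confs"):
    """Return the live keepalived PID for router_id, or None.

    None means the pid file is missing or unreadable, or the process it
    names no longer exists. Path layout taken from the log lines above.
    """
    pid_file = os.path.join(ha_confs, router_id + ".pid.keepalived")
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None
    if pid <= 0:
        return None
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return None
    except PermissionError:
        pass  # the process exists but belongs to another user
    return pid


print(keepalived_pid("09fa811f-410c-4360-8cae-687e7e73ff21"))
```

On an affected gateway this would be expected to print None, matching the
"No such file or directory" errors above.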

-- 
https://bugs.launchpad.net/bugs/1927868

Title:
  vRouter not working after update to 16.3.1
