Reviewed:  https://review.openstack.org/259485
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=17c14977ce0e2291e911739f8c85838f1c1f3473
Submitter: Jenkins
Branch:    master
commit 17c14977ce0e2291e911739f8c85838f1c1f3473
Author: James Page <james.p...@ubuntu.com>
Date:   Fri Dec 18 15:02:11 2015 +0000

    Ensure that tunnels are fully reset on ovs restart

    When the l2population mechanism driver is enabled, if ovs is
    restarted, tunnel ports are not re-configured in full due to stale
    ofport handles in the OVS agent. Reset all handles when OVS is
    restarted to ensure that tunnels are fully recreated in this
    situation.

    Change-Id: If0e034a034a7f000a1c58aa8a43d2c857dee6582
    Closes-bug: #1460164

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1460164

Title:
  restart of openvswitch-switch causes instance network down when
  l2population enabled

Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Triaged

Bug description:
  On 2015-05-28, our Landscape auto-upgraded packages on two of our
  OpenStack clouds. On both clouds, but only on some compute nodes, the
  upgrade of openvswitch-switch and the corresponding downtime of
  ovs-vswitchd appears to have triggered some sort of race condition
  within neutron-plugin-openvswitch-agent, leaving it in a broken state:
  any new instances come up with non-functional network, but
  pre-existing instances appear unaffected. Restarting n-p-ovs-agent on
  the affected compute nodes is sufficient to work around the problem.
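The failure mode the commit message describes, and the shape of the fix, can be sketched as follows. This is a minimal, hypothetical Python illustration only, not the actual neutron agent code: the class, method names, and cache layout are invented for clarity. The point is that a cache of ofport handles survives an OVS restart, so the agent skips re-creating tunnel ports that no longer exist on the switch; clearing the cache on restart forces a full rebuild.

```python
class TunnelAgent:
    """Hypothetical stand-in for the OVS agent's tunnel bookkeeping."""

    def __init__(self):
        # tunnel_type -> {remote_ip: ofport}; models the agent's cache
        # of which tunnel ports it believes already exist.
        self.tun_ofports = {"vxlan": {}, "gre": {}}

    def setup_tunnel_port(self, tunnel_type, remote_ip, create_port):
        """Create the tunnel port unless a cached handle says it exists."""
        if remote_ip in self.tun_ofports[tunnel_type]:
            # The bug: after an OVS restart this entry is stale -- the
            # port is gone on the switch, but the cached handle makes
            # the agent skip re-creating it.
            return self.tun_ofports[tunnel_type][remote_ip]
        ofport = create_port(remote_ip)
        self.tun_ofports[tunnel_type][remote_ip] = ofport
        return ofport

    def handle_ovs_restart(self):
        """The fix: drop every cached handle so tunnels are rebuilt."""
        for tunnel_type in self.tun_ofports:
            self.tun_ofports[tunnel_type] = {}
```

Under this model, an agent that detects an OVS restart calls handle_ovs_restart() before re-running its tunnel sync, so every subsequent setup_tunnel_port() call actually re-creates the port instead of trusting a stale handle.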
  The packages Landscape upgraded (from /var/log/apt/history.log):

  Start-Date: 2015-05-28 14:23:07
  Upgrade: nova-compute-libvirt:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1),
    libsystemd-login0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    nova-compute-kvm:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1),
    systemd-services:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    isc-dhcp-common:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2),
    nova-common:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1),
    python-nova:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1),
    libsystemd-daemon0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    grub-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2),
    libpam-systemd:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    udev:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    grub2-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2),
    openvswitch-switch:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2),
    libudev1:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12),
    isc-dhcp-client:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2),
    python-eventlet:amd64 (0.13.0-1ubuntu2, 0.13.0-1ubuntu2.1),
    python-novaclient:amd64 (2.17.0-0ubuntu1.1, 2.17.0-0ubuntu1.2),
    grub-pc-bin:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2),
    grub-pc:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2),
    nova-compute:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1),
    openvswitch-common:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2)
  End-Date: 2015-05-28 14:24:47

  From /var/log/neutron/openvswitch-agent.log:

  2015-05-28 14:24:18.336 47866 ERROR neutron.agent.linux.ovsdb_monitor [-]
  Error received from ovsdb monitor: ovsdb-client:
  unix:/var/run/openvswitch/db.sock: receive failed (End of file)

  Looking at a stuck instance, all the right tunnels and bridges and
  whatnot appear to be there:

  root@vector:~# ip l l | grep c-3b
  460002: qbr7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
  460003: qvo7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
  460004: qvb7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UP mode DEFAULT group default qlen 1000
  460005: tap7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UNKNOWN mode DEFAULT group default qlen 500

  root@vector:~# ovs-vsctl list-ports br-int | grep c-3b
  qvo7ed8b59c-3b
  root@vector:~#

  But I can't ping the unit from within the qrouter-${id} namespace on
  the neutron gateway. If I tcpdump the {q,t}*c-3b interfaces, I don't
  see any traffic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1460164/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp