Reviewed:  https://review.openstack.org/257059
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9c3c19f07ce52e139d431aec54341c38a183f0b7
Submitter: Jenkins
Branch:    master

commit 9c3c19f07ce52e139d431aec54341c38a183f0b7
Author: Kevin Benton <ke...@benton.pub>
Date:   Thu Feb 18 03:48:29 2016 -0800

    Add ALLOCATING state to routers

    This patch adds a new ALLOCATING status to routers
    to indicate that the routers are still being built on the
    Neutron server. Any routers in this state are excluded in
    router retrievals by the L3 agent since they are not yet
    ready to be wired up.

    This is necessary when a router is made up of several
    distinct Neutron resources that cannot all be put
    into a single transaction. This patch applies this new
    state to HA routers while their internal HA ports and
    networks are being created/deleted so the L3 HA agent will
    never retrieve a partially formed HA router. It's important
    to note that the ALLOCATING status carries over until after
    the scheduling is done, which ensures that routers that weren't
    fully scheduled will not be sent to the agents.

    An HA router is placed in this state only when it is being
    created or converted to/from the HA state since this is
    disruptive to the dataplane.

    This patch also reverts the changes introduced in
    Iadb5a69d4cbc2515fb112867c525676cadea002b since they will
    be handled by the ALLOCATING logic instead.

    Co-Authored-By: Ann Kamyshnikova <akamyshnik...@mirantis.com>
    Co-Authored-By: John Schwarz <jschw...@redhat.com>

    APIImpact
    Closes-Bug: #1550886
    Related-bug: #1499647
    Change-Id: I22ff5a5a74527366da8f82982232d4e70e455570

** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1550886

Title:
  L3 Agent's fullsync is raceful with creation of HA router

Status in neutron:
  Fix Released

Bug description:
  When creating an HA router, after the server creates all the DB
  objects (including the HA network and ports if it's the first one),
  the server continues on to schedule the router to (some of) the
  available agents.
  The race occurs when an L3 agent issues a sync_router request, which
  later down the line ends up in an auto_schedule_routers() call. If
  this happens before the scheduling above (done as part of
  create_router()) is complete, the server will refuse to schedule the
  router to the other intended L3 agents, resulting in fewer agents
  being scheduled.

  The only ways to fix this are restarting one of the L3 agents that
  didn't get scheduled or recreating the router. Both are bad options.

  An example of the state:

  $ neutron l3-agent-list-hosting-router router2
  +--------------------------------------+-------------------------+----------------+-------+----------+
  | id                                   | host                    | admin_state_up | alive | ha_state |
  +--------------------------------------+-------------------------+----------------+-------+----------+
  | d05da32b-34e7-4c7f-b0dd-938328a0c0ed | vpn-6-12                | True           | :-)   | active   |
  +--------------------------------------+-------------------------+----------------+-------+----------+

  (Only 1 of the agents got scheduled with the router, even though
  there are 3 suitable agents that normally get scheduled without the
  race.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1550886/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
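
For readers following the thread, here is a toy Python model of the idea in the commit message: a router stays in ALLOCATING until scheduling finishes, and agent syncs skip routers in that state. It is only a sketch. `Router`, `routers_for_agent_sync`, `finish_creation`, the `ACTIVE` constant and the agent names are hypothetical and are not Neutron's actual classes or functions.

```python
# Toy model only: every name here is hypothetical, not Neutron's real code.
from dataclasses import dataclass, field

ACTIVE = "ACTIVE"
ALLOCATING = "ALLOCATING"


@dataclass
class Router:
    name: str
    status: str = ALLOCATING  # new routers start out "still being built"
    agents: list = field(default_factory=list)


def routers_for_agent_sync(routers):
    """Return the routers an agent sync may see; ALLOCATING ones are skipped."""
    return [r for r in routers if r.status != ALLOCATING]


def finish_creation(router, candidate_agents):
    """Schedule the router to every candidate agent, then leave ALLOCATING."""
    router.agents = list(candidate_agents)
    router.status = ACTIVE


routers = [Router("router2")]

# A sync that arrives while the router is still being built sees nothing,
# so it cannot interfere with the scheduling that has not finished yet.
seen_during_creation = routers_for_agent_sync(routers)

finish_creation(routers[0], ["agent-1", "agent-2", "agent-3"])

# Once scheduling is complete the router becomes visible to agent syncs.
seen_after_creation = routers_for_agent_sync(routers)
```

In this model the status flips only after all agents are assigned, mirroring the commit message's note that ALLOCATING carries over until scheduling is done.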