Public bug reported:

In looking through the retry mechanism for pluggable IPAM (e.g. [1]), I found it is not robust. It catches only a very narrow set of errors. Many other errors would not result in a rollback notification to the external IPAM system. Basically, if anything else fails during a port create and causes the DB transaction to be rolled back, the IP allocations will be forgotten by Neutron but an external IPAM will still remember them. No notification will be sent to the external system to reverse what it had done.
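
To make the gap concrete, here is a minimal, self-contained sketch of the failure mode described above. It is not Neutron code. FakeExternalIpam, IpamError, create_port and run_failed_create are hypothetical names made up for illustration, and the standard-library sqlite3 module stands in for the Neutron DB session. The only assumption it encodes is the one stated above: the undo path handles a narrow set of errors, while any other error still rolls back the DB transaction.

```python
import sqlite3


class FakeExternalIpam:
    """Hypothetical stand-in for an external IPAM system."""

    def __init__(self):
        self.allocated = set()
        self._next_host = 1

    def allocate(self):
        ip = f"10.0.0.{self._next_host}"
        self._next_host += 1
        self.allocated.add(ip)
        return ip

    def deallocate(self, ip):
        self.allocated.discard(ip)


class IpamError(Exception):
    """Hypothetical: the only error class the undo path handles."""


def create_port(db, ipam, fail_with=None):
    # The external allocation is a side effect outside the DB transaction.
    ip = ipam.allocate()
    try:
        # sqlite3 connection as context manager: commit on success,
        # roll back if the body raises.
        with db:
            db.execute("INSERT INTO ports (ip) VALUES (?)", (ip,))
            if fail_with is not None:
                raise fail_with
    except IpamError:
        # Narrow handler: only this error class undoes the allocation.
        ipam.deallocate(ip)
        raise
    return ip


def run_failed_create(error):
    """Return (rows left in the DB, IPs still held by the external IPAM)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE ports (ip TEXT)")
    db.commit()
    ipam = FakeExternalIpam()
    try:
        create_port(db, ipam, fail_with=error)
    except Exception:
        pass  # the caller only sees a failed port create
    db_rows = db.execute("SELECT COUNT(*) FROM ports").fetchone()[0]
    db.close()
    return db_rows, len(ipam.allocated)


print(run_failed_create(IpamError("handled")))       # (0, 0)
print(run_failed_create(RuntimeError("unhandled")))  # (0, 1)
```

In the second call the DB holds no port row, but the fake external system still holds one address, which is the leak this bug is about.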

There are a couple of options we could pursue. One is a decorator on the API operation that calls rollback if anything goes wrong. The other is to use a SQLAlchemy-level hook, after_transaction_end, to detect a DB rollback and call IPAM rollback. In both cases, the problem is where and how to do the bookkeeping. We need to record each successful (de)allocation from the external IPAM system immediately, somewhere that will still be available if a rollback is needed. One idea is to piggyback on the context, in session.info or somewhere like that. This discussion in IRC [2] might be useful.

[1] https://github.com/openstack/neutron/blob/949aae6a8b92a77a06d04734bf82ed7a917057a7/neutron/db/ipam_pluggable_backend.py#L129-L136
[2] http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2016-08-03.log.html#t2016-08-03T18:08:58

** Affects: neutron
     Importance: High
         Status: Confirmed

** Tags: l3-ipam-dhcp

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

** Tags added: l3-ipam-dhcp

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1610483

Title:
  Pluggable IPAM rollback mechanism is not robust

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1610483/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp