Public bug reported:

Description
=========
When Galera is used in multi-writer mode, the instance_info_cache_update() DB API method may be called for the very same database row concurrently on two different MySQL servers. Due to how Galera works internally, this causes a deadlock exception for one of the callers (see http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/ for details).

instance_info_cache_update() is not currently retried on deadlock. When a deadlock occurs, the operation in question may fail, e.g. association of a floating IP.

Steps to reproduce
===============
1. Deploy a Galera cluster in multi-writer mode.
2. Ensure there are at least two nova-conductor services using two different MySQL servers in the Galera cluster.
3. Create an instance.
4. Associate / disassociate floating IPs concurrently (e.g. via Rally).

Expected result
=============
All associate / disassociate operations succeed.

Actual result
==========
One or more operations fail with an exception in python-novaclient:

  File "/usr/lib/python2.7/site-packages/novaclient/v2/servers.py", line 662, in remove_floating_ip
    self._action('removeFloatingIp', server, {'address': address})
  File "/usr/lib/python2.7/site-packages/novaclient/v2/servers.py", line 1279, in _action
    return self.api.client.post(url, body=body)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 449, in post
    return self._cs_request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 424, in _cs_request
    resp, body = self._time_request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 397, in _time_request
    resp, body = self.request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 391, in request
    raise exceptions.from_response(resp, body, url, method)
ClientException: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'oslo_db.exception.DBDeadlock'> (HTTP 500) (Request-ID: req-ac412e1c-afcf-4ef3-accc-b5463805ca74)

Environment
==========
OpenStack Liberty
Galera cluster (3 nodes) running in multiwriter mode

** Affects: nova
     Importance: Medium
     Assignee: Roman Podoliaka (rpodolyaka)
         Status: In Progress

** Tags: db

https://bugs.launchpad.net/bugs/1567336

Title:
  instance_info_cache_update() is not retried on deadlock

Status in OpenStack Compute (nova):
  In Progress
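
For illustration, the missing behaviour (retrying a DB API call when it fails with a deadlock) can be sketched roughly as follows. This is a minimal stdlib-only sketch, not the actual Nova change: the DBDeadlock class here is a local stand-in for oslo_db.exception.DBDeadlock, and retry_on_deadlock() and fake_instance_info_cache_update() are hypothetical names made up for this example. In a real deployment, oslo.db's wrap_db_retry decorator with retry_on_deadlock=True would presumably play this role.

```python
import functools
import time


class DBDeadlock(Exception):
    """Local stand-in for oslo_db.exception.DBDeadlock (assumption)."""


def retry_on_deadlock(max_retries=5, retry_interval=0.0):
    """Hypothetical decorator: re-run the call if it raises DBDeadlock.

    After max_retries failed retries the last DBDeadlock is re-raised.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    attempt += 1
                    if attempt > max_retries:
                        raise
                    # Give the competing writer time to commit.
                    time.sleep(retry_interval)
        return wrapper
    return decorator


# Simulated DB API call: deadlocks on the first two attempts, then
# succeeds, roughly as a losing Galera writer would after a retry.
calls = {"count": 0}


@retry_on_deadlock(max_retries=5)
def fake_instance_info_cache_update(instance_uuid, values):
    calls["count"] += 1
    if calls["count"] <= 2:
        raise DBDeadlock()
    return dict(values, instance_uuid=instance_uuid)


result = fake_instance_info_cache_update("uuid-1", {"network_info": "[]"})
```

With a retry in place, the caller sees a successful update instead of an HTTP 500, as long as the deadlock clears within the retry budget. If every attempt deadlocks, the exception still propagates to the caller.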