** Also affects: nova/rocky Importance: Undecided Status: New ** Also affects: nova/queens Importance: Undecided Status: New
** Also affects: nova/stein Importance: Undecided Status: New ** Changed in: nova/queens Status: New => In Progress ** Changed in: nova/stein Status: New => Fix Released ** Changed in: nova/queens Importance: Undecided => Low ** Changed in: nova/rocky Importance: Undecided => Low ** Changed in: nova/queens Assignee: (unassigned) => Matthew Booth (mbooth-9) ** Changed in: nova/stein Importance: Undecided => Low ** Changed in: nova/stein Assignee: (unassigned) => Matthew Booth (mbooth-9) ** Changed in: nova/rocky Assignee: (unassigned) => Matthew Booth (mbooth-9) ** Changed in: nova/rocky Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1821373 Title: Most instance actions can be called concurrently Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: In Progress Status in OpenStack Compute (nova) rocky series: Fix Released Status in OpenStack Compute (nova) stein series: Fix Released Bug description: A customer reported that they were getting DB corruption if they called shelve twice in quick succession on the same instance. This should be prevented by the guard in nova.API.shelve, which does: instance.task_state = task_states.SHELVING instance.save(expected_task_state=[None]) This is intended to act as a robust gate against 2 instance actions happening concurrently. The first will set the task state to SHELVING, the second will fail because the task state is not SHELVING. The comparison is done atomically in db.instance_update_and_get_original(), and should be race free. However, instance.save() shortcuts if there is no update and does not call db.instance_update_and_get_original(). Therefore this guard fails if we call the same operation twice: instance = get_instance() => Returned instance.task_state is None instance.task_state = task_states.SHELVING instance.save(expected_task_state=[None]) => task_state was None, now SHELVING, updates = {'task_state': SHELVING} => db.instance_update_and_get_original() executes and succeeds instance = get_instance() => Returned instance.task_state is SHELVING instance.task_state = task_states.SHELVING instance.save(expected_task_state=[None]) => task_state was SHELVING, still SHELVING, updates = {} => db.instance_update_and_get_original() does not execute, therefore doesn't raise the expected exception This pattern is common to almost all instance actions in nova api. A quick scan suggests that all of the following actions are affected by this bug, and can therefore all potentially be executed multiple times concurrently for the same instance: restore force_stop start backup snapshot soft reboot hard reboot rebuild revert_resize resize shelve shelve_offload unshelve pause unpause suspend resume rescue unrescue set_admin_password live_migrate evacuate To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1821373/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp