Public bug reported:

Two functions used for error cleanup in _do_build_and_run_instance, _cleanup_allocated_networks and _set_instance_obj_error_state, each call an unguarded instance.save(). The problem is that the instance object may already have been in an unclean state when the build exception was raised. Calling instance.save() persists those unclean changes in addition to whatever change was made during cleanup, which is not intended.

Specifically, when a build races with a delete, the build can fail when we attempt an atomic save to set vm_state to active, raising UnexpectedDeletingTaskStateError. However, the instance object still contains the unpersisted vm_state change along with other concomitant changes. These are all persisted when _cleanup_allocated_networks calls instance.save(). As a result, the instance.save(expected_task_state=SPAWNING) that correctly failed because of the race effectively succeeds later, by accident, during cleanup, leaving the instance in an inconsistent state.

** Affects: nova
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1848666

Title:
  Race can cause instance to become ACTIVE after build error

Status in OpenStack Compute (nova):
  New
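
The race described in this report can be sketched with a toy model. This is only an illustration under stated assumptions, not nova code: FakeInstance, its dirty-field tracking, the db_row dict, and the networks_cleaned field are hypothetical stand-ins, and the exception class merely reuses the name from the report.

```python
class UnexpectedDeletingTaskStateError(Exception):
    """Toy stand-in for the exception named in the report."""


class FakeInstance:
    """Hypothetical instance object that tracks unpersisted changes."""

    def __init__(self, db_row):
        self._db = db_row      # dict playing the role of the database row
        self._changes = {}     # field changes not yet persisted

    def __setattr__(self, name, value):
        if name.startswith("_"):
            super().__setattr__(name, value)
        else:
            # Public fields are only recorded as pending changes.
            self._changes[name] = value

    def save(self, expected_task_state=None):
        # Guarded save: refuse to write if the row is not in the expected state.
        if (expected_task_state is not None
                and self._db["task_state"] != expected_task_state):
            raise UnexpectedDeletingTaskStateError()
        # Unguarded save: flush every pending change, whatever its origin.
        self._db.update(self._changes)
        self._changes.clear()


def simulate_race():
    db_row = {"vm_state": "building", "task_state": "spawning"}
    instance = FakeInstance(db_row)

    # A concurrent delete updates the row behind the build's back.
    db_row["task_state"] = "deleting"

    # The build finishes and tries to mark the instance active.
    instance.vm_state = "active"
    instance.task_state = None
    try:
        instance.save(expected_task_state="spawning")
    except UnexpectedDeletingTaskStateError:
        # Cleanup makes its own (hypothetical) change and saves unguarded.
        # The stale vm_state/task_state changes are flushed along with it.
        instance.networks_cleaned = True
        instance.save()
    return db_row


if __name__ == "__main__":
    print(simulate_race())
```

In this sketch the guarded save fails as intended, yet the row still ends up with vm_state set to active, because the cleanup save flushes the pending changes left on the object. Discarding or reloading the object's pending changes before the cleanup save is one conceivable way to avoid this in the toy model; the report itself does not prescribe a fix.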