On 06/20/2018 10:00 AM, Sylvain Bauza wrote:
> When we reviewed the spec, we agreed as a community to say that we should still get race conditions once the series is implemented, but at least it helps operators. Quoting:
>
> "It would also be possible for another instance to steal NUMA resources from a live migrated instance before the latter’s destination compute host has a chance to claim them. Until NUMA resource providers are implemented [3] <https://review.openstack.org/#/c/552924/> and allow for an essentially atomic schedule+claim operation, scheduling and claiming will keep being done at different times on different nodes. Thus, the potential for races will continue to exist."
>
> https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/numa-aware-live-migration.html#proposed-change

My understanding of that quote was that we were acknowledging the fact that when using the ResourceTracker there was an unavoidable race window between the time when the scheduler selected a compute node and when the resources were claimed on that compute node in check_can_live_migrate_destination(). And in this model no resources are actually *used* until they are claimed.
As I understand it, Artom is proposing to have a larger race window, essentially from when the scheduler selects a node until the resource audit runs on that node.
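
To make the window concrete, here is a toy sketch of the schedule-then-claim pattern described above. It is not Nova code: `ToyHost`, `schedule`, and `migrate_with_race` are made-up names for illustration, and CPUs stand in for NUMA resources. The only point is that the scheduler's check reserves nothing, so anything that claims between the check and the destination's own claim can take the resources first; the longer that gap, the more opportunity there is for this to happen.

```python
class ToyHost:
    """Hypothetical stand-in for a compute node; not Nova's ResourceTracker."""

    def __init__(self, free_cpus):
        self.free_cpus = free_cpus

    def claim(self, cpus):
        """Claim CPUs if enough are free; return True on success."""
        if cpus > self.free_cpus:
            return False
        self.free_cpus -= cpus
        return True


def schedule(host, cpus):
    """Scheduler-side check: inspects free resources but reserves nothing."""
    return host.free_cpus >= cpus


def migrate_with_race(host, cpus, competitor_cpus):
    """Schedule, let a competitor claim inside the window, then claim."""
    selected = schedule(host, cpus)          # scheduler selects the host
    stolen = host.claim(competitor_cpus)     # another instance claims first
    claimed = selected and host.claim(cpus)  # destination claim happens later
    return selected, stolen, claimed


result = migrate_with_race(ToyHost(free_cpus=4), cpus=4, competitor_cpus=2)
print(result)
```

In this run the host is selected, the competitor's claim succeeds, and the migrated instance's later claim fails, which is the race the spec acknowledges.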
Chris
