Maxim Khutornenko created AURORA-1549:
-----------------------------------------

             Summary: Updater kills instances with scoped update
                 Key: AURORA-1549
                 URL: https://issues.apache.org/jira/browse/AURORA-1549
             Project: Aurora
          Issue Type: Bug
          Components: Scheduler
            Reporter: Maxim Khutornenko
            Assignee: Maxim Khutornenko


Consider the following sequence for the hello_world job with 3 instances:
{noformat}
aurora job create devcluster/www-data/prod/hello aurora/examples/jobs/hello_world.aurora
<change config to trigger update, e.g. change RAM>
aurora update start devcluster/www-data/prod/hello/0 aurora/examples/jobs/hello_world.aurora
aurora job kill devcluster/www-data/prod/hello/1
aurora update start devcluster/www-data/prod/hello/0,1 aurora/examples/jobs/hello_world.aurora
{noformat}

The expectation is to have all 3 instances on the same config.
The result: instance 0 is killed with only instances 1 and 2 remaining.

The problem is that [UpdateFactory|https://github.com/apache/aurora/blob/33d7e2170a86f54722a02a2dc9cb1e09fb52df25/src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java#L95-L101] iterates over scoped instances thus overriding the JobDiff results. This leads to [InstanceUpdater|https://github.com/apache/aurora/blob/d7a1619fa85195937e74d1b09594909f0ed0ffd5/src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java#L102-L107] killing any instances that are present in actual state but not present in the desired state.

These are the (correct) results produced by the [JobDiff|https://github.com/apache/aurora/blob/2e2371481d9aaccd6a45ad0f442d963d5ae7a3c8/src/main/java/org/apache/aurora/scheduler/updater/JobDiff.java#L185-L202] that should be used to drive the update instead:
{noformat}
"Unscoped diff contents:"
Replaced: [2]
Replacements: [1, 2]
Unchanged: [0]

"Scoped (final) diff contents:"
Replaced: []
Replacements: [1]
Unchanged: [2, 0]
{noformat}

The current behavior appears to be a leftover that should have been removed in this refactoring: https://reviews.apache.org/r/25969/.

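For illustration only, the diff contents quoted above can be reproduced with a small standalone sketch. This is not Aurora's JobDiff code: the class name ScopedDiffSketch, the diff method, and the use of plain strings ("old", "new") to stand in for task configs are all assumptions made for this example. It models the state before the second update: instance 0 already on the new config, instance 1 killed, instance 2 still on the old config.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch, not Aurora code: configs are modeled as plain strings.
class ScopedDiffSketch {

  // Holds the three buckets named in the issue's diff output.
  static final class Diff {
    final SortedSet<Integer> replaced = new TreeSet<>();      // running instances to take down
    final SortedSet<Integer> replacements = new TreeSet<>();  // instances to (re)create
    final SortedSet<Integer> unchanged = new TreeSet<>();     // running instances left alone

    @Override
    public String toString() {
      return "Replaced: " + replaced
          + " Replacements: " + replacements
          + " Unchanged: " + unchanged;
    }
  }

  // Computes a diff between actual and desired state.
  // An empty scope is treated as "all instances" (an assumption of this sketch).
  static Diff diff(Map<Integer, String> actual,
                   Map<Integer, String> desired,
                   Set<Integer> scope) {
    Diff result = new Diff();
    SortedSet<Integer> ids = new TreeSet<>(actual.keySet());
    ids.addAll(desired.keySet());

    for (int id : ids) {
      String current = actual.get(id);  // null if the instance is not running
      String target = desired.get(id);  // null if the instance is not desired
      boolean inScope = scope.isEmpty() || scope.contains(id);

      if (!inScope) {
        // Out-of-scope instances are never touched; if running, they stay as they are.
        if (current != null) {
          result.unchanged.add(id);
        }
        continue;
      }
      if (Objects.equals(current, target)) {
        result.unchanged.add(id);
      } else {
        if (current != null) {
          result.replaced.add(id);
        }
        if (target != null) {
          result.replacements.add(id);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // State before the second "update start": 0 updated, 1 killed, 2 on the old config.
    Map<Integer, String> actual = Map.of(0, "new", 2, "old");
    Map<Integer, String> desired = Map.of(0, "new", 1, "new", 2, "new");

    System.out.println("Unscoped: " + diff(actual, desired, Set.of()));
    System.out.println("Scoped:   " + diff(actual, desired, Set.of(0, 1)));
  }
}
```

With scope {0, 1}, the sketch puts instance 0 in the unchanged set and leaves the replaced set empty, which matches the scoped diff quoted above: nothing in that result calls for killing instance 0.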