[ https://issues.apache.org/jira/browse/AURORA-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174119#comment-16174119 ]
Bill Farner commented on AURORA-1948:
-------------------------------------

A few things come to mind for me. Brain dump here:

- Constraints are stored in `TaskConfig`, which is sent to the executor and saved. Ideally, the executor would never receive this information, since it is irrelevant there. However, it is a potential source of confusion if the scheduler independently changes the value.
- This breaks into new territory by having the scheduler determine specific fields in a `TaskConfig` that do not result in an instance reboot. In this case, it should be trivially accomplished by detecting that a change is isolated to the constraints, and re-running the constraint matcher against existing instances.

> Adding instances leads to constraints conflict
> ----------------------------------------------
>
>                 Key: AURORA-1948
>                 URL: https://issues.apache.org/jira/browse/AURORA-1948
>             Project: Aurora
>          Issue Type: Story
>          Components: Scheduler
>            Reporter: Vladimir
>            Priority: Minor
>
> Problem:
> When scaling a job up (adding more instances) there can be a constraint
> conflict.
> Example:
> Let's say you have a Mesos cluster with 3 racks. You want to deploy a
> service and create an Aurora job with the "rack" constraint "limit" set
> to 1, which means no more than 1 instance per rack. The job has its
> number of instances set to 2, for example. The deployment will succeed
> and the user will get 2 instances running on 2 different racks.
> Next, the user would like to scale it to 4 instances by adding 2 more
> instances. In this case, if the user does not update the rack constraint
> (set the limit to 2 or larger), the job update will fail, showing
> "Limit not satisfied: rack". If the user does modify the constraints,
> then a regular job update takes place, which adds the new instances but
> also updates the existing ones (rolling deploy).
>
> Proposal:
> The proposal is to be able to update the constraints while adding new
> instances, in order to satisfy the limits and be sure that the currently
> running instances won't be redeployed.
> So in my example, when scaling up we can recalculate the rack limit on
> our end, update its value in the job config, update the number of
> instances and start a job update. Aurora would leave the 2 currently
> running instances as they are and only add two new instances, making
> sure that the new constraints are satisfied.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
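
For illustration only, the check described in the comment above (detect that a change is isolated to the constraints, then re-run the limit check against the existing instances) might look roughly like the sketch below. This is not Aurora code: the `TaskConfig` here is a hypothetical, heavily simplified stand-in, and `only_constraints_changed`, `limits_satisfied` and `min_rack_limit` are made-up helper names. It also assumes that a "limit" constraint means "at most N instances per distinct attribute value", as in the rack example.

```python
import math
from collections import Counter
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical, simplified stand-in for Aurora's TaskConfig."""
    command: str
    cpus: float
    # (attribute name, limit) pairs, e.g. (("rack", 1),)
    limits: Tuple[Tuple[str, int], ...] = ()


def only_constraints_changed(old: TaskConfig, new: TaskConfig) -> bool:
    """True when the configs differ in their constraints and in nothing else."""
    return old != new and replace(old, limits=new.limits) == new


def limits_satisfied(config: TaskConfig, hosts: List[Dict[str, str]]) -> bool:
    """Re-check every limit against the hosts that already run an instance."""
    for attribute, limit in config.limits:
        per_value = Counter(attrs.get(attribute) for attrs in hosts)
        if any(count > limit for count in per_value.values()):
            return False
    return True


def min_rack_limit(instances: int, racks: int) -> int:
    """Smallest per-rack limit that can hold `instances` spread over `racks`."""
    return math.ceil(instances / racks)


# The example from the issue: 2 instances on 2 of 3 racks, limit raised 1 -> 2.
old = TaskConfig("run_service", 1.0, (("rack", 1),))
new = TaskConfig("run_service", 1.0, (("rack", 2),))
running = [{"rack": "r1"}, {"rack": "r2"}]

if only_constraints_changed(old, new) and limits_satisfied(new, running):
    print("existing instances can stay; only the new instances are scheduled")
print("limit needed for 4 instances on 3 racks:", min_rack_limit(4, 3))
```

Under these assumptions, the two running instances already satisfy the relaxed limit, so only the added instances would have to go through scheduling. A change that tightens a constraint could fail the re-check, in which case the affected instances would still have to be moved.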