Our rolling update APIs can be quite inconvenient to work with when it
comes to instance scaling [1]. It's especially frustrating when
adding/removing instances has to be done in an automated fashion (e.g., by
an external autoscaling process), as it requires holding on to the original
Aurora config at all times.

I propose we add simple instance scaling APIs to address the above. Since
an Aurora job may have instances with different configs at any moment, I
propose we accept an InstanceKey as a reference point when scaling out. For
example:

    /** Scales out a given job by adding more instances with the task
        config of the templateKey. */
    Response scaleOut(1: InstanceKey templateKey, 2: i32 incrementCount)

    /** Scales in a given job by removing existing instances. */
    Response scaleIn(1: JobKey job, 2: i32 decrementCount)

A corresponding client command could then look like:

    aurora job scale-out devcluster/vagrant/test/hello/1 10

For the above command, the scheduler would take the task config of instance
1 of the 'hello' job and replicate it 10 more times, thus adding 10
additional instances to the job.
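
To make the intended semantics concrete, here is a rough in-memory sketch in
Python. It is not scheduler code: TaskConfig, scale_out and scale_in are
hypothetical stand-ins, and two behaviors are assumptions that the proposal
does not settle: new instance IDs continue after the highest existing ID, and
scale-in removes the highest-numbered instances first.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical stand-in for an Aurora task config."""
    command: str
    cpus: float


def scale_out(instances: Dict[int, TaskConfig], template_id: int,
              increment_count: int) -> Dict[int, TaskConfig]:
    """Return a new instance map with increment_count copies of the
    template instance's config appended."""
    if increment_count <= 0:
        raise ValueError("increment_count must be positive")
    if template_id not in instances:
        raise KeyError(f"no such instance: {template_id}")
    template = instances[template_id]
    # Assumption: new IDs continue after the highest existing ID.
    next_id = max(instances) + 1
    result = dict(instances)
    for new_id in range(next_id, next_id + increment_count):
        result[new_id] = template
    return result


def scale_in(instances: Dict[int, TaskConfig],
             decrement_count: int) -> Dict[int, TaskConfig]:
    """Return a new instance map with decrement_count instances removed."""
    if decrement_count <= 0 or decrement_count > len(instances):
        raise ValueError("decrement_count out of range")
    # Assumption: the highest-numbered instances are removed first.
    keep = sorted(instances)[:len(instances) - decrement_count]
    return {i: instances[i] for i in keep}
```

With a job whose instances 0 and 1 run different configs, scale_out(job, 1,
10) yields 12 instances, the 10 new ones sharing instance 1's config, and
instance 0 is left untouched.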

There are, of course, some details to work out, such as making sure no
active update is in flight and that scaling out does not violate quota. I
intend to address those during the implementation as things progress.
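
As a rough illustration of the two checks mentioned above, a pre-flight
validation could look something like the Python sketch below. Everything in
it is hypothetical (the Resources type, the check_scale_out name, and the
idea that quota is compared as simple CPU and RAM totals); the real checks
would be worked out during implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector used only for this sketch."""
    cpus: float
    ram_mb: int


def check_scale_out(update_in_flight: bool, per_instance: Resources,
                    increment_count: int, used: Resources,
                    quota: Resources) -> None:
    """Raise RuntimeError if the scale-out request should be rejected."""
    # Check 1: refuse to scale while an update is active for the job.
    if update_in_flight:
        raise RuntimeError("an update is in flight for this job")
    # Check 2: the added instances must still fit within the quota.
    # Assumption: quota is a plain sum over CPU and RAM.
    needed_cpus = used.cpus + per_instance.cpus * increment_count
    needed_ram = used.ram_mb + per_instance.ram_mb * increment_count
    if needed_cpus > quota.cpus or needed_ram > quota.ram_mb:
        raise RuntimeError("scale-out would exceed quota")
```

For example, adding 10 instances of 1 CPU each to a role that already uses 5
of its 10 CPUs would be rejected, while adding 5 would pass.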

Does the above make sense? Any concerns/suggestions?

Thanks,
Maxim

[1] - https://issues.apache.org/jira/browse/AURORA-1258
