Hello,

I'm trying to find a solution for auto scaling our Flink EMR cluster with 0
downtime using RocksDB as state storage and S3 backing store.

My current thoughts are like so:
* Scaling an operator dynamically would require all keyed state to be
available to the set of subtasks for that operator, therefore those
subtasks must be reading from and writing to the same RocksDB. I.e. to
scale that set of subtasks in and out, they all need to read from the
same RocksDB.
* Since subtasks can run on different core nodes, is it possible to have
different core nodes read/write to the same RocksDB?
* When is it safe to scale an operator in or out? Possibly only right
after a checkpoint?

If the above is not possible, then we'll have to use savepoints, which
means some downtime, therefore:
* Reacting quickly when scaling out during high traffic is arguably more
important than when scaling in during low traffic. Is it possible to add
more core nodes to EMR without disturbing a running job? If so, then
maybe we can orchestrate starting a new job on the new nodes and loading
a savepoint taken from the currently running job.
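To make that concrete, here's a rough sketch of the orchestration I'm imagining, using the standard Flink CLI and the EMR CLI. The instance group ID, job ID, bucket, parallelism, and jar name below are all placeholders, not our actual setup:

```shell
#!/usr/bin/env bash
set -euo pipefail

# 1. Grow the EMR cluster first, so the new slots exist before we restart
#    (ig-XXXXXXXX and the count are placeholders).
aws emr modify-instance-groups \
  --instance-groups InstanceGroupId=ig-XXXXXXXX,InstanceCount=8

# 2. Stop the running job, taking a savepoint to S3 (Flink >= 1.9 supports
#    stop-with-savepoint; <jobId> is a placeholder).
flink stop --savepointPath s3://my-bucket/flink-savepoints <jobId>

# 3. Resume from that savepoint at the new, higher parallelism.
flink run -d \
  -s s3://my-bucket/flink-savepoints/savepoint-XXXX \
  -p 16 \
  my-job.jar
```

One caveat I'm aware of: the new parallelism can't exceed the job's configured max parallelism, so that would need to be set generously up front. The downtime we'd like to avoid is the window between steps 2 and 3.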

Lastly:
* Savepoints for ~70 GiB of state take on the order of minutes to tens of
minutes for us to restore from. Is there any way to speed up restoration?

Thanks!

-- 

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/> |
FOLLOW US <https://twitter.com/remindhq> | LIKE US
<https://www.facebook.com/remindhq>
