Hi!

This is an interesting suggestion.
Just to make sure I understand it correctly: Do you design this for cases
where the state per machine is larger than that machines memory/disk? And
in that case, you cannot solve the problem by scaling out (having more
machines)?

Stephan


On Tue, Jan 17, 2017 at 6:29 AM, Chen Qin <c...@uber.com> wrote:

> Hi there,
>
> I would like to discuss split over local states to external storage. The
> use case is NOT another external state backend like HDFS, rather just to
> expand beyond what local disk/ memory can hold when large key space exceeds
> what task managers could handle. Realizing FLINK-4266 might be hard to
> tacking all-in-one, I would live give a shot to split-over first.
>
> An intuitive approach would be treat HeapStatebackend as LRU cache and
> split over to external key/value storage when threshold triggered. To make
> this happen, we need minor refactor to runtime and adding set/get logic.
> One nice thing of keeping HDFS to store snapshots would be avoid versioning
> conflicts. Once checkpoint restore happens, partial write data will be
> overwritten with previously checkpointed value.
>
> Comments?
>
> --
> -Chen Qin
>

Reply via email to