Re: [DISCUSS] Proposal to support disk spilling in HeapKeyedStateBackend

Yu Li Thu, 30 May 2019 01:32:42 -0700

Thanks all for the feedbacks here. And thanks for the reminder about the
plan of reworking abstractions of snapshot strategies @Gordon, will
definitely watch it and make sure of the collaboration of the two works.


Since this discussion thread has been open for some time and we all get a
consensus, will close it now and start creating JIRAs. On the other hand,
discussion in design doc will continue for sure and just let me know if you
have any suggestions there or in the coming JIRAs.

Thanks again for all participants in the discussion!

Best Regards,
Yu


On Thu, 30 May 2019 at 15:45, Tzu-Li (Gordon) Tai <[email protected]>
wrote:

> Hi Yu,
>
> +1 to move forward with efforts to add this feature.
>
> As mentioned in the document as well as some offline discussions, from my
> side the only comments I have are related to how we snapshot the off-heap
> key groups.
> I think a recent discussion I posted about savepoint format unification for
> keyed state as well as reworking abstractions for snapshot strategies [1]
> will be relevant here.
>
> Cheers,
> Gordon
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-41-Unify-Keyed-State-Snapshot-Binary-Format-for-Savepoints-td29197.html
>
> On Wed, May 29, 2019 at 5:08 PM Yuzhao Chen <[email protected]> wrote:
>
> > +1, thanks for you nice work, Yu Li !
> >
> > Best,
> > Danny Chan
> > 在 2019年5月24日 +0800 PM8:51，Yu Li <[email protected]>，写道：
> > > Hi All,
> > >
> > > As mentioned in our speak[1] given in FlinkForwardChina2018, we have
> > > improved HeapKeyedStateBackend to support disk spilling and put it in
> > > production here in Alibaba for last year's Singles' Day. Now we're
> ready
> > to
> > > upstream our work and the design doc is up for review[2]. Please let us
> > > know your point of the feature and any comment is welcomed/appreciated.
> > >
> > > We plan to keep the discussion open for at least 72 hours, and will
> > create
> > > umbrella jira and subtasks if no objections. Thanks.
> > >
> > > Below is a brief description about the motivation of the work, FYI:
> > >
> > >
> > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> > since
> > > state lives as Java objects on the heap in HeapKeyedStateBackend and
> the
> > > de/serialization only happens during state snapshot and restore, it
> > > outperforms RocksDBKeyeStateBackend when all data could reside in
> > > memory.**However,
> > > along with the advantage, HeapKeyedStateBackend also has its
> > shortcomings,
> > > and the most painful one is the difficulty to estimate the maximum heap
> > > size (Xmx) to set, and we will suffer from GC impact once the heap
> memory
> > > is not enough to hold all state data. There’re several (inevitable)
> > causes
> > > for such scenario, including (but not limited to):*
> > >
> > >
> > >
> > > ** Memory overhead of Java object representation (tens of times of the
> > > serialized data size).* Data flood caused by burst traffic.* Data
> > > accumulation caused by source malfunction.**To resolve this problem, we
> > > proposed a solution to support spilling state data to disk before heap
> > > memory is exhausted. We will monitor the heap usage and choose the
> > coldest
> > > data to spill, and reload them when heap memory is regained after data
> > > removing or TTL expiration, automatically.*
> > >
> > > [1]
> > https://files.alicdn.com/tpsservice/1df9ccb8a7b6b2782a558d3c32d40c19.pdf
> > > [2]
> > >
> >
> https://docs.google.com/document/d/1rtWQjIQ-tYWt0lTkZYdqTM6gQUleV8AXrfTOyWUZMf4/edit?usp=sharing
> > >
> > > Best Regards,
> > > Yu
> >
>

Re: [DISCUSS] Proposal to support disk spilling in HeapKeyedStateBackend

Reply via email to