I like it a lot!
I think it makes sense to clean this up despite the planned new
fault-tolerance mechanisms. In the future, users will decide which
mechanism to use and I can imagine that a lot of them will keep using
the current mechanism for quite a while to come. But I'm happy to yield
to Stephan's opinion here; he knows more about the progress of that work.
The one nitpick I have is about naming: will users understand
OnHeapStateBackend? I mean, do they know what on-heap/off-heap memory is
and the tradeoffs? An alternative could be HashMapStateBackend, because
that's essentially what it is. I wouldn't block anything on this, though.
Aljoscha
On 09.09.20 10:05, Konstantin Knauf wrote:
Thanks for the initiative. Big +1. Would be interested to hear if the
proposed interfaces still make sense in the face of the new fault-tolerance
work that is planned. Stephan/Piotr will know.
On Tue, Sep 8, 2020 at 7:05 PM Seth Wiesman <sjwies...@gmail.com> wrote:
Hi Devs,
I'd like to propose an update to how state backends and checkpoint storage
are configured to help users better understand Flink.
Apache Flink's durability story is a mystery to many users. One of the most
common recurring questions from users comes from not understanding the
relationship between state, state backends, and snapshots. Some of this
confusion can be abated with learning material, but the question is so
pervasive that we believe Flink's user APIs should better communicate
what different components are responsible for.
https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing
I look forward to a healthy discussion.
Seth