Hi Stephan,

Regarding backward compatibility, I agree and the intention is that all
existing code will continue to function with the same semantics. My working
idea is to remove the two checkpoint-storage related methods from
StateBackend into a new SnapshotStorage interface but then have
AbstractFileStateBackend and RocksDBStateBackend implement snapshot
storage. If a state backend implements SnapshotStorage it will be used
unconditionally, even if a different snapshot storage implementation is
configured. This way we don't change any of the concrete classes that users
interact with. The only people who would see breaking changes are state
backend implementors and they only need to add `implements SnapshotStorage`
to their class.

The reason I went with SnapshotStorage is there is already an interface
`org.apache.flink.runtime.state.CheckpointStorage` in flink-runtime. If we
can rename this interface to something else I'm happy to take the name, but
if not I think it will lead to import confusion.

Seth

On Wed, Sep 16, 2020 at 11:54 AM Stephan Ewen <se...@apache.org> wrote:

> @Yun and @Congxian:
>
> I think "async", "incremental", and similar flags belong very much with the
> state backend (the index structure).
> They define how the snapshotting procedure behaves.
>
> The SnapshotStorage is really just about storage of checkpoint streams
> (bytes) and handles and pointers.
>
> Best,
> Stephan
>
>
> On Wed, Sep 16, 2020 at 6:48 PM Stephan Ewen <se...@apache.org> wrote:
>
> > Thanks for the great suggestion and the great discussion. Generally big
> +1
> > to this effort.
> > Some thoughts from my side:
> >
> > *## Backwards Compatibility*
> >
> > I think we should really strive to make this non breaking. Maybe we have
> > new classes / interfaces for StateBackends and CheckpointStorage and let
> > the existing State Backend classes implement both (and deprecate them)?
> >
> > In the past, I have gotten some harsh comments from users about breaking
> > long-time effectively stable APIs, so let's try hard to avoid this
> (unless
> > it makes things impossible).
> >
> > *## Naming*
> >
> > HashMapStateBackend sounds good to me
> >
> > Could we rename the SnapshotStorage to CheckpointStorage? Or converge all
> > methods around "Snapshot"?
> > I think we already have some confusion from mixing the terms checkpoint
> > and snapshot and should converge in either direction.
> > I am slightly leaning towards converging around checkpoints, because
> > that's the most commonly known term among users as far as I can tell.
> > Checkpoints are Snapshots. But one could also just call them Checkpoints
> > and let Savepoints be special Checkpoints.
> >
> > *## Integrated State / Storage Backends*
> >
> > There is an idea floating around now and then about a Cassandra backend
> > (or other K/V store) where the state index and durable location are
> tightly
> > intertwined.
> > However, I think this would not contradict, because it might just mean
> > that the checkpoint storage is used less (maybe only for savepoints, or
> for
> > WALs).
> >
> > *## Future Fault Tolerance Ideas*
> >
> > I think this conflicts with none of the future fault tolerance ideas I am
> > involved with.
> > Similar to the above, there is always some checkpoint storage involved,
> > for example for savepoints or for backup/consolidation, so no problem.
> >
> >
> > Best,
> > Stephan
> >
> > On Wed, Sep 16, 2020 at 5:11 PM Aljoscha Krettek <aljos...@apache.org>
> > wrote:
> >
> >> I think the mentioned settings should be in the state backend. They
> >> configure how a certain backend writes to a snapshot storage, but it’s
> >> still the backend that has the logic and decides.
> >>
> >> I think it's a good point, though, to be conscious about those settings.
> >> I'm sure we can figure out the details during implementation, though.
> >>
> >> Best,
> >> Aljoscha
> >>
> >> On 16.09.20 16:54, Seth Wiesman wrote:
> >> > Hi Congxian,
> >> >
> >> > There is an allusion to those configs in the wiki but let me better
> >> spell
> >> > out my thinking. The flink-conf configurations will not change and I
> >> > believe the java code switches should remain on the state backend
> >> objects.
> >> >
> >> > We are of course not fully disentangling state backends from snapshots
> >> and
> >> > these configurations affect how your state backend runs in
> production. I
> >> > believe users would find it strange to have configurations like
> >> > `state.backend.rocksdb.checkpoint.transfer.thred.num` not be part of
> the
> >> > EmbeddedRocksdbStateBackend but somewhere else. This then leads to the
> >> > question, is it better to split configurations between multiple places
> >> or
> >> > not. Users appreciate consistency, and so having all the
> configurations
> >> on
> >> > the state backend objects makes them more discoverable and your
> >> application
> >> > easier to reason about.
> >> >
> >> > Additionally, I view these as advanced configurations. My hope is most
> >> > users can simply use the no-arg constructor for a state backend in
> >> > production. If a user is changing the number of rocksdb transfer
> >> threads or
> >> > disabling async checkpoints, they likely know what they are doing.
> >> >
> >> > Please let me know if you have any concerns or would like to cancel
> the
> >> > vote.
> >> >
> >> > Seth
> >> >
> >> > On Wed, Sep 16, 2020 at 12:37 AM Congxian Qiu <qcx978132...@gmail.com
> >
> >> > wrote:
> >> >
> >> >> Sorry for jump late in.
> >> >>
> >> >> I like the separation here, this separation makes more user friendly
> >> now.
> >> >>
> >> >> I just wonder how the configuration such as
> >> 'state.backend.incremental',
> >> >> 'state.backend.async' and
> >> >> `state.backend.rocksdb.checkpoint.transfer.thred.num` will be
> >> configured
> >> >> after the separation, I think these configurations are more related
> to
> >> >> snapshots (maybe a little strange to configure these on statebackend
> >> side).
> >> >> did not see this on the FLIP wiki currently.
> >> >>
> >> >> Best,
> >> >> Congxian
> >> >>
> >> >>
> >> >> Seth Wiesman <sjwies...@gmail.com> 于2020年9月15日周二 下午9:51写道:
> >> >>
> >> >>> Sounds good to me. I'll update the FLIP.
> >> >>>
> >> >>> On Tue, Sep 15, 2020 at 8:35 AM Dawid Wysakowicz <
> >> dwysakow...@apache.org
> >> >>>
> >> >>> wrote:
> >> >>>
> >> >>>> There is a good number of precedents that introduced backwards
> >> >>>> incompatible changes to that interface (which is PublicEvolving
> btw).
> >> >> We
> >> >>>> introduced a couple of additional arguments to the
> >> >>>> createKeyedStateBackend method and later on removed the methods
> with
> >> >>>> default implementation for backwards compatibility. I want to
> >> introduce
> >> >>>> a backward incompatible change in FLIP-140 (replace the
> >> >>>> AbstractKeyedStateBackend with an interface). From my perspective
> we
> >> >>>> should just do these changes. The impact should be minimal.
> >> >>>>
> >> >>>> Best,
> >> >>>>
> >> >>>> Dawid
> >> >>>>
> >> >>>>
> >> >>>> On 15/09/2020 15:20, Seth Wiesman wrote:
> >> >>>>> Hey Dawid,
> >> >>>>>
> >> >>>>> I didn't want to break compatibility but if there is precedent and
> >> >>>> everyone
> >> >>>>> is ok with it then I'm +1.
> >> >>>>>
> >> >>>>> Seth
> >> >>>>>
> >> >>>>> On Tue, Sep 15, 2020 at 2:22 AM Dawid Wysakowicz <
> >> >>> dwysakow...@apache.org
> >> >>>>>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>>> Sorry for joining so late.
> >> >>>>>>
> >> >>>>>> Generally speaking I like this idea very much!
> >> >>>>>>
> >> >>>>>> I have one idea about the StateBackend interface. Could we
> instead
> >> >> of
> >> >>>>>> adding a flag method boolean isLegacyStateBackend remove the
> >> >>>>>> checkpointstorage related methods from StateBackend right away?
> The
> >> >>>>>> old/legacy implementations could then implement both StateBackend
> >> >> and
> >> >>>>>> SnapshotStorage. In turn in the method env.setStateBackend we
> could
> >> >>> do:
> >> >>>>>>
> >> >>>>>> setStateBackend(StateBackend backend) {
> >> >>>>>>
> >> >>>>>>      this.stateBackend = backend;
> >> >>>>>>
> >> >>>>>>      if (backend instanceof SnapshotStorage) {
> >> >>>>>>
> >> >>>>>>           this.setSnapshotStorage(backend);
> >> >>>>>>
> >> >>>>>>      }
> >> >>>>>>
> >> >>>>>> }
> >> >>>>>>
> >> >>>>>> This has the benefit that we could already get rid off the
> methods
> >> >>> from
> >> >>>>>> StateBackend which would be problematic in the new
> implementations
> >> >>> (such
> >> >>>>>> as e.g. HashMapStateBackend - what would you return there?
> null?).
> >> I
> >> >>>>>> know this would break the interface, but StateBackend is actually
> >> >>> quite
> >> >>>>>> internal, we did it quite freely in the past, and I don't think
> >> >> there
> >> >>>>>> are many custom state implementation in the wild. And even if
> there
> >> >>> are
> >> >>>>>> some the workaround is as easy as simply adding implements
> >> >>>> SnapshotStorage.
> >> >>>>>>
> >> >>>>>> Best,
> >> >>>>>>
> >> >>>>>> Dawid
> >> >>>>>>
> >> >>>>>> On 11/09/2020 16:48, Aljoscha Krettek wrote:
> >> >>>>>>> I could try and come up with a longer name if you need it ...
> ;-)
> >> >>>>>>>
> >> >>>>>>> Aljoscha
> >> >>>>>>>
> >> >>>>>>> On 11.09.20 16:25, Seth Wiesman wrote:
> >> >>>>>>>> Having thought about it more, HashMapStateBackend has won me
> >> over.
> >> >>>> I'll
> >> >>>>>>>> update the FLIP. If there aren't any more comments I'll open it
> >> up
> >> >>> for
> >> >>>>>>>> voting on monday.
> >> >>>>>>>>
> >> >>>>>>>> Seth
> >> >>>>>>>>
> >> >>>>>>>> On Wed, Sep 9, 2020 at 9:09 AM Seth Wiesman <
> sjwies...@gmail.com
> >> >
> >> >>>>>> wrote:
> >> >>>>>>>>> @Yun yes, this is really about making CheckpointStorage an
> >> >>> orthogonal
> >> >>>>>>>>> concept. I think we can remain pragmatic and keep
> state-backend
> >> >>>>>>>>> specific
> >> >>>>>>>>> configurations (async, incremental, etc) in the state backend
> >> >>>>>>>>> themselves. I
> >> >>>>>>>>> view these as more advanced configurations and by the time
> >> >> someone
> >> >>> is
> >> >>>>>>>>> changing the defaults they likely understand what is going on.
> >> My
> >> >>>>>>>>> goal is
> >> >>>>>>>>> to help on-board users and so long as each state backend has a
> >> >>> no-arg
> >> >>>>>>>>> default constructor that works for many users I think we've
> >> >>> achieved
> >> >>>>>>>>> that
> >> >>>>>>>>> goal.
> >> >>>>>>>>>
> >> >>>>>>>>> Regarding the checkpoint coordinator, that makes sense but I
> >> will
> >> >>>>>>>>> consider
> >> >>>>>>>>> out of the scope of this FLIP. I want to focus on simplifying
> >> >> APIs.
> >> >>>>>>>>>
> >> >>>>>>>>> @Aljoscha Krettek <aljos...@apache.org>
> >> >>>>>>>>>
> >> >>>>>>>>> My feeling is that state backends and checkpointing are going
> to
> >> >> be
> >> >>>>>>>>> integral to Flink for many years, regardless or other
> >> >> enhancements
> >> >>>>>>>>> so this
> >> >>>>>>>>> change is still valuable.
> >> >>>>>>>>>
> >> >>>>>>>>> Since this is a FLIP about improving the user api I'm happy to
> >> >>>> bikeshed
> >> >>>>>>>>> the names a little more than normal. HashMap makes sense, my
> >> >> other
> >> >>>>>>>>> thought
> >> >>>>>>>>> was InMemory.
> >> >>>>>>>>>
> >> >>>>>>>>> Seth
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> On Wed, Sep 9, 2020 at 8:04 AM Aljoscha Krettek <
> >> >>> aljos...@apache.org
> >> >>>>>
> >> >>>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> I like it a lot!
> >> >>>>>>>>>>
> >> >>>>>>>>>> I think it makes sense to clean this up despite the planned
> new
> >> >>>>>>>>>> fault-tolerance mechanisms. In the future, users will decide
> >> >> which
> >> >>>>>>>>>> mechanism to use and I can imagine that a lot of them will
> keep
> >> >>>> using
> >> >>>>>>>>>> the current mechanism for quite a while to come. But I'm
> happy
> >> >> to
> >> >>>>>>>>>> yield
> >> >>>>>>>>>> to Stephan's opinion here, he knows more about the progress
> of
> >> >>> that
> >> >>>>>>>>>> work.
> >> >>>>>>>>>>
> >> >>>>>>>>>> The one nitpick I have is about naming: will users understand
> >> >>>>>>>>>> OnHeapStateBackend? I mean, do they know what
> on-heap/off-heap
> >> >>>>>>>>>> memory is
> >> >>>>>>>>>> and the tradeoffs? An alternative could be
> HashMapStateBackend,
> >> >>>>>>>>>> because
> >> >>>>>>>>>> that's essentially what it is. I wouldn't block anything on
> >> >> this,
> >> >>>>>>>>>> though.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Aljoscha
> >> >>>>>>>>>>
> >> >>>>>>>>>> On 09.09.20 10:05, Konstantin Knauf wrote:
> >> >>>>>>>>>>> Thanks for the initiative. Big +1. Would be interested to
> hear
> >> >> if
> >> >>>> the
> >> >>>>>>>>>>> proposed interfaces still make sense in the face of the new
> >> >>>>>>>>>> fault-tolerance
> >> >>>>>>>>>>> work that is planned. Stephan/Piotr will know.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> On Tue, Sep 8, 2020 at 7:05 PM Seth Wiesman <
> >> >> sjwies...@gmail.com
> >> >>>>
> >> >>>>>>>>>> wrote:
> >> >>>>>>>>>>>> Hi Devs,
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I'd like to propose an update to how state backends and
> >> >>> checkpoint
> >> >>>>>>>>>> storage
> >> >>>>>>>>>>>> are configured to help users better understand Flink.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Apache Flink's durability story is a mystery to many users.
> >> >> One
> >> >>>>>>>>>>>> of the
> >> >>>>>>>>>> most
> >> >>>>>>>>>>>> common recurring questions from users comes from not
> >> >>>>>>>>>>>> understanding the
> >> >>>>>>>>>>>> relationship between state, state backends, and snapshots.
> >> >> Some
> >> >>>>>>>>>>>> of this
> >> >>>>>>>>>>>> confusion can be abated with learning material but the
> >> >> question
> >> >>>>>>>>>>>> is so
> >> >>>>>>>>>>>> pervasive that we believe Flink’s user APIs should be
> better
> >> >>>>>>>>>> communicate
> >> >>>>>>>>>>>> what different components are responsible for.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I look forward to a healthy discussion.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Seth
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >>
>

Reply via email to