Done: https://issues.apache.org/jira/browse/FLINK-7930
Best,
Flavio

On Thu, Oct 26, 2017 at 10:52 AM, Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Flavio,
>
> this kind of feature is indeed useful and currently not supported by
> Flink. I think, however, that this feature is a bit trickier to implement,
> because Tasks cannot currently initiate checkpoints/savepoints on their
> own. This would entail some changes to the lifecycle of a Task and an extra
> communication step with the JobManager. However, nothing impossible to do.
>
> Please open a JIRA issue with the description of the problem where we can
> continue the discussion.
>
> Cheers,
> Till
>
> On Thu, Oct 26, 2017 at 9:58 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Flavio,
>>
>> Thanks for bringing up this topic.
>> I think running periodic jobs with state that gets restored and persisted
>> in a savepoint is a very valid use case and would fit the "stream is a
>> superset of batch" story quite well.
>> I'm not sure if this behavior is already supported, but I think this would
>> be a desirable feature.
>>
>> I'm looping in Till and Aljoscha, who might have some thoughts on this as
>> well.
>> Depending on the discussion we should open a JIRA for this feature.
>>
>> Cheers, Fabian
>>
>> 2017-10-25 10:31 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>
>>> Hi to all,
>>> in my current use case I'd like to improve one step of our batch
>>> pipeline.
>>> There's one specific job that ingests a tabular dataset (of Rows) and
>>> explodes it into a set of RDF statements (as Tuples). The objects we
>>> output are containers of those Tuples (grouped by a field).
>>> Flink stateful streaming could be a perfect fit here because we
>>> incrementally increase the state of those containers but we don't have to
>>> spend a lot of time performing GET operations against an external
>>> key-value store.
>>> The big problem here is that the sources are finite and the state of the
>>> job gets lost once the job ends, while I was expecting Flink to
>>> snapshot the state of its operators before exiting.
>>>
>>> This idea was inspired by
>>> https://data-artisans.com/blog/queryable-state-use-case-demo#no-external-store,
>>> with the difference that one can resume the state of the stateful
>>> application only when required.
>>> Do you think that it could be possible to support such a use case (that
>>> we can summarize as "periodic batch jobs that pick up where they left
>>> off")?
>>>
>>> Best,
>>> Flavio
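
To make the "periodic batch jobs that pick up where they left off" idea from the original message concrete, here is a minimal sketch in plain Java. It is not Flink code and uses no Flink API: the class `PeriodicStatefulJob`, its methods, and the tab-separated snapshot file are all hypothetical names chosen for illustration. Each run restores the keyed state from the previous run's snapshot, folds a finite input into it, and persists the state before exiting, which is the behaviour the thread asks Flink to provide for finite sources.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration only; none of this is Flink API. */
final class PeriodicStatefulJob {

    /** Restores the keyed state (key -> statements) from the snapshot, if one exists. */
    static Map<String, List<String>> restore(Path snapshot) throws IOException {
        Map<String, List<String>> state = new TreeMap<>();
        if (!Files.exists(snapshot)) {
            return state; // first run: start from empty state
        }
        for (String line : Files.readAllLines(snapshot)) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t", 2); // "key<TAB>statement"
            state.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return state;
    }

    /** Writes the whole state back as one "key<TAB>statement" line per entry. */
    static void persist(Map<String, List<String>> state, Path snapshot) throws IOException {
        List<String> lines = new ArrayList<>();
        state.forEach((key, statements) ->
                statements.forEach(s -> lines.add(key + "\t" + s)));
        Files.write(snapshot, lines);
    }

    /** One periodic run: restore, consume the finite input, persist on exit. */
    static Map<String, List<String>> run(List<Map.Entry<String, String>> rows, Path snapshot)
            throws IOException {
        Map<String, List<String>> state = restore(snapshot);
        for (Map.Entry<String, String> row : rows) {
            // Group statements into their container by key.
            state.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }
        persist(state, snapshot);
        return state;
    }

    public static void main(String[] args) throws IOException {
        Path snapshot = Files.createTempDirectory("job-state").resolve("snapshot.tsv");
        run(List.of(Map.entry("s1", "p1 o1")), snapshot);              // first run
        Map<String, List<String>> state =
                run(List.of(Map.entry("s1", "p2 o2"), Map.entry("s2", "p1 o3")), snapshot);
        System.out.println(state); // the second run sees what the first one stored
    }
}
```

In this sketch the job itself decides when to persist its state. In Flink, as Till notes above, Tasks cannot currently initiate checkpoints/savepoints on their own, which is what makes the feature harder to implement.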