Hi Илья,

I think HybridSource may be a good approach. Have you tried it before? If so, did you run into any problems?

Best,
Shammon FY

On Fri, Apr 21, 2023 at 5:59 PM Илья Соин <ilya.soin...@gmail.com> wrote:

> Hi Flink community,
>
> We have a fairly complex SQL job: it unions 5 topics, deduplicates by key,
> and does some daily aggregations. The state TTL is 40 days. We want to be
> able to bootstrap its state from S3 or ClickHouse, and we want a general
> solution that we can use for other SQL jobs as well.
>
> So far I haven't found a working solution. I'd like to discuss the best
> approach to take here and possibly contribute it to Flink.
>
> I think a good solution would be to bring HybridSource to the Table / SQL API.
>
> Another thought was to take the SQL, replace the unbounded sources with
> bounded ones, and run the job, then take a savepoint at the end and use it
> to bootstrap the streaming job. The problems I see here:
> - We have no control over operator UIDs or the final table plan, so the
>   plan of the batch job may differ slightly from that of the streaming
>   job.
>
> --
> *Sincerely,*
> *Ilya Soin*