Re: State bootstrapping for Flink SQL / Table API jobs

Shammon FY Sun, 23 Apr 2023 17:37:59 -0700

Hi Илья

I think HybridSource may be a good fit here. Have you tried it before, and
if so, did you run into any problems?
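
To make the idea concrete, here is a minimal plain-Python model of what a
hybrid source does for state bootstrapping. It does not use any Flink API:
the names hybrid_source, deduplicate, history and live are hypothetical, and
the sample records are made up. It only shows that reading the bounded
history first warms the deduplication state before the live records arrive.

```python
from itertools import chain
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, int]  # (key, value); hypothetical record shape


def hybrid_source(bounded: Iterable[Record],
                  unbounded: Iterable[Record]) -> Iterator[Record]:
    """Yield every historical record first, then continue with the live stream."""
    return chain(bounded, unbounded)


def deduplicate(records: Iterable[Record]) -> Iterator[Record]:
    """Emit only the first record seen per key; `seen` plays the role of state."""
    seen = set()
    for key, value in records:
        if key not in seen:
            seen.add(key)
            yield key, value


# Assumed sample data: `history` stands in for S3 / ClickHouse,
# `live` stands in for the streaming topics.
history = [("a", 1), ("b", 2)]
live = [("b", 20), ("c", 3), ("a", 10)]

result = list(deduplicate(hybrid_source(history, live)))
print(result)
```

Because the history is consumed first, keys "a" and "b" are already in the
state when the live records arrive, so only "c" is emitted from the live part.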


Best,
Shammon FY

On Fri, Apr 21, 2023 at 5:59 PM Илья Соин <ilya.soin...@gmail.com> wrote:

> Hi Flink community,
>
> We have a fairly complex SQL job: it unions 5 topics, deduplicates by key,
> and does some daily aggregations. The state TTL is 40 days. We want to be
> able to bootstrap its state from S3 or ClickHouse, and we want a general
> solution that we can reuse for other SQL jobs as well.
>
> So far I haven’t found a working solution to this. I’d like to discuss
> the best approach to take here and possibly contribute it to Flink.
>
> I think a good solution would be to bring HybridSource to Table / SQL API.
>
> Another thought was to take the SQL, replace the unbounded sources with
> bounded ones, and run the job, then take a savepoint at the end and use it
> to bootstrap the streaming job. The problem I see here:
> - we have no control over operator UIDs or the final table plan, so it’s
> possible that the plan of the batch job will differ slightly from that of
> the streaming job.
>
>
> --
> *Sincerely,*
> *Ilya Soin*
>
