Hi Lalala,

Even in session mode, the JobGraph is created before the job is executed, so all of the above still holds. Although I am not super familiar with the catalogs, what you want is for two or more jobs to share the same readers of a source. This is not done automatically in DataStream or DataSet, and I am pretty sure that Table and SQL do not perform any cross-query optimization either (a rough sketch of what I mean follows below).
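To make this concrete, here is a rough sketch (the table, the connector options and the sinks are made up, and exact API details may differ between Flink versions). Each executeSql(INSERT ...) call is planned into its own JobGraph and submitted as a separate job, so each job gets its own Kafka source readers, even though both read the same table and run on the same (session) cluster:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IndependentQueriesSketch {

    public static void main(String[] args) {
        // One TableEnvironment, e.g. pointing at a session cluster.
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance().inStreamingMode().build();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // Hypothetical Kafka-backed table in the default GenericInMemoryCatalog.
        tEnv.executeSql(
                "CREATE TABLE events (id STRING, ts TIMESTAMP(3)) WITH ("
                        + " 'connector' = 'kafka',"
                        + " 'topic' = 'events',"
                        + " 'properties.bootstrap.servers' = 'localhost:9092',"
                        + " 'scan.startup.mode' = 'latest-offset',"
                        + " 'format' = 'json')");

        // Two hypothetical sinks so the INSERT statements are self-contained.
        tEnv.executeSql(
                "CREATE TABLE sink_a (id STRING, ts TIMESTAMP(3)) WITH ('connector' = 'blackhole')");
        tEnv.executeSql(
                "CREATE TABLE sink_b (id STRING) WITH ('connector' = 'blackhole')");

        // Each INSERT below is compiled into its own JobGraph and submitted as
        // a separate (asynchronous) job: two jobs, two independent sets of
        // Kafka source readers, nothing shared between them.
        tEnv.executeSql("INSERT INTO sink_a SELECT id, ts FROM events");
        tEnv.executeSql("INSERT INTO sink_b SELECT id FROM events WHERE id <> ''");
    }
}

So running on a shared session cluster does not, by itself, make the jobs share their readers.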
In addition, even if they did, are you sure that this would be enough for your queries? The users will submit their queries at any point in time, which would mean that each query starts processing from wherever the reader happens to be at that moment, which is arbitrary. Is this something that satisfies your requirements?

I will also include Dawid in the discussion to see if he has anything to add about the Table API and SQL.

Cheers,
Kostas

On Fri, Nov 20, 2020 at 7:47 PM lalala <lal...@activist.com> wrote:
>
> Hi Kostas,
>
> Thank you for your response.
>
> Is what you are saying valid for session mode? I can submit my jobs to the
> existing Flink session; will they be able to share the sources?
>
> We do register our Kafka tables to `GenericInMemoryCatalog`, and the
> documentation says `The GenericInMemoryCatalog is an in-memory
> implementation of a catalog. All objects will be available only for the
> lifetime of the session.`. I presume, in session mode, we can share a Kafka
> source for multiple SQL jobs?
>
> That is not what we wanted for the best isolation, but if it is not possible
> with Flink, we are also good with session mode.
>
> Best regards,
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/