Hi Timo,

I agree with Shengkai that we should keep Gateway in the Flink main repo.
Here are my thoughts:

1) It is not feasible and maintainable to invoke different versions of
Flink dependencies
in one JVM. I'm afraid it would be a nightmare when the code is messed up
with all the reflections.

2) Actually, it's very mature to support submitting multi-version by
deploying one
gateway process for one version. Many systems are following this
architecture as Shengkai
mentioned above, including the Ververica Platform. And I don't see any
problem with it.

3) As Jingsong said, SQL CLI should build on top of Gateway to share code
and have the ability to
 connect to Gateway. If SQL CLI is still kept in the Flink repo, we have to
put Gateway there as well.

4) Besides, Gateway is indispensable for a SQL engine (think of
Trino/Presto, Spark, Hive).
Otherwise, Flink will always be a processing system. With Gateway inside
the Flink repo,
we can provide an out-of-box experience as a SQL query engine. Users can
try out the gateway
for the latest version when a new version is released.

5) Last, Gateway inside the Flink repo can ensure the highest degree of
version compatibility.

Best,
Jark


On Thu, 12 May 2022 at 19:19, Shengkai Fang <fskm...@gmail.com> wrote:

> Hi Timo.
>
> Thanks for your feedback!
>
> > It would be great if you can copy/paste some of the tricky design
> decisions into the FLIP.
>
> Yes. I have already added a section about the `Rejected Alternatives`. It
> includes the discuss about the
> - TableInfo and FunctionInfo VS CatalogTable and CatalogFunction
> - Support the multi-version Flink in the Gateway VS Support the
> multi-version in the external Service
> - Merge Gateway into the Flink code base VS Support Gateway in the another
> repo
> ...
>
>
> > Separate code base
>
> Let me summarize all the discussion about the separate code base. Please
> correct me if anything is wrong.
>
>
> About support for the Gateway in the Flink code base.
>
>
> 1. It reduces the cost including tests, release and document. We don't need
> to upgrade the Gateway when Flink releases.
> 2. The SQL Client has the ability to submit the SQL to the Gateway. If we
> move the Gateway to the outside repo, the SQL Client doesn't have the
> related dependencies.
> 3. It is benefit for us to reuse code in the same repo.
>
> About supporting the Gateway in another repo.
>
> 1. It is easier to support the multi-version Flink;
>
>
> First of all, I think we should support the multiple Flink version with
> another component that uses the Gateway from the individual Flink releases.
> Zeppelin, Livy, and Kyuubi all use similar architecture.
>
> The current Gateway in the FLIP is bound to the specific Flink version,
> which needs to compile the SQL and submit the job to the cluster with the
> specified Flink version.
> We can introduce a service in the future(Maybe we can name it
> MultipleVersionFlinkService? ). It may tell the client where the Gateway
> with specified version is and the client just communicate with the Gateway
> later. It may also just parse the request,  convert the request to the REST
> API in the FLIP and send to the Gateway with specified version.
>
> Therefore, I think we should merge the SQL Gateway inside the Flink repo.
> The MultipleVersionFlinkService may be inside or outside the repo.
>
> I add more details in the `Rejected Alternatives` in the FLIP.
>
> > Missing REST API spec
>
> Good point! I add a section in the `Appendix` about the serialization of
> the ResultSet. It includes how to serialize the schema, RowData and
> exceptions.
>
> > Consistency of concepts
>
> configure_session VS initialize_session
>
> Okay. I think it's better to use the initialize_session.
>
> TableInfo + UserDefinedFunctionInfo  VS  CatalogTable and CatalogFunction
>
> The CatalogTable and CatalogFunction are much heavier than the TableInfo
> and UserDefinedFunction. The CatalogManager requires reading from the
> Catalog to get the schema. But in the listTables only care about the table
> name, which is much lighter. Therefore, we propose to use the TableInfo
> with required fields.
>
> > Result Retrieval
>
> Yes. Currently the API is only used for the preview. I added a section in
> the `Rejected Alternatives` about this.
>
> Best,
> Shengkai
>

Reply via email to