Re: [DISCUSS] FLIP-91: Support SQL Client Gateway

LuNing Wang Fri, 06 May 2022 02:05:19 -0700

Thanks, Shengkai, for driving this, and thanks to everyone for the discussion.



> integrate the Gateway into the Flink code base

After talking with Shengkai offline and reading the Flink Forward Asia talk
`Powering HTAP at ByteDance with Apache Flink`, I think it is better to
integrate the Gateway code into the Flink codebase.


In the future, we could add a feature that merges the SQL Gateway into the
JobManager, so that clients call a JobManager API directly to submit Flink SQL
jobs. This would further improve the performance of Flink OLAP. In the long
run, Flink should be a unified engine for batch, streaming, and OLAP.
Presto/Trino clients submit jobs directly to the master node; if Flink did the
same, we could reduce O&M in Flink session mode. In application mode it is
probably not possible to merge the SQL Gateway into the JobManager, but Flink
OLAP almost always uses session mode.

> Gateway to support multiple Flink versions


If we merge the SQL Gateway into the JobManager, the SQL Gateway itself
can support only one Flink version. We could introduce a network gateway
that redirects requests to the Gateway or JobManager of the matching
version. That network gateway could be provided by another project, such as
Apache Kyuubi or Zeppelin.
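
To make the routing idea concrete, here is a minimal Python sketch of such a
network gateway. It is only an illustration under stated assumptions: the
class name `VersionRouter`, the host names, and the port are hypothetical and
are not part of FLIP-91, Kyuubi, or Zeppelin.

```python
from typing import Dict


class VersionRouter:
    """Routes a request to the Gateway that serves a given Flink version."""

    def __init__(self) -> None:
        # Flink version (e.g. "1.15") -> base URL of the Gateway for that version
        self._backends: Dict[str, str] = {}

    def register(self, flink_version: str, base_url: str) -> None:
        self._backends[flink_version] = base_url.rstrip("/")

    def resolve(self, flink_version: str, path: str) -> str:
        """Return the URL to which a request for `path` should be forwarded."""
        if flink_version not in self._backends:
            raise LookupError(f"no Gateway registered for Flink {flink_version}")
        return f"{self._backends[flink_version]}/{path.lstrip('/')}"


# Hypothetical deployment: one Gateway per Flink version behind the router.
router = VersionRouter()
router.register("1.15", "http://gateway-1-15.example:8083")
router.register("1.16", "http://gateway-1-16.example:8083/")
print(router.resolve("1.16", "/v1/sessions"))
```

Each per-version Gateway stays bound to a single Flink version; only the thin
routing layer knows about all of them.
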

> I don't think that the Gateway is a 'core' function of Flink which should
> be included with Flink.

In production environments, Flink SQL is always used through a Gateway. This
can be seen on the user mailing lists and in several Flink Forward talks. A
SQL Gateway is important infrastructure for a big data compute engine.
Because Flink does not have one, many Flink users get a SQL Gateway through
the Apache Kyuubi project, but providing it should be the job of Flink itself.

> I think it's fine to move this functionality to the client rather than
> gateway. WDYT?

I agree with the `init-file` option in the client. I think `init-file`
functionality in the Gateway is NOT important for the first version of the
Gateway. The Hive JDBC option `initFile` already provides this functionality.
After the SQL Gateway is released and we have feedback from the community, we
may discuss this again.

Best,

LuNing Wang.


Shengkai Fang <fskm...@gmail.com> wrote on Fri, May 6, 2022 at 14:34:

> Thanks Martijn, Nicholas, Godfrey, Jark and Jingsong for the feedback.
>
> > I would like to understand why it's complicated to make the upgrades
> > problematic
>
> I agree with Jark's point. The API in Flink is actually not very stable.
> For example, the Gateway relies on the planner, but in release-1.14 Flink
> renamed the blink planner package, and in release-1.15 Flink made the
> planner Scala-free, which means other projects should not rely on the
> planner directly.
>
> >  Does the Flink SQL gateway support submitting a batch job?
>
> Of course. In the SQL Gateway, you can use the SQL statement SET
> 'execution.runtime-mode' = 'batch' to switch to the batch environment. Jobs
> you submit afterwards are executed in batch mode.
>
> > The architecture of the Gateway is in the following graph.
> Is the TableEnvironment shared for all sessions ?
>
> No. Every session has its own TableEnvironment. I have modified the
> graph to make this clearer.
>
> > /v1/sessions
> >> Are both local file and remote file supported for `libs` and `jars`?
>
> We don't limit the usage here. But I think we will only support the local
> file in the next version.
>
> >> Does sql gateway support upload files?
>
> No. We need a new API to do this. We can collect more user feedback to
> determine whether we need to implement this feature.
>
> >/v1/sessions/:session_handle/configure_session
> >> Can this api be replaced with `/v1/sessions/:session_handle/statements`
> ?
>
> Actually, the two APIs are different. The
> `/v1/sessions/:session_handle/configure_session` API uses SQL to configure
> the environment and allows only a limited set of statement types, while
> `/v1/sessions/:session_handle/statements` has no such limitation. I think
> we'd better keep separate APIs to distinguish the two.
>
> >/v1/sessions/:session_id/operations/:operation_handle/status
> >>`:session_id` is a typo, it should be `:session_handle`
>
> Yes. I have fixed the mistake.
>
> >/v1/sessions/:session_handle/statements
> >The statement must be a single command
>
> >> Does this api support `begin statement set ... end` or `statement set
> >> begin ... end`?
>
> For BEGIN STATEMENT SET, the Gateway opens a buffer in the Session, and
> users can then submit insert statements into that Session. When the
> Session receives the END statement, the Gateway submits the buffered
> statements.
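
The buffering described here can be illustrated with a minimal Python sketch.
It is not the Gateway implementation: the `Session` class and the `submit`
callback are hypothetical, and rejecting non-INSERT statements inside an open
statement set is an assumption made for the example.

```python
class Session:
    """Buffers statements between BEGIN STATEMENT SET and END."""

    def __init__(self, submit):
        self._submit = submit  # hypothetical callback taking a list of statements
        self._buffer = None    # None means no statement set is open

    def execute(self, statement):
        # Normalize case, whitespace, and a trailing semicolon for matching only.
        normalized = " ".join(statement.strip().rstrip(";").upper().split())
        if normalized == "BEGIN STATEMENT SET":
            if self._buffer is not None:
                raise ValueError("a statement set is already open")
            self._buffer = []
            return None
        if normalized == "END":
            if self._buffer is None:
                raise ValueError("END without BEGIN STATEMENT SET")
            statements, self._buffer = self._buffer, None
            return self._submit(statements)  # submit everything as one unit
        if self._buffer is not None:
            # Assumption: only INSERT statements may be buffered.
            if not normalized.startswith("INSERT"):
                raise ValueError("only INSERT is allowed inside a statement set")
            self._buffer.append(statement)
            return None
        return self._submit([statement])  # no open set: submit immediately


submitted = []
session = Session(submitted.append)
session.execute("BEGIN STATEMENT SET;")
session.execute("INSERT INTO a SELECT * FROM t")
session.execute("INSERT INTO b SELECT * FROM t")
session.execute("END;")
```

Nothing reaches `submit` until END arrives, so both INSERT statements are
handed over together.
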
>
> For STATEMENT SET BEGIN ... END, the parser is able to parse the statement.
> We can treat it like any other SQL statement.
>
> >> Are `ADD JAR` and `REMOVE JAR` supported? If yes, how to manage the jars?
>
> For ADD JAR/REMOVE JAR, if the jar is in the local environment, we will
> just add it to the class path or remove it from the class path. If the
> jar is a remote jar, we will create a session-level directory and
> download the jar into that directory. When the session closes, the Gateway
> should also clean up all the resources in the session-level directory.
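
A minimal Python sketch of this session-level jar handling follows. It is an
illustration only: `SessionResources` and the `fetch` callback are
hypothetical names, and treating any non-`file://` URI scheme as remote is an
assumption of the example.

```python
import shutil
import tempfile
from pathlib import Path


class SessionResources:
    """Tracks the jars of one session and owns its download directory."""

    def __init__(self, session_handle, fetch):
        # Session-level directory that holds downloaded remote jars.
        self.directory = Path(tempfile.mkdtemp(prefix=f"session-{session_handle}-"))
        self._fetch = fetch  # hypothetical: fetch(uri, target_path) downloads a jar
        self.classpath = []

    def add_jar(self, uri):
        if "://" in uri and not uri.startswith("file://"):
            # Remote jar: download it into the session-level directory.
            path = self.directory / uri.rsplit("/", 1)[-1]
            self._fetch(uri, path)
        else:
            # Local jar: use it in place.
            path = Path(uri[len("file://"):] if uri.startswith("file://") else uri)
        if path not in self.classpath:
            self.classpath.append(path)
        return path

    def remove_jar(self, path):
        self.classpath.remove(Path(path))  # raises ValueError if not present

    def close(self):
        # Closing the session drops the class path and deletes downloaded jars.
        self.classpath.clear()
        shutil.rmtree(self.directory, ignore_errors=True)
```

Local jars are never copied or deleted; only what the session downloaded is
removed on close.
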
>
> I have updated the FLIP to add more info about these.
>
> >/v1/sessions/:session_handle/operations/:operation_handle/result/:token
> >"type": # string value of LogicalType
> >> Some LogicalTypes can not be serialized, such as: CharType(0)
>
> I think it's about the serialization of the LogicalType. We can follow the
> behaviour in the LogicalTypeJsonSerializer.
>
> > endpoint.protocol
> >>I think REST is not a kind of protocol[1], but is an architectural style.
> >> The value should be `HTTP`.
>
> I still prefer REST as the value because REST also allows HTTPS as the
> protocol. After an offline discussion with Godfrey, we think it's better
> to name the option 'endpoint.type' instead.
>
> >  Catalog API
> > ...
> >> I think we should avoid providing such api, because once catalog api
> >> is changed or added, this class should also be changed. SQL statement
> >> is a more general interface.
>
> The exposed API is used by the endpoint to organize its required output.
> The main problem with your plan is that it requires us to parse the data
> from RowData, which only contains basic types. I think that is much harder
> to maintain than the current plan, which returns structured objects. The
> GatewayService relies on the Catalog, but that doesn't mean the
> GatewayService should expose every API exposed by the Catalog.
>
> > Options
> >> sql-gateway.session.idle.timeout
> >> sql-gateway.session.check.interval
> >> sql-gateway.worker.keepalive.time
>
> Okay. I have updated the option names in the FLIP.
>
> >why do we need both the rest api and the SQLGatewayService
> >API, maybe I'm missing something, what's the difference between them?
>
> The REST API is the user interface. It translates each request into an
> invocation of the SQLGatewayService, which does the actual work. We split
> the Gateway into the SQLGatewayService and the Endpoint (REST API) so that
> all Endpoints share the same SQLGatewayService.
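
The split between Endpoint and service can be sketched in Python as below.
This is a simplified illustration, not the interface proposed in the FLIP: the
method names, the in-memory service, and the response field names are
assumptions; only the two URL paths are taken from this thread.

```python
import uuid
from abc import ABC, abstractmethod


class GatewayService(ABC):
    """The part that does the work; every endpoint delegates to it."""

    @abstractmethod
    def open_session(self): ...

    @abstractmethod
    def execute_statement(self, session_handle, statement): ...


class InMemoryGatewayService(GatewayService):
    """Toy implementation that only records the statements per session."""

    def __init__(self):
        self.sessions = {}

    def open_session(self):
        handle = str(uuid.uuid4())
        self.sessions[handle] = []
        return handle

    def execute_statement(self, session_handle, statement):
        self.sessions[session_handle].append(statement)
        return f"op-{len(self.sessions[session_handle])}"


class RestEndpoint:
    """Translates REST-style requests into GatewayService calls."""

    def __init__(self, service):
        self.service = service

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if method == "POST" and parts == ["v1", "sessions"]:
            return {"session_handle": self.service.open_session()}
        if (method == "POST" and len(parts) == 4
                and parts[:2] == ["v1", "sessions"] and parts[3] == "statements"):
            op = self.service.execute_statement(parts[2], body["statement"])
            return {"operation_handle": op}
        raise ValueError(f"unsupported request: {method} {path}")
```

A HiveServer2-style endpoint would be a second adapter constructed with the
same `service` object, which is the sharing described above.
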
>
> > Is it possible to use one set of rest api to solve all the problems?
>
> I don't think so, although I am not sure what "all the problems" refers
> to. We can use the REST API to expose all the functionality on the Gateway
> side, but many users have their own tools for communicating with the
> Gateway, which may be based on the HiveServer2 API (Thrift API).
>
> Best,
> Shengkai
>
>
>
>
>
>
>
> Jingsong Li <jingsongl...@gmail.com> wrote on Fri, May 6, 2022 at 09:16:
>
> > Thanks, Shengkai, for driving this, and thanks to everyone for the discussion.
> >
> > > The reason why we introduce the gateway with pluggable endpoints is that
> > > many users have their preferences. For example, the HiveServer2 users
> > > prefer to use the gateway with HiveServer2-style API, which has numerous
> > > tools. However, some Flink-native users may prefer to use the REST API.
> > > Therefore, we hope to learn from Kyuubi's design, which exposes multiple
> > > endpoints with different APIs for users to choose from.
> >
> > My understanding is that we need multiple endpoints, but I don't quite
> > understand why we need both the rest api and the SQLGatewayService
> > API, maybe I'm missing something, what's the difference between them?
> > Is it possible to use one set of rest api to solve all the problems?
> >
> > > Gateway to support multiple Flink versions
> >
> > I think this is a good question to consider.
> > - First of all, I think it is absolutely impossible for gateway to
> > support multiple versions of Flink under the current architecture,
> > because gateway relies on Flink SQL and a lot of SQL compiled and
> > optimized code is bound to the Flink version.
> > - The other way is that gateway does not rely on Flink SQL, and each
> > time a different version of Flink Jar is loaded to compile the job at
> > once, and frankly speaking, stream jobs actually prefer this model.
> >
> > The benefit of gateway support for multiple versions is that it's
> > really more user-friendly. I've seen cases where users must have
> > multiple versions existing in a cluster, and if each version needs to
> > run a gateway, the O&M burden will be heavy.
> >
> > > I don't think that the Gateway is a 'core' function of Flink which
> > > should be included with Flink.
> >
> > First, I think the Gateway is a 'core' function in Flink.
> > Why?
> > I think our architecture should be consistent, which means that Flink
> > sql-client should use the implementation of gateway, which means that
> > sql-client depends on gateway.
> > And sql-client is the basic tool of Flink SQL; it must exist in the Flink
> > repository, otherwise Flink SQL loses its most important entry point.
> > So, the gateway itself should be our core functionality as well.
> >
> > Best,
> > Jingsong
> >
> > On Thu, May 5, 2022 at 10:06 PM Jark Wu <imj...@gmail.com> wrote:
> > >
> > > Hi Martijn,
> > >
> > > Regarding maintaining Gateway inside or outside Flink code base,
> > > I would like to share my thoughts:
> > >
> > > > > I would like to understand why it's complicated to make the upgrades
> > > > > problematic. Is it because of relying on internal interfaces? If so,
> > > > > should we not consider making them public?
> > > >
> > > > It's not about internal interfaces. Flink itself doesn't provide
> > > > backward compatibility for public APIs.
> > >
> > >
> > > > > a) it will not be possible to have separate releases of the Gateway,
> > > > > they will be tied to individual Flink releases
> > > > I don't think it's a problem. On the contrary, maintaining a separate
> > > > repo for Gateway will take a lot of extra community efforts, e.g.,
> > > > individual CICD, docs, releases.
> > >
> > >
> > > > > b) if you want the Gateway to support multiple Flink versions
> > > > Sorry, I don't see any users requesting this feature for such a long
> > > > time for SQL Gateway.
> > > > Users can build services on Gateway to easily support multi Flink
> > > > versions (a Gateway for a Flink version).
> > > > It's difficult for Gateway to support multi-version because Flink
> > > > doesn't provide an API that supports backward and forward compatibility.
> > > > If Gateway wants to support multi-version, it has to invent an
> > > > inner-gateway for each version, and Gateway acts as a proxy to
> > > > communicate with the inner-gateway.
> > > > So you have to have a gateway to couple with the Flink version.
> > >
> > > > In fact, Gateway is the layer to support multi Flink versions for
> > > > higher-level applications because its API (REST, gRPC) provides
> > > > backward and forward compatibility.
> > > > The gateway itself doesn't need to support multi Flink versions.
> > > > Besides, Trino/Presto also provides servers[1] for each version.
> > >
> > >
> > > > > I don't think that the Gateway is a 'core' function of Flink which
> > > > > should be included with Flink.
> > > > Sorry, I can't agree with this. If I remember correctly, Flink SQL has
> > > > been promoted to first-class citizen for a long time.
> > > > The community also aims to make Flink a truly batch-stream unified
> > > > computing platform, and Gateway would be the entry and center of the
> > > > platform.
> > > > From my point of view, Gateway is a very "core" function and must be
> > > > included in Flink to have better cooperation with SQL and provide an
> > > > out-of-box experience.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]: https://trino.io/download.html
> > >
> > > On Thu, 5 May 2022 at 19:57, godfrey he <godfre...@gmail.com> wrote:
> > >
> > > > Hi Shengkai.
> > > >
> > > > Thanks for driving the proposal, it's been silent too long.
> > > >
> > > > I have a few questions:
> > > > about the Architecture
> > > > > The architecture of the Gateway is in the following graph.
> > > > Is the TableEnvironment shared for all sessions ?
> > > >
> > > > about the REST Endpoint
> > > > > /v1/sessions
> > > > Are both local file and remote file supported for `libs` and `jars`?
> > > > Does sql gateway support upload file?
> > > >
> > > > >/v1/sessions/:session_handle/configure_session
> > > > > Can this api be replaced with
> > > > > `/v1/sessions/:session_handle/statements` ?
> > > >
> > > > >/v1/sessions/:session_id/operations/:operation_handle/status
> > > > > `:session_id` is a typo, it should be `:session_handle`
> > > >
> > > > >/v1/sessions/:session_handle/statements
> > > > >The statement must be a single command
> > > > Does this api support `begin statement set ... end` or `statement set
> > > > begin ... end`
> > > > > Are `ADD JAR` and `REMOVE JAR` supported? If yes, how to manage the jars?
> > > >
> > > >
> > >/v1/sessions/:session_handle/operations/:operation_handle/result/:token
> > > > >"type": # string value of LogicalType
> > > >  Some LogicalTypes can not be serialized, such as: CharType(0)
> > > >
> > > > about Options
> > > > > endpoint.protocol
> > > > > I think REST is not a kind of protocol[1], but is an architectural
> > > > > style. The value should be `HTTP`.
> > > >
> > > > about SQLGatewayService API
> > > > >  Catalog API
> > > > > ...
> > > > I think we should avoid providing such api, because once catalog api
> > > > is changed or added,
> > > > This class should also be changed. SQL statement is a more general
> > > > interface.
> > > >
> > > > > Options
> > > > > sql-gateway.session.idle.timeout
> > > > >sql-gateway.session.check.interval
> > > > >sql-gateway.worker.keepalive.time
> > > > > It's better to keep the option style consistent with Flink; the level
> > > > > should not be too deep.
> > > > > sql-gateway.session.idle.timeout -> sql-gateway.session.idle-timeout
> > > > > sql-gateway.session.check.interval -> sql-gateway.session.check-interval
> > > > > sql-gateway.worker.keepalive.time -> sql-gateway.worker.keepalive-time
> > > >
> > > > [1] https://restfulapi.net/
> > > >
> > > > Best,
> > > > Godfrey
> > > >
> > > > > Nicholas Jiang <nicholasji...@apache.org> wrote on Thu, May 5, 2022 at 14:58:
> > > > >
> > > > > Hi Shengkai,
> > > > >
> > > > > > I have another concern about the submission of batch jobs. Does the
> > > > > > Flink SQL gateway support submitting batch jobs? In Kyuubi,
> > > > > > BatchProcessBuilder is used to submit batch jobs. What about the Flink
> > > > > > SQL gateway?
> > > > >
> > > > > Best regards,
> > > > > Nicholas Jiang
> > > > >
> > > > > On 2022/04/24 03:28:36 Shengkai Fang wrote:
> > > > > > Hi. Jiang.
> > > > > >
> > > > > > > Thanks for your feedback!
> > > > > >
> > > > > > > > Do the public interfaces of GatewayService refer to any service?
> > > > > > >
> > > > > > > We will only expose one GatewayService implementation. We will put the
> > > > > > > interface into the common package, and the developer who wants to
> > > > > > > implement a new endpoint can just rely on the interface package rather
> > > > > > > than the implementation.
> > > > > >
> > > > > > > > What's the behavior of SQL Client Gateway working on Yarn or K8S?
> > > > > > > > Does the SQL Client Gateway support application or session mode on
> > > > > > > > Yarn?
> > > > > > >
> > > > > > > I think we can support SQL Client Gateway to submit the jobs in
> > > > > > > application/session mode.
> > > > > >
> > > > > > > > Is there any event trigger in the operation state machine?
> > > > > > >
> > > > > > > Yes. I have already updated the content and added more details about
> > > > > > > the state machine. During the revision, I found that I had mixed up two
> > > > > > > concepts: job submission and job execution. In fact, we only control
> > > > > > > the submission mode at the gateway layer. Therefore, we don't need to
> > > > > > > map the JobStatus here. If the user expects synchronous behavior, that
> > > > > > > is, waiting for the job execution to complete before the next statement
> > > > > > > may be executed, then the Operation lifecycle should also contain the
> > > > > > > job's execution, which means users should set `table.dml-sync`.
> > > > > >
> > > > > > > > What's the return schema for the public interfaces of GatewayService?
> > > > > > > > Like getTable interface, what's the return value schema?
> > > > > > >
> > > > > > > The API of the GatewayService returns Java objects, and the endpoint
> > > > > > > can organize the objects with the expected schema. The return results
> > > > > > > are also listed in the section ComponetAPI#GatewayService#API. The
> > > > > > > return type of GatewayService#getTable is `ContextResolvedTable`.
> > > > > >
> > > > > > > > How does the user get the operation log?
> > > > > > >
> > > > > > > The OperationManager will register the LogAppender before the Operation
> > > > > > > execution. The LogAppender will hijack the logger and also write the
> > > > > > > log related to the Operation to a separate file. When users want to
> > > > > > > fetch the Operation log, the GatewayService will read the content of
> > > > > > > the file and return it.
> > > > > >
> > > > > > Best,
> > > > > > Shengkai
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Nicholas Jiang <nicholasji...@apache.org> wrote on Fri, Apr 22, 2022 at 16:21:
> > > > > >
> > > > > > > Hi Shengkai.
> > > > > > >
> > > > > > > > Thanks for driving the proposal of SQL Client Gateway. I have some
> > > > > > > > knowledge of Kyuubi and have some questions about the design:
> > > > > > > >
> > > > > > > > 1. Do the public interfaces of GatewayService refer to any service? If
> > > > > > > > referring to HiveService, does GatewayService need interfaces like
> > > > > > > > getQueryId etc.
> > > > > > > >
> > > > > > > > 2. What's the behavior of SQL Client Gateway working on Yarn or K8S?
> > > > > > > > Does the SQL Client Gateway support application or session mode on
> > > > > > > > Yarn?
> > > > > > > >
> > > > > > > > 3. Is there any event trigger in the operation state machine?
> > > > > > > >
> > > > > > > > 4. What's the return schema for the public interfaces of
> > > > > > > > GatewayService? Like getTable interface, what's the return value
> > > > > > > > schema?
> > > > > > > >
> > > > > > > > 5. How does the user get the operation log?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Nicholas Jiang
> > > > > > >
> > > > > > > On 2022/04/21 06:42:30 Shengkai Fang wrote:
> > > > > > > > Hi, Flink developers.
> > > > > > > >
> > > > > > > > > I want to start a discussion about the FLIP-91: Support Flink SQL
> > > > > > > > > Gateway[1]. Flink SQL Gateway is a service that allows users to
> > > > > > > > > submit and manage their jobs in the online environment with the
> > > > > > > > > pluggable endpoints. The reason why we introduce the Gateway with
> > > > > > > > > pluggable endpoints is that many users have their preferences. For
> > > > > > > > > example, the HiveServer2 users prefer to use the gateway with
> > > > > > > > > HiveServer2-style API, which has numerous tools. However, some
> > > > > > > > > Flink-native users may prefer to use the REST API. Therefore, we
> > > > > > > > > propose the SQL Gateway with pluggable endpoint.
> > > > > > > > >
> > > > > > > > > In the FLIP, we also propose the REST endpoint, which has APIs
> > > > > > > > > similar to the gateway in ververica/flink-sql-gateway[2]. Finally,
> > > > > > > > > we discuss how to use the SQL Client to submit statements to the
> > > > > > > > > Gateway with the REST API.
> > > > > > > > >
> > > > > > > > > I would be glad to receive feedback about FLIP-91.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Shengkai
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> > > > > > > > [2] https://github.com/ververica/flink-sql-gateway
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
>
