Hi Shengkai,

sorry for jumping in so late into the discussion. There was a lot going on in the last 3 weeks. This FLIP is very important and it is great that you tackle this topic which is planned since we started with the SQL Client.

The ML discussion is already quite long. It would be great if you can copy/paste some of the tricky design decisions into the FLIP. With pros and cons into a `Rejected Alternatives` section. Also the `Future work` setion does not mention your plans of `merge SQL gateway into JobManager`. So far this FLIP seems incomplete to me and a lot of important information is actually contained in the discussion thread instead of the design document.

For example, Godfrey's comment `Some LogicalTypes can not be serialized, such as: CharType(0)` was my thought as well when reading the FLIP today. Now I see that you actually answered this with `We can follow the behaviour in the LogicalTypeJsonSerializer`. Please update the FLIP for these feedback. If people have questions that the FLIP does not answer, then the FLIP should be updated. It is totally fine to add more sections and structure it further, look at how FLIP-196 evolved because many people had interest in this feature.

There is a lot of feedback from my side on this FLIP, some comments are nit picking, some comments are crucial. I will start with the crucial stuff first, and post my nit comments in a follow up discussion after the FLIP had another iteration.

1) Separate code base

I would also vote for a separate code base. The gateway should be able to submit to different Flink versions. And by looking at the interfaces of `SQLGatewayService` it looks like we are trying to abstract away the TableEnvironment, so an endpoint does not depend on internal Flink API. Thus, I don't think that frequently changing API should be an issue. We just need to design the `SQLGatewayService` per Flink version or with a shim layer. Also because the API is already quite stable with 1.15 and I'm sure we can mark more interfaces as @Public if that would help.

2) Missing REST API spec

The current FLIP skips some spec for REST API. For example, how do we serialize RowData in the REST API? We should have consensus here. The format could be in sync with built-in JSON_STRING()? Maybe it would also make sense to have 2 data formats (JSON + or serialized with our internal serializers?).

3) Consistency of concepts

The concept of an `init statement` should not be exclusive to the CLI client. It should be a concept of the gateway as well. Otherwise this could become quite complex if some statements can be used in `-i` but not in `/configure_session`. Also why not calling this `initialize_session` to be consistent.

TableInfo + UserDefinedFunctionInfo seem redundant to me, they have the same purpose as CatalogTable and CatalogFunction. What is TableKind.ALL?

4) Result Retrieval

Shall we make a difference between result retrieval for interactive sessions (just previews) vs. result retrieval for fault tolerant exactly-once results? The current design seems to be made for the first use case. A JDBC abstraction will not be able to provide consistent results in case of a failure. Maybe we should mention this in the FLIP?

Thanks,
Timo

Am 10.05.22 um 09:20 schrieb Jark Wu:
Thank Shengkai for driving this work!
+1 to start a VOTE.

Best,
Jark

On Tue, 10 May 2022 at 12:13, Shengkai Fang <fskm...@gmail.com> wrote:

Hi, everyone.

Thanks for all the inputs. Hope my feedback can address most of questions.

After the long discussion, I think we have reached the consensus about the
design. If the discussion doesn't get more response, I think we can start
the vote tomorrow.

Best,
Shengkai

Shengkai Fang <fskm...@gmail.com> 于2022年5月9日周一 14:18写道:

Hi JingSong.

Thanks for your feedback.

reorganize the FLIP, what Pluggable Endpoint Discovery is, and how
users
to add new Endpoint, before introducing SQLGatewayService.

I update the FLIP:  reorganize the order and add more details.

Then I have some doubts about the name SQLGatewayService, I think
Service is seriously misleading, it is just an implementation class
and not a Service.
After discuss with Jingsong offline, we agree to still use the
SQLGatewayService. According to the defination of the wiki[1], service
refers to a software functionality or a set of software functionalities
with a purpose that different clients can reuse for different purposes.
Here the GatewayService plays the same role.

Best,
Shengkai

[1] https://en.wikipedia.org/wiki/Service_(systems_architecture)

Jingsong Li <jingsongl...@gmail.com> 于2022年5月7日周六 16:55写道:

Hi Shengkai, thanks for your reply.

REST API is the user interface. The REST API transforms the request to
the
invocation of the SQLGatewayService that is the one doing the work. We
split the Gateway into the SQLGatewayService and Endpoint(REST API) and
its
benefit is that all the Endpoints share the same SQLGatewayService.

Finally, I got the point of SQLGatewayService, It is just for
`Pluggable Endpoint Discovery`.
I suggest you reorganize the FLIP, what Pluggable Endpoint Discovery
is, and how users to add new Endpoint, before introducing
SQLGatewayService.

Then I have some doubts about the name SQLGatewayService, I think
Service is seriously misleading, it is just an implementation class
and not a Service.

What about just `SQLGateway`?

Best,
Jingsong

On Sat, May 7, 2022 at 4:03 PM Shengkai Fang <fskm...@gmail.com> wrote:
Hi Martijn.

It seems we have reached consensus to support the Gateway inside the
Flink
code base.

Hi, Echo.

Thanks for your interest.

whether flink-sql-gateway should be retained in the Flink project.
I think the discussion above is clear. It is the essential tool to
provide
out-of-box experience for users.

For stream processing, what's the point of getting the result? Is it
just
for debugging and how to unify with batch processing

At the client side, when the OperationStaus is FINISHED, the client is
able
to fetch the results from the Gateway. It is unified with the batch
processing now.

For batch processing, does the gateway need to cache all fetch
results?
No. In the Gateway, we will only buffer partial data and wait for the
client to consume. If the client takes away the buffered data, the
Gateway
will clear the buffer and notify the fetcher thread starts to work
until
the buffer is full again. The mechanism behind is much like the
producer-consumer model[1].

[1] https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem

Whether executing query and fetch results should be synchronous or
asynchronous?

Do you mean the executeStatement response should also contain the
result? I
don't think using the asynchronous API will cause performance
regression.
In most cases, the network latency is about 100ms or lower. You can
ping
www.baidu.com or www.google.com to test the latency.

When executing a query in flink-sql-client, I often find error logs
of
FlinkJobNotFoundException. Should this be optimized?

It's related to the current client implementation. I think you can
open
a
jira ticket, add more details and we can discuss the problem in the
ticket.
In the FLIP-91, the Gateway can store the log per Operation. It may
solve
your problems.


Best,
Shengkai


Reply via email to