[GitHub] [incubator-kyuubi] iodone commented on issue #32: support for http frontend service

GitBox Wed, 11 Aug 2021 21:12:38 -0700


iodone commented on issue #32:
URL: https://github.com/apache/incubator-kyuubi/issues/32#issuecomment-897334401

@yanghua Thank you very much for your patient explanation; I get your point
now. Let's base the discussion of the implementation on the idea of

> a resource with a "statement" as the first-level resource

Here I only consider the main flow of a user submitting SQL. I can think of
two options.

### Solution 1
As @pan3793 mentioned above:

> We can make session info optional in operations APIs, then create a new
session if session info is absent, and return the session id to the client to
make sure the user can reuse the session in following request.

The underlying implementation still goes through the Thrift CLI interface,
but the HTTP API exposes only the concept of a statement and hides the
session. When a query is submitted, the server first creates the session and
then calls executeStatement on it:

![image](https://user-images.githubusercontent.com/5451385/129137132-edfa4550-e498-4600-9671-c48b8246b2e8.png)
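
To make the flow concrete, here is a minimal sketch of it in Python. It is not
Kyuubi code: `FakeBackend`, `submit_statement`, and the `sessionId` /
`statementId` field names are all hypothetical, and the backend is an
in-memory stand-in for the Thrift-based service. It only shows the optional
session and the returned session id.

```python
import uuid
from typing import Dict, List, Optional


class FakeBackend:
    """Hypothetical in-memory stand-in for the Thrift-based backend service."""

    def __init__(self) -> None:
        # session id -> ids of the statements submitted on that session
        self.sessions: Dict[str, List[str]] = {}

    def open_session(self) -> str:
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = []
        return session_id

    def execute_statement(self, session_id: str, sql: str) -> str:
        if session_id not in self.sessions:
            raise KeyError(f"unknown session: {session_id}")
        statement_id = uuid.uuid4().hex
        self.sessions[session_id].append(statement_id)
        return statement_id


def submit_statement(backend: FakeBackend, sql: str,
                     session_id: Optional[str] = None) -> dict:
    """Handle a hypothetical 'submit statement' HTTP request."""
    # Session info is optional: open a new session when the client sends none.
    if session_id is None:
        session_id = backend.open_session()
    statement_id = backend.execute_statement(session_id, sql)
    # Return the session id so the client can reuse it in later requests.
    return {"sessionId": session_id, "statementId": statement_id}


backend = FakeBackend()
first = submit_statement(backend, "SELECT 1")
second = submit_statement(backend, "SELECT 2", session_id=first["sessionId"])
```

In this sketch the second request reuses the session opened by the first one,
so the backend ends up with a single session holding two statements.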

#### Benefits
1. Simple to implement: we only need to add an HttpFrontendService. The
user's statement-based request is eventually converted into API calls to
BackendThriftService (with some wrapping that is transparent to the user).
2. Reuses most of Kyuubi's existing code and keeps the overall framework
unchanged.

#### Disadvantages
1. Because the session is hidden from the user, it is unclear when the
session's resources are released, which may lead to session leaks. Could a
one-session-per-query mechanism plus a periodic timeout check avoid this?
2. HTTP API requests use short-lived connections, while Kyuubi and the Kyuubi
engine establish a long-lived connection directly, which means session state
has to be kept in memory to maintain the connection with the Kyuubi engine. At
least two HTTP API requests are required to execute a complete SQL query. When
Kyuubi is scaled out to multiple instances, an HTTP API request can be routed
to any instance, so the two requests may land on different Kyuubi instances.
The second request will then not find the correct session, because the session
is stored on the other instance. We might need to introduce a session-sharing
mechanism (backed by a third-party persistent storage component)?
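
The second disadvantage can be illustrated with a toy simulation. Everything
here is hypothetical: `KyuubiInstance` is not a real class, round-robin
routing is just one possible load-balancing policy, and the plain dict passed
as `store` only stands in for an external persistent store.

```python
import itertools


class KyuubiInstance:
    """Hypothetical server instance, for illustration only."""

    def __init__(self, name, store=None):
        self.name = name
        # Without a shared store, sessions live only in this instance's memory.
        self.sessions = {} if store is None else store

    def submit(self, session_id, sql):
        self.sessions[session_id] = sql
        return f"{self.name}: accepted"

    def fetch_result(self, session_id):
        if session_id not in self.sessions:
            return f"{self.name}: session not found"
        return f"{self.name}: result of {self.sessions[session_id]}"


def run_query(instances):
    """Send the two requests of one query through a round-robin balancer."""
    balancer = itertools.cycle(instances)
    first = next(balancer).submit("s1", "SELECT 1")
    second = next(balancer).fetch_result("s1")
    return first, second


# Each instance keeps its own sessions: the second request misses.
isolated = run_query([KyuubiInstance("kyuubi-a"), KyuubiInstance("kyuubi-b")])

# Both instances share one store: the second request finds the session.
shared_store = {}
shared = run_query([
    KyuubiInstance("kyuubi-a", shared_store),
    KyuubiInstance("kyuubi-b", shared_store),
])

print(isolated)  # ('kyuubi-a: accepted', 'kyuubi-b: session not found')
print(shared)    # ('kyuubi-a: accepted', 'kyuubi-b: result of SELECT 1')
```

Sticky routing on the session id would be another way around the problem, but
that is my own aside and not something proposed so far in this thread.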

### Solution 2
Based on @yanghua's point of view:

> we will only need to follow RESTful principles, and our core focus is on
resources.

Another option is a fully RESTful architecture, entirely different from the
Thrift CLI interface:

![image](https://user-images.githubusercontent.com/5451385/129137165-06be595d-68c3-4a7c-b0ff-86c8b1ced163.png)

#### Benefits
1. State is pushed down to the Kyuubi engine, which maintains it. Kyuubi only
forwards user requests and does not need to keep sessions in memory.
2. Easier to extend: on top of the Kyuubi engine HTTP API we can also add
capabilities beyond standard SQL or JDBC, like MLSQL:
https://www.mlsql.tech/
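
A rough sketch of the first benefit, under heavy assumptions: `Engine` and
`Forwarder` are hypothetical names, the engine is an in-process object rather
than a remote HTTP service, statements finish instantly, and how a Kyuubi
instance locates the right engine is left out entirely.

```python
class Engine:
    """Hypothetical engine that owns all statement state."""

    def __init__(self):
        self.statements = {}
        self._next_id = 0

    def create_statement(self, sql):
        self._next_id += 1
        statement_id = f"stmt-{self._next_id}"
        # Placeholder state: a real engine would run the query asynchronously.
        self.statements[statement_id] = {"sql": sql, "state": "FINISHED"}
        return statement_id

    def get_statement(self, statement_id):
        return self.statements[statement_id]


class Forwarder:
    """Hypothetical Kyuubi instance that keeps no session or statement state."""

    def __init__(self, engine):
        self.engine = engine

    def post_statement(self, sql):
        return self.engine.create_statement(sql)

    def get_statement(self, statement_id):
        return self.engine.get_statement(statement_id)


engine = Engine()
kyuubi_a, kyuubi_b = Forwarder(engine), Forwarder(engine)

# The statement is created through one instance and read through another.
statement_id = kyuubi_a.post_statement("SELECT 1")
status = kyuubi_b.get_statement(statement_id)
```

Because the forwarders hold nothing but a reference to the engine, it does not
matter which instance receives the follow-up request.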

#### Disadvantages
1. We need to fully implement a set of HTTP APIs based on RESTful principles,
so the workload is larger than for Solution 1.

Solution 1 has already landed in our production environment and is currently
on trial. But I personally prefer Solution 2.

@yanghua @pan3793 WDYT?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
