Hi, Shengkai.

Thanks for the update, LGTM now.

Best,
Godfrey


On Mon, Jun 6, 2022 at 16:47, Shengkai Fang <fskm...@gmail.com> wrote:
>
> Hi, Godfrey.
>
> Nice to hear the comments from you.
>
> > Could you show the whole architecture of the ecosystem around HiveServers
> > and the SqlGateway, such as the JDBC driver, Beeline, etc.?
> > That would be clearer for users.
>
> Yes. I have updated the FLIP and added the architecture of the Gateway with
> the HiveServer2 endpoint.
>
> > How To Use
> >> Could you give a complete example to describe an end-to-end case?
>
> Yes. I have updated the FLIP. Beeline users can just use the connect
> command to connect to the SQL Gateway with the HiveServer2 endpoint.
> For example, users just enter "!connect
> jdbc:hive2://<host>:<port>/<db>;auth=noSasl
> hiveuser pass" in the terminal to connect to the SQL Gateway.
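
The connection string in the example above can also be assembled programmatically. Below is a minimal Java sketch, assuming only the URL layout quoted in the thread (jdbc:hive2://<host>:<port>/<db>;auth=noSasl). The class name GatewayJdbcUrl, its method names, and the sample host, port, and database are hypothetical. The sketch only builds strings; actually opening a connection would additionally need the Hive JDBC driver on the classpath.

```java
// Hypothetical helper: builds the connection string for a HiveServer2-compatible endpoint.
final class GatewayJdbcUrl {
    private GatewayJdbcUrl() {}

    /** Returns jdbc:hive2://<host>:<port>/<db>;auth=noSasl for the given values. */
    static String build(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "jdbc:hive2://" + host + ":" + port + "/" + database + ";auth=noSasl";
    }

    /** Returns the Beeline command that connects with the given credentials. */
    static String beelineConnect(String url, String user, String password) {
        return "!connect " + url + " " + user + " " + password;
    }

    public static void main(String[] args) {
        // Sample values only (assumptions); replace with the gateway's real address.
        String url = build("localhost", 10000, "default");
        System.out.println(beelineConnect(url, "hiveuser", "pass"));
    }
}
```

The printed line can be pasted into a Beeline session; the same URL string is what a JDBC client would pass to its driver.
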
>
> > Is streaming SQL supported? What's the behavior if I submit a
> > streaming query or change the dialect to 'default'?
>
> Yes. We don't limit the usage here. Users can switch to streaming mode
> or use the default dialect. However, we don't recommend using the Hive
> dialect in streaming mode. As far as I know, it has some problems that
> are not fixed yet, e.g. you may get errors for SQL that works in batch
> mode. I added a section to mention this.
>
> > Considering the different users may have different requirements to
> > connect to different meta stores,
> > they can use the DDL to register the HiveCatalog that satisfies their
> > requirements.
> >> Could you give some examples to explain it more?
>
> Hive supports setting multiple metastore addresses via the config option
> "hive.metastore.uris". Here I just mean that users can switch between
> different metastore instances using the CREATE CATALOG DDL. I updated the
> FLIP to make this clearer.
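
As an illustration of switching metastores through DDL, here is a minimal Java sketch that renders CREATE CATALOG statements for two HiveCatalogs. It assumes Flink's Hive catalog options 'type' = 'hive' and 'hive-conf-dir', and that each configuration directory holds a hive-site.xml pointing at a different metastore instance. The class name HiveCatalogDdl, the catalog names, and the paths are hypothetical, and the statements are only printed, not executed.

```java
// Hypothetical helper that renders Flink CREATE CATALOG statements for HiveCatalogs.
final class HiveCatalogDdl {
    private HiveCatalogDdl() {}

    /** Builds the DDL for one HiveCatalog whose hive-site.xml lives in hiveConfDir. */
    static String createCatalog(String name, String hiveConfDir) {
        if (name.isEmpty() || hiveConfDir.contains("'")) {
            throw new IllegalArgumentException("invalid catalog name or conf dir");
        }
        return "CREATE CATALOG " + name + " WITH ("
                + "'type' = 'hive', "
                + "'hive-conf-dir' = '" + hiveConfDir + "')";
    }

    public static void main(String[] args) {
        // Each directory is assumed to contain a hive-site.xml that points at a
        // different metastore instance; the names and paths are made up.
        System.out.println(createCatalog("hive_prod", "/opt/hive-conf/prod") + ";");
        System.out.println(createCatalog("hive_test", "/opt/hive-conf/test") + ";");
        System.out.println("USE CATALOG hive_test;");
    }
}
```
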
>
> Best,
> Shengkai
>
> On Mon, Jun 6, 2022 at 13:45, godfrey he <godfre...@gmail.com> wrote:
>
> > Hi Shengkai,
> >
> > Thanks for driving this.
> >
> > I have a few comments:
> >
> > Could you show the whole architecture of the ecosystem around HiveServers
> > and the SqlGateway, such as the JDBC driver, Beeline, etc.?
> > That would be clearer for users.
> >
> > > Considering the different users may have different requirements to
> > > connect to different meta stores,
> > > they can use the DDL to register the HiveCatalog that satisfies their
> > > requirements.
> > Could you give some examples to explain it more?
> >
> > > How To Use
> > Could you give a complete example to describe an end-to-end case?
> >
> > Is streaming SQL supported? What's the behavior if I submit a streaming
> > query or change the dialect to 'default'?
> >
> > Best,
> > Godfrey
> >
> > On Wed, Jun 1, 2022 at 21:13, Shengkai Fang <fskm...@gmail.com> wrote:
> > >
> > > Hi, Jingsong.
> > >
> > > Thanks for your feedback.
> > >
> > > > I've read the FLIP and it's not quite clear what the specific
> > > > unsupported items are
> > >
> > > Yes. I have added a section named "Difference with HiveServer2" that
> > > lists the differences between the SQL Gateway with the HiveServer2
> > > endpoint and HiveServer2.
> > >
> > > > Support multiple metastore clients in one gateway?
> > >
> > > Yes. It may cause class conflicts when different versions of the
> > > HiveCatalog are used at the same time. I added a section named "How to
> > > use" to remind users not to use HiveCatalogs of different versions
> > > together.
> > >
> > > >  Hive versions and setup
> > >
> > > Considering that the HiveServer2 endpoint is bound to the HiveCatalog, we
> > > will not introduce a new module for the HiveServer2 endpoint. The current
> > > dependencies in the Hive connector should be enough for the HiveServer2
> > > endpoint, except for hive-service-rpc (it contains the HiveServer2
> > > interface). In this way, the Hive connector jar will contain an endpoint.
> > > I added a section named "Merge HiveServer2 Endpoint into Hive Connector
> > > Module".
> > >
> > > For usage, the user can just add the Hive connector jar to the classpath
> > > and use sql-gateway.sh to start the SQL Gateway with the HiveServer2
> > > endpoint. You can refer to the section "How to use" for more details.
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Wed, Jun 1, 2022 at 15:04, Jingsong Li <jingsongl...@gmail.com> wrote:
> > >
> > > > Hi Shengkai,
> > > >
> > > > Thanks for driving.
> > > >
> > > > I have a few comments:
> > > >
> > > > ## Unsupported features
> > > >
> > > > I've read the FLIP and it's not quite clear what the specific
> > > > unsupported items are.
> > > > - For example, are security-related features unsupported?
> > > > - For example, is there a loss of precision for types?
> > > > - For example, are the FetchResults not the same?
> > > >
> > > > ## Support multiple metastore clients in one gateway?
> > > >
> > > > > During the setup, the HiveServer2 tries to load the config in the
> > > > > hive-site.xml to initialize the Hive metastore client. In Flink, we
> > > > > use the Catalog interface to connect to the Hive Metastore, which is
> > > > > allowed to communicate with different Hive Metastores[1]. Therefore,
> > > > > we allow the user to specify the path of the hive-site.xml as an
> > > > > endpoint parameter, which will be used to create the default
> > > > > HiveCatalog in Flink. Considering that different users may have
> > > > > different requirements to connect to different metastores, they can
> > > > > use the DDL to register the HiveCatalog that satisfies their
> > > > > requirements.
> > > >
> > > > I understand it is difficult. Do you really want to support it?
> > > >
> > > > ## Hive versions and setup
> > > >
> > > > I saw Jark also commented, but the FLIP does not seem to have been
> > > > modified. How should the user set it up: which jar should be added,
> > > > and which Hive metastore versions are supported?
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Tue, May 24, 2022 at 11:57 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > > >
> > > > > Hi, all.
> > > > >
> > > > > Considering that the vote for FLIP-91 has been running for a while, I
> > > > > think we can restart the discussion about FLIP-223.
> > > > >
> > > > > I would be glad to receive your feedback on FLIP-223.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > >
> > > > > On Fri, May 6, 2022 at 19:10, Martijn Visser <mart...@ververica.com> wrote:
> > > > >
> > > > > > Hi Shengkai,
> > > > > >
> > > > > > Thanks for clarifying.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > On Fri, 6 May 2022 at 08:40, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Martijn.
> > > > > > >
> > > > > > > > So this implementation would not rely in any way on Hive, only
> > > > > > > > on Thrift?
> > > > > > >
> > > > > > > Yes. The dependency is light. We can also just copy the interface
> > > > > > > file from the Hive repo and maintain it ourselves.
> > > > > > >
> > > > > > > Best,
> > > > > > > Shengkai
> > > > > > >
> > > > > > > On Wed, May 4, 2022 at 21:44, Martijn Visser <martijnvis...@apache.org> wrote:
> > > > > > >
> > > > > > > > Hi Shengkai,
> > > > > > > >
> > > > > > > > > Actually we will only rely on the API in the Hive, which only
> > > > > > > > > contains the thrift file and the generated code
> > > > > > > >
> > > > > > > > So this implementation would not rely in any way on Hive, only
> > > > > > > > on Thrift?
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Martijn Visser
> > > > > > > > https://twitter.com/MartijnVisser82
> > > > > > > > https://github.com/MartijnVisser
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, 29 Apr 2022 at 05:16, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi, Jark and Martijn
> > > > > > > > >
> > > > > > > > > Thanks for your feedback.
> > > > > > > > >
> > > > > > > > > > Kyuubi provides three ways to configure Hive metastore [1].
> > > > > > > > > > Could we provide similar abilities?
> > > > > > > > >
> > > > > > > > > Yes. I have updated the FLIP about this; it took some time to
> > > > > > > > > figure out how the JDBC driver works. I added a section about
> > > > > > > > > how to use the Hive JDBC driver to configure the session-level
> > > > > > > > > catalog.
> > > > > > > > >
> > > > > > > > > > I think we can improve the "HiveServer2 Compatibility"
> > > > > > > > > > section.
> > > > > > > > >
> > > > > > > > > Yes. I have updated the FLIP and added more details about the
> > > > > > > > > compatibility.
> > > > > > > > >
> > > > > > > > > > Prefer to first complete the discussion and vote on FLIP-91
> > > > > > > > > > then discuss FLIP-223
> > > > > > > > >
> > > > > > > > > Of course. We can wait until the discussion of FLIP-91
> > > > > > > > > finishes.
> > > > > > > > >
> > > > > > > > > > Maintenance concerns about the hive
> > > > > > > > >
> > > > > > > > > Actually we will only rely on the API in the Hive, which only
> > > > > > > > > contains the thrift file and the generated code[1]. I think it
> > > > > > > > > will not prevent us from upgrading the Java version.
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/hive/tree/master/service-rpc
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Shengkai
> > > > > > > > >
> > > > > > > > > On Tue, Apr 26, 2022 at 20:44, Martijn Visser <martijnvis...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I'm not too familiar with Hive and HiveServer2, but I do have a
> > > > > > > > > > couple of questions/concerns:
> > > > > > > > > >
> > > > > > > > > > 1. What is the relationship between this FLIP and FLIP-91? My
> > > > > > > > > > assumption would be that this FLIP (and therefore the HiveServer2
> > > > > > > > > > implementation) would need to be integrated in the REST Gateway,
> > > > > > > > > > is that correct? If so, I would prefer to first complete the
> > > > > > > > > > discussion and vote on FLIP-91, else we'll have two moving FLIPs
> > > > > > > > > > that have a direct relationship with each other.
> > > > > > > > > >
> > > > > > > > > > 2. While I understand that Hive is important (in the Chinese
> > > > > > > > > > ecosystem, not so much in Europe and the US), I still have
> > > > > > > > > > maintenance concerns on this topic. We know that the current Hive
> > > > > > > > > > integration isn't exactly ideal and requires a lot of work to get
> > > > > > > > > > in better shape. At the same time, Hive still doesn't support
> > > > > > > > > > Java 11 while we need (and should, given the premier support has
> > > > > > > > > > ended already) to move away from Java 8.
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > >
> > > > > > > > > > Martijn Visser
> > > > > > > > > > https://twitter.com/MartijnVisser82
> > > > > > > > > > https://github.com/MartijnVisser
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, 25 Apr 2022 at 12:13, Jark Wu <imj...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks, Shengkai, for driving this effort.
> > > > > > > > > > > I think this is an essential addition to Flink Batch.
> > > > > > > > > > >
> > > > > > > > > > > I have some small suggestions:
> > > > > > > > > > > 1) Kyuubi provides three ways to configure Hive metastore [1].
> > > > > > > > > > > Could we provide similar abilities?
> > > > > > > > > > > Especially with the JDBC Connection URL, users can visit
> > > > > > > > > > > different Hive metastore server instances.
> > > > > > > > > > >
> > > > > > > > > > > 2) I think we can improve the "HiveServer2 Compatibility" section.
> > > > > > > > > > > We need to figure out two compatibility matrices. One is SQL
> > > > > > > > > > > Gateway with different versions of Hive metastore,
> > > > > > > > > > > and the other is different versions of Hive client (e.g., Hive
> > > > > > > > > > > JDBC) with SQL Gateway. We need to clarify
> > > > > > > > > > > what metastore and client versions we support and how users
> > > > > > > > > > > configure the versions.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > [1]:
> > > > > > > > > > > https://kyuubi.apache.org/docs/r1.3.1-incubating/deployment/hive_metastore.html#activate-configurations
> > > > > > > > > > >
> > > > > > > > > > > On Sun, 24 Apr 2022 at 15:02, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Jiang.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for your feedback!
> > > > > > > > > > > >
> > > > > > > > > > > > > Integrating the Hive ecosystem should not require changing
> > > > > > > > > > > > > the service interface
> > > > > > > > > > > >
> > > > > > > > > > > > I moved the API change to FLIP-91. But I think it's possible
> > > > > > > > > > > > that we will add more interfaces to integrate new endpoints in
> > > > > > > > > > > > the future, because every endpoint's functionality is
> > > > > > > > > > > > different. For example, the REST endpoint doesn't support
> > > > > > > > > > > > fetching operation-level logs, but the HiveServer2 endpoint
> > > > > > > > > > > > does. In this case, we need to modify the shared GatewayService
> > > > > > > > > > > > to support the functionality exposed by the new endpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > >  How to support different Hive versions?
> > > > > > > > > > > >
> > > > > > > > > > > > Do you mean supporting different HiveServer2 versions?
> > > > > > > > > > > > HiveServer2 uses the version to guarantee compatibility.
> > > > > > > > > > > > During openSession, the client and server determine the
> > > > > > > > > > > > protocol version (minimum(client version, HiveServer2 endpoint
> > > > > > > > > > > > version)). After that, the client and the server use the
> > > > > > > > > > > > determined version to communicate. In the HiveServer2 endpoint,
> > > > > > > > > > > > it determines how the endpoint deserializes the results and the
> > > > > > > > > > > > result schema. I added a section about HiveServer2
> > > > > > > > > > > > compatibility.
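
The negotiation step described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: plain integers stand in for the protocol versions (the real interface models them with the TProtocolVersion enum in TCLIService.thrift), and the class name ProtocolNegotiation and the constant ENDPOINT_VERSION with its value are hypothetical.

```java
// Sketch of openSession version negotiation; versions are plain integers here.
final class ProtocolNegotiation {
    private ProtocolNegotiation() {}

    /** Highest protocol version this hypothetical endpoint implements (assumed value). */
    static final int ENDPOINT_VERSION = 10;

    /** Returns the version both sides can speak: the smaller of the two. */
    static int negotiate(int clientVersion, int endpointVersion) {
        if (clientVersion < 1 || endpointVersion < 1) {
            throw new IllegalArgumentException("protocol versions start at 1");
        }
        return Math.min(clientVersion, endpointVersion);
    }

    public static void main(String[] args) {
        // An older client pulls the session down to its own version.
        System.out.println(negotiate(8, ENDPOINT_VERSION));
        // A newer client is capped at what the endpoint implements.
        System.out.println(negotiate(11, ENDPOINT_VERSION));
    }
}
```

Every later request in the session would then be handled according to the negotiated value, which is why it also decides how results and the result schema are encoded.
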
> > > > > > > > > > > >
> > > > > > > > > > > > > Could you please fully provide its definition including
> > > > > > > > > > > > > input parameters and the corresponding return value schema?
> > > > > > > > > > > >
> > > > > > > > > > > > Because we implement the interface exposed by Hive, I added
> > > > > > > > > > > > the file link to the HiveServer2 interfaces[1], which contains
> > > > > > > > > > > > all input parameters and the results. Considering that the file
> > > > > > > > > > > > doesn't contain the output for the Operations, I added the
> > > > > > > > > > > > output schema for all the supported Operations in the FLIP,
> > > > > > > > > > > > which is not covered in the link. I hope this addresses your
> > > > > > > > > > > > question.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Shengkai
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > > https://github.com/apache/hive/blob/branch-2.3/service-rpc/if/TCLIService.thrift#L1227
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Apr 22, 2022 at 16:43, Nicholas Jiang <nicholasji...@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Shengkai.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for driving the proposal of HiveServer2 Endpoint
> > > > > > > > > > > > > support. For the "GatewayService API Change", I don't think
> > > > > > > > > > > > > that supporting the HiveServer2 endpoint requires changing
> > > > > > > > > > > > > the GatewayService API; in other words, integrating the Hive
> > > > > > > > > > > > > ecosystem should not require changing the service interface.
> > > > > > > > > > > > > If you do decide to change the GatewayService interface, IMO
> > > > > > > > > > > > > the proposal could be discussed in FLIP-91, because the
> > > > > > > > > > > > > public interfaces are defined in FLIP-91.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In addition, how to support different Hive versions and how
> > > > > > > > > > > > > to guarantee compatibility is not mentioned in the design.
> > > > > > > > > > > > > What's the behavior of the compatibility?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Finally, for the public interfaces, could you please fully
> > > > > > > > > > > > > provide its definition including input parameters and the
> > > > > > > > > > > > > corresponding return value schema?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Nicholas Jiang
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 2022/04/21 06:45:13 Shengkai Fang wrote:
> > > > > > > > > > > > > > Hi, Flink developers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I want to start a discussion about FLIP-223: Support
> > > > > > > > > > > > > > HiveServer2 Endpoint[1]. The Endpoint will implement the
> > > > > > > > > > > > > > thrift interface exposed by HiveServer2, so that users' BI,
> > > > > > > > > > > > > > CLI and other tools based on HiveServer2 can be seamlessly
> > > > > > > > > > > > > > migrated to the Flink SQL Gateway. After the FLIP finishes,
> > > > > > > > > > > > > > users can have almost the same experience in the Flink SQL
> > > > > > > > > > > > > > Gateway with the HiveServer2 endpoint as in HiveServer2.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I would be glad to receive your feedback on FLIP-223.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Shengkai
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-223+Support+HiveServer2+Endpoint
