Hi, Shengkai.

Thanks for the update, LGTM now.

Best,
Godfrey

Shengkai Fang <fskm...@gmail.com> wrote on Mon, Jun 6, 2022 at 16:47:

> Hi, Godfrey.
>
> Nice to hear the comments from you.
>
> > Could you give a whole architecture about the Ecosystem of HiveServers and the SqlGateway, such as JDBC driver, Beeline, etc. Which is more clear for users.
>
> Yes. I have updated the FLIP and added the architecture of the Gateway with the HiveServer2 endpoint.
>
> > > How To Use
> > Could you give a complete example to describe an end-to-end case?
>
> Yes. I have updated the FLIP. Beeline users can just use the connect command to connect to the SQLGateway with the HiveServer2 endpoint. For example, users just input "!connect jdbc:hive2://<host>:<port>/<db>;auth=noSasl hiveuser pass" into the terminal to connect to the SQLGateway.
>
> > Is the streaming SQL supported? What's the behavior if I submit a streaming query or I change the dialect to 'default'?
>
> Yes. We don't limit the usage here. Users can switch to the streaming mode or use the default dialect. But we don't suggest that users use the hive dialect in the streaming mode. As far as I know, it has some problems that are not fixed yet, e.g. you may get errors for SQL that works in the batch mode. I added a section to mention this.
>
> > > Considering the different users may have different requirements to connect to different meta stores, they can use the DDL to register the HiveCatalog that satisfies their requirements.
> > Could you give some examples to explain it more?
>
> Hive supports setting multiple metastore addresses via the config option "hive.metastore.uris". Here I just mean users can switch to connect to different metastore instances using the CREATE CATALOG DDL. I updated the FLIP to make it clearer.
>
> Best,
> Shengkai
>
> godfrey he <godfre...@gmail.com> wrote on Mon, Jun 6, 2022 at 13:45:
>
> > Hi Shengkai,
> >
> > Thanks for driving this.
> >
> > I have a few comments:
> >
> > Could you give a whole architecture about the Ecosystem of HiveServers and the SqlGateway, such as JDBC driver, Beeline, etc. Which is more clear for users.
> >
> > > Considering the different users may have different requirements to connect to different meta stores, they can use the DDL to register the HiveCatalog that satisfies their requirements.
> > Could you give some examples to explain it more?
> >
> > > How To Use
> > Could you give a complete example to describe an end-to-end case?
> >
> > Is the streaming SQL supported? What's the behavior if I submit a streaming query or I change the dialect to 'default'?
> >
> > Best,
> > Godfrey
> >
> > Shengkai Fang <fskm...@gmail.com> wrote on Wed, Jun 1, 2022 at 21:13:
> >
> > > Hi, Jingsong.
> > >
> > > Thanks for your feedback.
> > >
> > > > I've read the FLIP and it's not quite clear what the specific unsupported items are
> > >
> > > Yes. I have added a section named "Difference with HiveServer2" that lists the differences between the SQL Gateway with the HiveServer2 endpoint and HiveServer2.
> > >
> > > > Support multiple metastore clients in one gateway?
> > >
> > > Yes. It may cause class conflicts when using different versions of the Hive Catalog at the same time. I added a section named "How to use" to remind users not to use HiveCatalogs with different versions together.
> > >
> > > > Hive versions and setup
> > >
> > > Considering the HiveServer2 endpoint binds to the HiveCatalog, we will not introduce a new module for the HiveServer2 endpoint. The current dependencies in the hive connector should be enough for the HiveServer2 Endpoint, except for hive-service-RPC (it contains the HiveServer2 interface). In this way, the hive connector jar will contain an endpoint. I added a section named "Merge HiveServer2 Endpoint into Hive Connector Module".
> > >
> > > For usage, the user can just add the hive connector jar to the classpath and use sql-gateway.sh to start the SQL Gateway with the hiveserver2 endpoint. You can refer to the section "How to use" for more details.
> > >
> > > Best,
> > > Shengkai
> > >
> > > Jingsong Li <jingsongl...@gmail.com> wrote on Wed, Jun 1, 2022 at 15:04:
> > >
> > > > Hi Shengkai,
> > > >
> > > > Thanks for driving.
> > > >
> > > > I have a few comments:
> > > >
> > > > ## Unsupported features
> > > >
> > > > I've read the FLIP and it's not quite clear what the specific unsupported items are?
> > > > - For example, security related, is it not supported.
> > > > - For example, is there a loss of precision for types
> > > > - For example, the FetchResults are not the same
> > > >
> > > > ## Support multiple metastore clients in one gateway?
> > > >
> > > > > During the setup, the HiveServer2 tries to load the config in the hive-site.xml to initialize the Hive metastore client. In the Flink, we use the Catalog interface to connect to the Hive Metastore, which is allowed to communicate with different Hive Metastore[1]. Therefore, we allow the user to specify the path of the hive-site.xml as the endpoint parameters, which will be used to create the default HiveCatalog in the Flink. Considering the different users may have different requirements to connect to different meta stores, they can use the DDL to register the HiveCatalog that satisfies their requirements.
> > > >
> > > > I understand it is difficult. Do you really want to support it?
> > > >
> > > > ## Hive versions and setup
> > > >
> > > > I saw jark also commented, but the FLIP does not seem to have been modified. How should the user set it up, which jar should be added, and which hive metastore versions are supported?
> > > >
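The FLIP passage quoted above says users can register one HiveCatalog per metastore through DDL. A minimal sketch of what that could look like, assuming Flink's `CREATE CATALOG ... WITH ('type' = 'hive', 'hive-conf-dir' = ...)` form; the catalog names, the directory paths, and the helper `create_hive_catalog_ddl` are hypothetical and only illustrate the idea:

```python
def create_hive_catalog_ddl(name: str, hive_conf_dir: str) -> str:
    """Render a Flink CREATE CATALOG statement for one Hive metastore.

    hive_conf_dir is the directory holding the hive-site.xml that points
    at the metastore this catalog should talk to.
    """
    # Double any single quote so the path stays a valid SQL string literal.
    escaped = hive_conf_dir.replace("'", "''")
    return (
        f"CREATE CATALOG `{name}` WITH (\n"
        f"  'type' = 'hive',\n"
        f"  'hive-conf-dir' = '{escaped}'\n"
        f")"
    )


# Hypothetical catalogs, each backed by its own hive-site.xml directory.
catalogs = {
    "hive_prod": "/opt/hive-conf/prod",
    "hive_test": "/opt/hive-conf/test",
}

statements = [create_hive_catalog_ddl(n, d) for n, d in catalogs.items()]
for stmt in statements:
    print(stmt + ";")
```

Each rendered statement could then be submitted through a session on the gateway, followed by `USE CATALOG` to switch between metastores. As noted earlier in the thread, mixing HiveCatalogs of different Hive versions in one gateway may cause class conflicts.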
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Tue, May 24, 2022 at 11:57 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > > >
> > > > > Hi, all.
> > > > >
> > > > > Considering that the vote for FLIP-91 has been running for a while, I think we can restart the discussion about FLIP-223.
> > > > >
> > > > > I would be glad to get your feedback on FLIP-223.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > > Martijn Visser <mart...@ververica.com> wrote on Fri, May 6, 2022 at 19:10:
> > > > >
> > > > > > Hi Shengkai,
> > > > > >
> > > > > > Thanks for clarifying.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > On Fri, 6 May 2022 at 08:40, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Martijn.
> > > > > > >
> > > > > > > > So this implementation would not rely in any way on Hive, only on Thrift?
> > > > > > >
> > > > > > > Yes. The dependency is light. We can also just copy the iface file from the Hive repo and maintain it ourselves.
> > > > > > >
> > > > > > > Best,
> > > > > > > Shengkai
> > > > > > >
> > > > > > > Martijn Visser <martijnvis...@apache.org> wrote on Wed, May 4, 2022 at 21:44:
> > > > > > >
> > > > > > > > Hi Shengkai,
> > > > > > > >
> > > > > > > > > Actually we will only rely on the API in the Hive, which only contains the thrift file and the generated code
> > > > > > > >
> > > > > > > > So this implementation would not rely in any way on Hive, only on Thrift?
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Martijn Visser
> > > > > > > > https://twitter.com/MartijnVisser82
> > > > > > > > https://github.com/MartijnVisser
> > > > > > > >
> > > > > > > > On Fri, 29 Apr 2022 at 05:16, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi, Jark and Martijn
> > > > > > > > >
> > > > > > > > > Thanks for your feedback.
> > > > > > > > >
> > > > > > > > > > Kyuubi provides three ways to configure Hive metastore [1]. Could we provide similar abilities?
> > > > > > > > >
> > > > > > > > > Yes. I have updated the FLIP about this; it took some time to figure out how the JDBC driver works. I added a section about how to use the Hive JDBC driver to configure the session-level catalog.
> > > > > > > > >
> > > > > > > > > > I think we can improve the "HiveServer2 Compatibility" section.
> > > > > > > > >
> > > > > > > > > Yes. I have updated the FLIP and added more details about the compatibility.
> > > > > > > > >
> > > > > > > > > > Prefer to first complete the discussion and vote on FLIP-91 then discuss FLIP-223
> > > > > > > > >
> > > > > > > > > Of course. We can wait until the discussion of FLIP-91 finishes.
> > > > > > > > >
> > > > > > > > > > Maintenance concerns about the hive
> > > > > > > > >
> > > > > > > > > Actually we will only rely on the API in the Hive, which only contains the thrift file and the generated code[1]. I think it will not prevent us from upgrading the Java version.
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/hive/tree/master/service-rpc
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Shengkai
> > > > > > > > >
> > > > > > > > > Martijn Visser <martijnvis...@apache.org> wrote on Tue, Apr 26, 2022 at 20:44:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I'm not too familiar with Hive and HiveServer2, but I do have a couple of questions/concerns:
> > > > > > > > > >
> > > > > > > > > > 1. What is the relationship between this FLIP and FLIP-91? My assumption would be that this FLIP (and therefore the HiveServer2) implementation would need to be integrated in the REST Gateway, is that correct? If so, I would prefer to first complete the discussion and vote on FLIP-91, else we'll have two moving FLIPs that have a direct relationship with each other.
> > > > > > > > > >
> > > > > > > > > > 2. While I understand that Hive is important (in the Chinese ecosystem, not so much in Europe and the US), I still have maintenance concerns on this topic. We know that the current Hive integration isn't exactly ideal and requires a lot of work to get in better shape. At the same time, Hive still doesn't support Java 11 while we need (and should, given the premier support has ended already) to move away from Java 8.
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > >
> > > > > > > > > > Martijn Visser
> > > > > > > > > > https://twitter.com/MartijnVisser82
> > > > > > > > > > https://github.com/MartijnVisser
> > > > > > > > > >
> > > > > > > > > > On Mon, 25 Apr 2022 at 12:13, Jark Wu <imj...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks Shengkai for driving this effort. I think this is an essential addition to Flink Batch.
> > > > > > > > > > >
> > > > > > > > > > > I have some small suggestions:
> > > > > > > > > > >
> > > > > > > > > > > 1) Kyuubi provides three ways to configure Hive metastore [1]. Could we provide similar abilities? Especially with the JDBC Connection URL, users can visit different Hive metastore server instances.
> > > > > > > > > > >
> > > > > > > > > > > 2) I think we can improve the "HiveServer2 Compatibility" section. We need to figure out two compatibility matrices. One is SQL Gateway with different versions of Hive metastore, and the other is different versions of Hive client (e.g., Hive JDBC) with SQL Gateway. We need to clarify what metastore and client versions we support and how users configure the versions.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > > > [1]: https://kyuubi.apache.org/docs/r1.3.1-incubating/deployment/hive_metastore.html#activate-configurations
> > > > > > > > > > >
> > > > > > > > > > > On Sun, 24 Apr 2022 at 15:02, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Jiang.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for your feedback!
> > > > > > > > > > > >
> > > > > > > > > > > > > Integrating the Hive ecosystem should not require changing the service interface
> > > > > > > > > > > >
> > > > > > > > > > > > I moved the API change to FLIP-91. But I think it's possible that we add more interfaces to integrate new endpoints in the future, because every endpoint's functionality is different. For example, the REST endpoint doesn't support fetching operation-level logs but the hiveserver2 endpoint does. In this case, we need to modify the shared GatewayService to support the functionality exposed by the new endpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > > How to support different Hive versions?
> > > > > > > > > > > >
> > > > > > > > > > > > Do you mean supporting different HiveServer2 versions? HiveServer2 uses the protocol version to guarantee compatibility. During openSession, the client and server determine the protocol version (minimum(client version, hiveendpoint version)). After that, the client and the server use the determined version to communicate. In the HiveServer2 endpoint, it determines how the endpoint deserializes the results and the result schema. I added a section about HiveServer2 compatibility.
> > > > > > > > > > > >
> > > > > > > > > > > > > Could you please fully provide its definition including input parameters and the corresponding return value schema?
> > > > > > > > > > > >
> > > > > > > > > > > > Because we implement the interface exposed by Hive, I added the file link to the HiveServer2 interfaces[1], which contains all input parameters and the results. Considering the file doesn't contain the output for the Operation, I added the output schema for all the supported Operations in the FLIP, which is not covered in the link. Hope this addresses your question.
> > > > > > > > > > > >
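The negotiation rule described above (both sides settle on the minimum of the client version and the endpoint version during openSession) can be sketched as follows. This is only an illustration under stated assumptions: the `ProtocolVersion` enum is a hypothetical three-member stand-in for the protocol version enum in TCLIService.thrift, and `negotiate` is not a function from Flink or Hive.

```python
from enum import IntEnum


class ProtocolVersion(IntEnum):
    # Hypothetical subset; the real enum lives in TCLIService.thrift
    # and its members and values may differ.
    V8 = 7
    V9 = 8
    V10 = 9


def negotiate(client: ProtocolVersion, endpoint: ProtocolVersion) -> ProtocolVersion:
    """Pick the highest version both sides understand, i.e. the minimum."""
    return min(client, endpoint)


# An older client talking to a newer endpoint falls back to the client's version.
session_version = negotiate(ProtocolVersion.V8, ProtocolVersion.V10)
print(session_version.name)
```

Taking the minimum means neither side is ever asked to speak a version newer than the one it implements, which is why the agreed version can then drive how results and result schemas are serialized for the rest of the session.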
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Shengkai
> > > > > > > > > > > >
> > > > > > > > > > > > [1] https://github.com/apache/hive/blob/branch-2.3/service-rpc/if/TCLIService.thrift#L1227
> > > > > > > > > > > >
> > > > > > > > > > > > Nicholas Jiang <nicholasji...@apache.org> wrote on Fri, Apr 22, 2022 at 16:43:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Shengkai.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for driving the proposal of HiveServer2 Endpoint support. For the "GatewayService API Change", I don't think the motivation for supporting the HiveServer2 endpoint needs to change the GatewayService API; in other words, integrating the Hive ecosystem should not require changing the service interface. If you decide to change the GatewayService interface, IMO, the proposal could be discussed in FLIP-91 because the public interfaces are defined in FLIP-91.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In addition, how to support different Hive versions and how to guarantee compatibility is not mentioned in the design. What's the behavior of the compatibility?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Finally, for the public interfaces, could you please fully provide its definition including input parameters and the corresponding return value schema?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Nicholas Jiang
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 2022/04/21 06:45:13 Shengkai Fang wrote:
> > > > > > > > > > > > > > Hi, Flink developers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I want to start a discussion about FLIP-223: Support HiveServer2 Endpoint[1]. The Endpoint will implement the thrift interface exposed by HiveServer2, so that users' BI, CLI and other tools based on HiveServer2 can be seamlessly migrated to the Flink SQL Gateway. After the FLIP finishes, users can have almost the same experience in the Flink SQL Gateway with the HiveServer2 endpoint as in HiveServer2.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I would be glad to get your feedback on FLIP-223.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Shengkai
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-223+Support+HiveServer2+Endpoint