Thanks Rui for the proposal. I think this FLIP is needed by many users, and it will be very helpful for traditional Hive users. I have a few questions:

# Version

Which Hive version do you want to choose? Hive 3.X and Hive 2.X may have some differences.

# Hive Codes

Can you evaluate how much code we need to copy into our flink-hive-connector? Do we need to change it? We will need to maintain it either way.

# Functions

About Hive functions, I don't think this is a limitation. We are using HiveModule to be compatible with Hive, right? So it is a solution rather than a limitation.

# Keywords

Do you think there will be a keyword problem? Or can we be 100% compatible with Hive?

On the whole, the FLIP looks very good and I'm looking forward to it.

Best,
Jingsong

On Fri, Dec 11, 2020 at 11:35 AM Zhijiang <[email protected]> wrote:

> Thanks for the further info and explanations! I have no other concerns.
>
> Best,
> Zhijiang
>
>
> ------------------------------------------------------------------
> From: Rui Li <[email protected]>
> Send Time: Thursday, December 10, 2020, 20:35
> To: dev <[email protected]>; Zhijiang <[email protected]>
> Subject: Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility
>
> Hi Zhijiang,
>
> Glad to know you're interested in this FLIP. I wouldn't claim 100% compatibility with this FLIP. That's because Flink lacks the functionality to support all of Hive's features. To list a few examples:
>
>    1. Hive allows users to process data with shell scripts -- very similar to UDFs [1]
>    2. Users can compile inline Groovy UDFs and use them in queries [2]
>    3. Users can dynamically add/delete jars, or even execute arbitrary shell commands [3]
>
> These features cannot be supported merely by a parser/planner, and it's open to discussion whether Flink should support them at all.
>
> So the ultimate goal of this FLIP is to provide Hive syntax compatibility for features that are already available in Flink, which I believe will cover most common use cases.
>
> [1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform#LanguageManualTransform-TRANSFORMExamples
> [2] https://community.cloudera.com/t5/Community-Articles/Apache-Hive-Groovy-UDF-examples/ta-p/245060
> [3] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-HiveInteractiveShellCommands
>
> On Thu, Dec 10, 2020 at 6:11 PM Zhijiang <[email protected]> wrote:
>
> > Thanks for launching the discussion and the FLIP, Rui!
> >
> > It is really nice to see our continuous efforts toward Hive compatibility and how they benefit users in this area.
> > I am only curious whether there are any other compatibility limitations for Hive users after this FLIP. Or can I say that Hive compatibility is completely resolved after this FLIP?
> > I am interested in the ultimate goal in this area. Maybe it is out of this FLIP's scope, but I would still like some insights from you if possible. :)
> >
> > Best,
> > Zhijiang
> >
> >
> > ------------------------------------------------------------------
> > From: Rui Li <[email protected]>
> > Send Time: Thursday, December 10, 2020, 16:46
> > To: dev <[email protected]>
> > Subject: Re: [DISCUSS] FLIP-152: Hive Query Syntax Compatibility
> >
> > Thanks Kurt for your inputs!
> >
> > I agree we should extend Hive code to support non-Hive tables. I have updated the wiki page to remove the limitations you mentioned, and added typical use cases to the "Motivation" section.
> >
> > Regarding comment #b, the interface is defined in flink-table-planner-blink and only used by the blink planner. So I think "BlinkParserFactory" is a better name, WDYT?
> >
> > On Mon, Dec 7, 2020 at 12:28 PM Kurt Young <[email protected]> wrote:
> >
> > > Thanks Rui for starting this discussion.
> > >
> > > I can see the benefit of improving Hive compatibility further, as quite a few users are asking for this feature on the mailing lists [1][2][3] and in online chat tools such as DingTalk.
> > >
> > > I have 3 comments regarding the design doc:
> > >
> > > a) Could you add a section describing the typical use cases you want to support after this feature is introduced? That way, users can get an impression of how to use this feature and what the behavior and outcome will be.
> > >
> > > b) Regarding the naming: "BlinkParserFactory", I suggest renaming it to "FlinkParserFactory".
> > >
> > > c) About the two limitations you mentioned:
> > >     1. Only works with Hive tables and the current catalog needs to be a HiveCatalog.
> > >     2. Queries cannot involve tables/views from multiple catalogs.
> > > I assume this is because the Hive parser and analyzer don't support referring to a name in the "x.y.z" fashion? Since we can control all the behaviors by leveraging the code Hive currently uses, is it possible to remove such limitations? The reason is that I'm not sure users can make the whole story work purely depending on the Hive catalog (that's the reason why I gave comment #a). If multiple catalogs are involved, with this limitation I don't think any meaningful pipeline could be built. For example, users may want to stream data from Kafka to Hive while fully using Hive's dialect, including the query part. The Kafka table could be a temporary table or saved in the default memory catalog.
> > >
> > >
> > > [1] http://apache-flink.147419.n8.nabble.com/calcite-td9059.html#a9118
> > > [2] http://apache-flink.147419.n8.nabble.com/hive-sql-flink-11-td9116.html
> > > [3] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-to-in-Flink-to-support-below-HIVE-SQL-td34162.html
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Wed, Dec 2, 2020 at 10:02 PM Rui Li <[email protected]> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I'd like to start a discussion about providing HiveQL compatibility for users connecting to a hive warehouse. FLIP-123 has already covered most DDLs. So now it's time to complement the other big missing part -- queries. With FLIP-152, the hive dialect covers more scenarios and makes it even easier for users to migrate to Flink. More details are in the FLIP wiki page [1]. Looking forward to your feedback!
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
> > > >
> > > > --
> > > > Best regards!
> > > > Rui Li
> > > >
> > >
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
> >
>
> --
> Best regards!
> Rui Li
>

--
Best, Jingsong Lee
