Thanks, Shaoxuan! I sent a Chinese version to user-zh at the same time yesterday.
From the feedback we have received so far, supporting multiple older Hive versions is definitely one of our next focuses.

*More feedback is welcome from our community!*

On Tue, Mar 19, 2019 at 8:44 PM Shaoxuan Wang <wshaox...@gmail.com> wrote:

> Hi Bowen,
> Thanks for driving this. I am CCing this email/survey to
> user-zh@flink.apache.org as well.
> I have heard there is a lot of interest in Flink-Hive from the field. One
> of the biggest requests Hive users have raised is "support for
> out-of-date Hive versions". A large number of users are still working on
> clusters where CDH/HDP is installed with an old Hive version, say
> 1.2.1/2.1.1. We need to ensure support for these Hive versions when
> planning the work on Flink-Hive integration.
>
> *@all. "We want to get your feedback on Flink-Hive integration."*
>
> Regards,
> Shaoxuan
>
> On Wed, Mar 20, 2019 at 7:16 AM Bowen Li <bowenl...@gmail.com> wrote:
>
>> Hi Flink users and devs,
>>
>> We want to get your feedback on integrating Flink with Hive.
>>
>> Background: At Flink Forward in Beijing last December, the community
>> announced an effort to integrate Flink and Hive. At the Feb 21 Seattle
>> Flink Meetup <https://www.meetup.com/seattle-flink/events/258723322/>,
>> we presented Integrating Flink with Hive
>> <https://www.slideshare.net/BowenLi9/integrating-flink-with-hive-xuefu-zhang-and-bowen-li-seattle-flink-meetup-feb-2019>
>> with a live demo to the local community and got a great response. As of
>> mid-March, we have internally finished building Flink's brand-new
>> catalog infrastructure, metadata integration with Hive, and the most
>> common cases of Flink reading/writing against Hive, and we will start
>> to submit more design docs/FLIPs and contribute code back to the
>> community. The reason for doing it internally first and then in the
>> community is to ensure our proposed solutions are fully validated and
>> tested, to gain hands-on experience, and not to miss anything in the
>> design. You are very welcome to join this effort, from design/code
>> review to development and testing.
>>
>> *The most important thing we believe you, our Flink users/devs, can
>> help with RIGHT NOW is to share your Hive use cases and give us
>> feedback on this project. As we start to go deeper into specific areas
>> of integration, your feedback and suggestions will help us refine our
>> backlog and prioritize our work, and you can get the features you want
>> sooner!* For example, if most users are mainly only reading Hive data,
>> then we can prioritize tuning read performance over implementing write
>> capability.
>>
>> A quick review of what we've finished building internally and is ready
>> to contribute back to the community:
>>
>>    - Flink/Hive Metadata Integration
>>       - Unified, pluggable catalog infra that manages meta-objects,
>>       including catalogs, databases, tables, views, functions,
>>       partitions, and table/partition stats
>>       - Three catalog impls: an in-memory catalog, HiveCatalog for
>>       embracing the Hive ecosystem, and GenericHiveMetastoreCatalog for
>>       persisting Flink's streaming/batch metadata in the Hive metastore
>>       - Hierarchical metadata reference as
>>       <catalog_name>.<database_name>.<metaobject_name> in SQL and
>>       Table API
>>       - Unified function catalog based on the new catalog infra, which
>>       also supports Hive simple UDFs
>>    - Flink/Hive Data Integration
>>       - Hive data connector that reads partitioned/non-partitioned Hive
>>       tables, and supports partition pruning, both simple and complex
>>       Hive data types, and basic writes
>>    - More powerful SQL Client, fully integrated with the above features,
>>    and more Hive-compatible SQL syntax for a better end-to-end SQL
>>    experience
>>
>> *Given the above info, we want to learn from you: How do you use Hive
>> currently? How can we solve your pain points? What features do you
>> expect from Flink-Hive integration? Those can be details like:*
>>
>>    - *Which Hive version are you using? Do you plan to upgrade Hive?*
>>    - *Are you planning to switch your Hive engine? What timeline are
>>    you looking at? Which capabilities does Flink need to have before
>>    you will consider using Flink with Hive?*
>>    - *What's your motivation to try Flink-Hive? Maintaining only one
>>    data processing system across your teams for simplicity and
>>    maintainability? Better performance of Flink over Hive itself?*
>>    - *What are your Hive use cases? How large is your Hive data? Do you
>>    mainly do reading, or both reading and writing?*
>>    - *How many Hive user-defined functions do you have? Are they mostly
>>    UDF, GenericUDF, UDTF, or UDAF?*
>>    - Any questions or suggestions you have, or something as simple as
>>    how you feel about the project
>>
>> Again, your input will be really valuable to us, and we hope that, with
>> all of us working together, the project can benefit our end users.
>> Please feel free to either reply to this thread or just to me. I'm also
>> working on creating a questionnaire to better gather your feedback;
>> please watch the mailing list over the next couple of days.
>>
>> Thanks,
>> Bowen