On 2017-06-30 07:56 (-0700), Alan Gates <alanfga...@gmail.com> wrote:
> A few of us have been talking and have come to the conclusion that it would be
> a good thing to split out the Hive metastore into its own Apache project.
> Below and in the linked wiki page we explain what we see as the advantages
> to this and how we would go about it.
>
> Hive's metastore has long been used by other projects in the Hadoop
> ecosystem to store and access metadata. Apache Impala, Apache Spark,
> Apache Drill, Presto, and other systems all use Hive's metastore. Some,
> like Impala and Presto, can use it as their own metadata system with the
> rest of Hive not present.
>
> This sharing is excellent for the ecosystem. Together with HDFS it allows
> users to use the tool of their choice while still accessing the same shared
> data. But having this shared metadata inside the Hive project limits the
> ability of other projects to contribute to the metastore. It also makes it
> harder for new systems that have similar but not identical metadata
> requirements (for example, stream processing systems on top of Apache
> Kafka) to use Hive's metastore. This difficulty for other systems comes
> out in two ways. First, it is hard for non-Hive community members to
> participate in the project. Second, it adds operational cost since users
> are forced to deploy all of the Hive jars just to get the metastore to work.
>
> Therefore we propose to split Hive's metastore out into a separate Apache
> project. This new project will continue to support the same Thrift API as
> the current metastore. It will continue to focus on being a high
> performance, fault tolerant, large scale, operational metastore for SQL
> engines and other systems that want to store schema information about their
> data.
>
> By making it a separate project we will enable other projects to join us in
> innovating on the metastore. It will simplify operations for non-Hive
> users that want to use the metastore as they will no longer need to install
> Hive just to get the metastore. And it will attract new projects that
> might otherwise feel the need to solve their metadata problems on their own.
>
> Any Hive PMC member or committer will be welcome to join the new project at
> the same level. We propose this project go straight to a top level
> project. Given that the initial PMC will be formed from experienced Hive
> PMC members we do not believe incubation will be necessary. (Note that the
> Apache board will need to approve this.)
>
> Obviously there are many details involved in a proposal like this. Rather
> than make this a ten page email we have filled out many of the details in a
> wiki page:
> https://cwiki.apache.org/confluence/display/Hive/Metastore+TLP+Proposal
>
> Yongzhi Chen
> Vihang Karajgaonkar
> Sergio Pena
> Sahil Takiar
> Aihua Xu
> Gunther Hagleitner
> Thejas Nair
> Alan Gates
>
+1 (from Apache Impala's (incubating) perspective)

Dimitris