Re: [DISCUSS] Separating out the metastore as its own TLP

Dain Sundstrom Mon, 03 Jul 2017 10:18:05 -0700

+1

I work on Presto and I think this is the right direction for our users.  We have 
several users running Presto without Hive, and anything we can do to help 
simplify the Metastore experience would be a big help.


When I read proposals like this, one thing I like to see is a vision (scope) 
for the project.  In this case, I’d like to understand if the plan is to limit 
the scope of the system to what Hive can support.  For example, the system will 
clearly support schemas (databases) with tables and views as defined by Hive, 
but will there be support for additional types like a Presto view, which is 
incompatible with Hive views due to the language differences?  Currently, in 
Presto we create a Hive view to reserve a spot in the "tables namespace", and 
then we put our view data in the table properties.  I would like to formalize 
this kind of system, so if a Hive user queries a Presto view, they get a proper 
error message. I have similar concerns about data types, compression, and data 
organization (e.g., different bucketing strategies). 
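
To make the view case concrete, here is a minimal Python sketch of the idea: 
reserve the name as a view, record the owning engine in the table properties, 
and fail with a clear message when another engine opens it. Everything in it 
is hypothetical and invented for illustration (the TableEntry class, the 
"view.dialect" and "view.definition" keys, the function names); it is not the 
actual Hive Metastore API and not how Presto encodes its views today.

```python
from dataclasses import dataclass, field


class UnsupportedViewError(Exception):
    """Raised when an engine opens a view written for a different engine."""


@dataclass
class TableEntry:
    # Hypothetical, simplified stand-in for a metastore table record.
    name: str
    table_type: str  # e.g. "MANAGED_TABLE" or "VIRTUAL_VIEW"
    properties: dict = field(default_factory=dict)


# Hypothetical property keys, chosen for this sketch only.
VIEW_DIALECT = "view.dialect"
VIEW_DEFINITION = "view.definition"


def create_presto_view(name: str, sql: str) -> TableEntry:
    # Reserve the name in the tables namespace as a view, and keep the
    # engine-specific definition in the table properties.
    return TableEntry(
        name=name,
        table_type="VIRTUAL_VIEW",
        properties={VIEW_DIALECT: "presto", VIEW_DEFINITION: sql},
    )


def check_view_dialect(entry: TableEntry, engine: str) -> None:
    # Assumption of this sketch: a view with no dialect marker is a Hive view.
    dialect = entry.properties.get(VIEW_DIALECT, "hive")
    if dialect != engine:
        raise UnsupportedViewError(
            f"{entry.name} is a {dialect} view and cannot be queried from {engine}"
        )
```

With a formal marker like this, a Hive client calling 
check_view_dialect(view, "hive") on a Presto view would get a readable error 
instead of a parse failure on SQL it does not understand.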

Another aspect of this is the vision for the specification of the Metastore.  
Is the vision to have a very open, end-user-extensible design (e.g., 
just a name and a bag of properties), or is the vision to have a 
project-specified common set of properties with “rules” for proper extension?
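
The difference between the two options can be sketched in a few lines of 
Python. The common key names and the "owner.key" prefix rule below are 
assumptions made up for this sketch, not anything the Metastore defines; the 
point is only to contrast "accept any bag of strings" with "accept common keys 
plus extensions that follow a stated rule".

```python
# Hypothetical common keys and extension rule, invented for this sketch.
COMMON_KEYS = {"format", "compression", "bucketing"}


def validate_open(properties: dict) -> list:
    # Open-ended design: any string key with a string value is accepted.
    # Returns the keys that are not plain string pairs.
    return [
        key
        for key, value in properties.items()
        if not (isinstance(key, str) and isinstance(value, str))
    ]


def validate_with_rules(properties: dict) -> list:
    # Rule-based design: a key is either one of the common keys or an
    # extension written as "<owner>.<name>", e.g. "presto.view".
    # Returns the keys that break the rule.
    problems = []
    for key in properties:
        if key in COMMON_KEYS:
            continue
        owner, separator, name = key.partition(".")
        if not (owner and separator and name):
            problems.append(key)
    return problems
```

Under the open design, a property such as "mystery" passes unnoticed; under 
the rule-based design it is reported, while "compression" and "presto.view" 
are both accepted.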

I would also be very interested in documentation for the Metastore APIs (and 
can help). We currently reverse engineer proper metastore interaction by 
reading the Hive code and writing a lot of experimental programs, and I would 
really just like to know the "right way".  Also, we end up missing out on new 
features in the Metastore due to the work required to understand how they work.

-dain
