+1. I work on Presto, and I think this is the right direction for our users. We have several users running Presto without Hive, and anything we can do to simplify the Metastore experience would be a big help.
When I read proposals like this, one thing I like to see is a vision (scope) for the project. In this case, I'd like to understand whether the plan is to limit the scope of the system to what Hive can support. For example, the system will clearly support schemas (databases) with tables and views as defined by Hive, but will there be support for additional types, such as a Presto view, which is incompatible with a Hive view due to the language differences? Currently, in Presto we create a Hive view to reserve a spot in the "tables namespace", and then we put our view data in table properties. I would like to formalize this kind of system, so that if a Hive user queries a Presto view, they get a proper error message. I have similar concerns about data types, compression, and data organization (e.g., different bucketing strategies).

Another aspect of this is the vision for the specification of the Metastore. Is the vision to have a very open, end-user-extensible design (e.g., just a name and a bag of properties), or is it to have a project-specified common set of properties with "rules" for proper extension?

I would also be very interested in documentation for the Metastore APIs (and can help). We currently reverse engineer proper Metastore interaction by reading the Hive code and writing a lot of experimental programs, and I would really just like to know the "right way". Also, we end up missing out on new features in the Metastore because of the work required to understand how they work.

-dain