Hi Vitalii:

  Glad to hear that you are also looking at this part. Let's keep the
discussion under that Jira.

On Fri, Jun 29, 2018 at 1:27 AM Vitalii Diravka <[email protected]>
wrote:

> Hi Weijie,
>
> Thanks for bringing this topic up!
>
> Basically you are right, Hive Metastore is one of the best candidates for
> storing Drill's metadata.
> It would also be good to add an abstraction that allows other kinds of
> tools to be implemented and used as the Metastore.
> The question of Metastore performance can be important, especially for light
> Drill tables.
>
> Currently Vova and I are working on a proposal for the metastore.
> I have created Jira DRILL-6552 [1] where all the related discussions can be
> held.
>
> [1] https://issues.apache.org/jira/browse/DRILL-6552
>
> Kind regards
> Vitalii
>
>
> On Thu, Jun 28, 2018 at 6:49 PM Arina Yelchiyeva <
> [email protected]>
> wrote:
>
> > Hi,
> >
> > Vitalii and Vova are also looking at this part, you might want to sync up
> > with them. Or even better, we can create a Jira for this and hold all
> > discussions there.
> > Vitalii, what do you think?
> >
> > Kind regards,
> > Arina
> >
> > On Thu, Jun 28, 2018 at 6:46 PM weijie tong <[email protected]>
> > wrote:
> >
> > > > Hi all:
> > > >
> > > >     As @aman once pointed out to me, the roadmap of DRILL-2.0 includes
> > > > a description of the metadata design (
> > > > https://lists.apache.org/thread.html/74cf48dd78d323535dc942c969e72008884e51f8715f4a20f6f8fb66@%3Cdev.drill.apache.org%3E
> > > > ). I am interested in taking on the implementation of the metadata
> > > > part, and I am starting this discussion thread to hear your ideas
> > > > about this problem.
> > >
> > >     I have investigated some open source project about the metadata
> ,such
> > > as Hive Metastore (
> > >
> https://cwiki.apache.org/confluence/display/Hive/Design#Design-Metastore
> > )
> > > ,Netflix metacat, Apache Atlas,LinkedIn WhereHows(
> > > https://github.com/linkedin/WhereHows)  ;  Except Hive Metastore,
> other
> > > projects have an high abstract definition to the actual physical
> metadata
> > > which will benefit to extend to add new metadata property. Hive
> > Metastore‘s
> > > design is to the physical metadata , also with thrift interface to
> > > different languages, but depend on the relational database  not good to
> > > scale and performance.   To my opinion , I would prefer Hive Metastore
> as
> > > our design template or just reuse it, as we don't need to do a rich
> > > metadata management system. Maybe we should change the backend database
> > to
> > > a high query performance kv store like Hbase.
> > >
> > >    Besides the metadata interface design and the backend storage
> chosen,
> > we
> > > should also provide the random query ability . So users can calculate
> the
> > > statistics like NDV to store in the metadata. Btw, maybe we can go
> > further
> > > to take in the Verdictdb  (https://github.com/mozafari/verdictdb) to
> > > provide more richful approximate query processing .
> > >
> >
>
