+1
It will help to rely on that code in the process of implementing Drill
Metastore, DRILL-6552.

@Gautam Please address all current commits and rebase onto latest master,
then Vova and I will do an additional review of it.
Just to clarify: am I right that the state of the changes is the same as in
the last comment on DRILL-1328 [1]
(that is, they will not include histograms and will cause some regressions
on the TPC-H and TPC-DS benchmarks)?

[1]
https://issues.apache.org/jira/browse/DRILL-1328?focusedCommentId=16061374&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16061374


Kind regards
Vitalii


On Tue, Nov 6, 2018 at 1:47 AM Parth Chandra <par...@apache.org> wrote:

> +1
> I'd say go for it.
> If the option to use enhanced stats can be turned on per session, then users
> can experiment and choose to turn it on for queries where they do not
> experience performance degradation.
>
>
> On Fri, Nov 2, 2018 at 3:25 PM Gautam Parai <gpa...@mapr.com> wrote:
>
> > Hi all,
> >
> > I had an initial implementation for statistics support for Drill
> > [DRILL-1328] <https://issues.apache.org/jira/browse/DRILL-1328>. This
> > JIRA has links to the design spec as well as the PR. Unfortunately,
> > because of some regressions on performance benchmarks (TPCH/TPCDS) we
> > decided to temporarily shelve the implementation. I would like to resolve
> > the pending issues and get the changes in.
> >
> > Hopefully, it will be okay to merge it in as an experimental feature,
> > since in order to resolve these issues we may need to change the existing
> > join ordering algorithm in Drill, add support for histograms, and address
> > a few other planning-related issues. Moreover, the community is adding a
> > meta-store for Drill [DRILL-6552]
> > <https://issues.apache.org/jira/browse/DRILL-6552>. Statistics should
> > also be able to leverage the brand new meta-store instead of, or in
> > addition to, having a custom store implementation.
> >
> > My plan is to address the most critical review comments and get the
> > initial version in as an experimental feature. Some other good-to-have
> > aspects, like handling schema changes during the statistics collection
> > process, may be deferred to the next iteration. Subsequently, I will work
> > on these good-to-have features and on additional performance
> > improvements. It would be great to get the initial implementation in to
> > avoid rebase issues and allow other community members to use and
> > contribute to the feature.
> >
> > Please take a look at the design doc and the PR and provide suggestions
> > and feedback on the JIRA. Also, I will try to present the current state
> > of statistics and the feature in one of the bi-weekly Drill Community
> > Hangouts.
> >
> > Thanks,
> > Gautam
> >
>
