I would like to bring this discussion up again.

We created our own metadata provider, and set it to RelMetadataQuery by calling
RelMetadataQuery.THREAD_PROVIDERS.set()
If we just use all the types of metadata provided by Calcite, it should be 
fine. But given that we have created new types of metadata, and we want to get 
them during rule transformation, we want to have the rule call return our own 
RelMetadataQuery. In order to achieve that, we sub-classed RelMetadataQuery, 
and sub-classed RelOptCluster to make metadata query overridable, and return 
the instance of subclass of RelMetadataQuery when the mq is null.  Indeed, we 
can achieve that goal by just call rel.metadata(), but we want a efficient and 
convenient way to do that. That is why we want RelMetadataQuery in 
RelOptCluster customizable. 
Is it a good practice? If not, any other better approach to suggest?
Thanks.
Haisheng

------------------------------------------------------------------
发件人:Stamatis Zampetakis<zabe...@gmail.com>
日 期:2019年06月15日 15:47:58
收件人:<dev@calcite.apache.org>
主 题:Re: [DISCUSSION] Extension of Metadata Query

I had a look on CALCITE-1147 (which added support for custom providers and
metadata) and based on the added tests [1], it seems that the main
alternative to RelMetadataQuery is indeed through the MetadataFactory. One
other alternative I can think of, is to pass a custom RelMetadataQuery (or
a query factory) through the context of the planner [2].

Given that we have already methods to obtain a RelMetadataQuery in
RelOptRuleCall and RelOptCluster it may make sense to allow the type of
query to be pluggable. However, I have not really used this mechanism in
the past so I cannot be sure what's the best way to proceed.

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
[2]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256

On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen <yuzhao....@gmail.com> wrote:

> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> which I believe is the most common routine to query metadata during
> planning, I’m a little confused for your saying
>
> >The methods in RelMetadataQuery are for convenience only.
>
> If these methods in RelMetadataQuery are only for convenience, what is the
> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> ? (Users would never expect to use this because it’s based on Java
> Reflection and has performance problem)
>
> [1]
> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
>
> Best,
> Danny Chan
> 在 2019年6月12日 +0800 AM6:27,Julian Hyde <jh...@apache.org>,写道:
> > The goal is that it should be possible to add a new kind of metadata.
> The methods in RelMetadataQuery are for convenience only. So you should be
> able to use your new kind of metadata, and existing ones, without modifying
> RelMetadataQuery.
> >
> > If that is not possible, it’s a bug, and you should log a JIRA case.
> >
> > Given how the methods re-generate handlers (by catching exceptions and
> modifying mutable private members) I can see that new kinds of metadata
> might be difficult to add.
> >
> > Julian
> >
> >
> > > On Jun 9, 2019, at 7:54 PM, Yuzhao Chen <yuzhao....@gmail.com> wrote:
> > >
> > > Thanks, Stamatis
> > >
> > > You are right, this discussion came up due to CALCITE-2885, it is not
> about the performance problem, it is the extension of RelMetadataQuery,
> because we add all kinds of top interfaces in RelMetadataQuery, e.g. the
> row count query [1].
> > >
> > > When we add a new RelMetadataProvider, a corresponding
> interface/method should be added into RelMetadataQuery, but for current
> RelOpeCluster impl, we could not do that(with a always default instance)
> > >
> > > As for the methods in RelMetadataProvider, i only saw the usage of
> #handlers in RelMetadataQuery, and saw the reference you pase, it seems not
> that relevant with this topic. What do you think ?
> > >
> > > [1]
> https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> > >
> > > Best,
> > > Danny Chan
> > > 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis <zabe...@gmail.com>,写道:
> > > > Thanks for bringing this up Danny.
> > > >
> > > > I guess the discussion came up due to CALCITE-2885 [1]. Looking back
> into
> > > > it, I am not sure that the intention there is to make the
> RelMetadataQuery
> > > > pluggable. We could possibly solve the performance problem without
> > > > extending the RelMetadataQuery. We have to look again into this case.
> > > >
> > > > For more details regarding the existence of the two methods in
> > > > RelMetadataProvider have a look in CALCITE-604 [2]. More general for
> the
> > > > design of RelMetadataProvider you may find useful the description in
> [3].
> > > >
> > > > Best,
> > > > Stamatis
> > > >
> > > > [1] https://issues.apache.org/jira/browse/CALCITE-2885
> > > > [2] https://issues.apache.org/jira/browse/CALCITE-604
> > > > [3]
> > > >
> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
> > > >
> > > > On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde <jh...@apache.org> wrote:
> > > >
> > > > > > 1. Why we have 2 methods in RelMetadataProvider?
> > > > >
> > > > > The metadata system is complicated. We need to allow multiple
> handlers
> > > > > for any given call. So, making a metadata call involves multiple
> > > > > dispatch [1] based on the metadata method being called, the type of
> > > > > RelNode, and the handlers that are registered. Also it needs to
> cache
> > > > > results, and detect cycles. And the dispatch needs to be
> efficient, so
> > > > > we generate janino code to do the dispatch, and re-generate when
> new
> > > > > handlers or sub-classes of RelNode are added.
> > > > >
> > > > > I forget details, the two methods are basically required to allow
> us
> > > > > to generate code to do the dispatch.
> > > > >
> > > > > > 2. We should make the RelMetadataQuery in RelOptCluster
> pluggable.
> > > > >
> > > > > I disagree. RelMetadataQuery is not an extension point. Its sole
> > > > > purpose is to to keep state (the cache and cycle-checking).
> > > > > RelMetadataProvider is the extension point. (By analogy, if you are
> > > > > un-parsing an AST, you let each AST sub-class handle unparsing
> itself,
> > > > > but the unparsed text goes to a simple StringBuilder.
> RelMetadataQuery
> > > > > is in the role of the StringBuilder. In a complex system, it is
> nice
> > > > > to keep some of the components simple, or at least keep them to
> > > > > prescribed roles.)
> > > > >
> > > > > Julian
> > > > >
> > > > > [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> > > > >
> > > > > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen <yuzhao....@gmail.com>
> wrote:
> > > > > >
> > > > > > Currently we provide answer to metadata query through
> > > > > RelMetadataProvider [1], there are some sub-classes of it:
> > > > > >
> > > > > > RelMetadataProvider
> > > > > > |
> > > > > > |- VolcanoRelMetadataProvider
> > > > > > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > > > > > |- HepRelMetadataProvider
> > > > > > |- CachingRelMetadataProvider
> > > > > > |- ReflectiveRelMetadataProvider
> > > > > > |- JaninoRelMetadataProvider
> > > > > >
> > > > > > The RelMetadataProvider has two methods: #apply and #handlers,
> the
> > > > > #apply method seems a programming interface and there is a demo
> code how we
> > > > > can use it:
> > > > > >
> > > > > > RelMetadataProvider provider;
> > > > > > LogicalFilter filter;
> > > > > > RexNode predicate;
> > > > > > Function<RelNode, Metadata> function =
> > > > > > provider.apply(LogicalFilter.class, Selectivity.class};
> > > > > > Selectivity selectivity = function.apply(filter);
> > > > > > Double d = selectivity.selectivity(predicate);
> > > > > >
> > > > > > But let's see our RelOptCluster's member variables[2], there are
> > > > > MetadataFactory and RelMetadataQuery which all can be used to
> query the
> > > > > metadata, for MetadataFactory, there is a default impl named
> > > > > MetadataFactoryImpl which will invoke RelMetadataProvider#apply
> internally,
> > > > > for RelMetadataQuery, it will invoke RelMetadataProvider#handlers
> (finally
> > > > > composed and codeden by JaninoRelMetadataProvider).
> > > > > >
> > > > > > In our planning phrase, we can invoke
> RelOptRuleCall#getMetadataQuery to
> > > > > get the MQ and query the metadata.
> > > > > >
> > > > > > For extension of metadata handlers, we can set our customized
> > > > > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery,
> we have
> > > > > no way to extend it now, because the RelOptCluster always has a
> singleton
> > > > > instance [4] which is only the default implementation.
> > > > > >
> > > > > >
> > > > > > My question is as follows:
> > > > > >
> > > > > > 1. Why we have 2 methods in RelMetadataProvider, and why we need
> the
> > > > > MetadataFactory and RelMetadataProvider#apply ? It seems that it's
> function
> > > > > is already been overriden by RelMetadataQuery(The difference is
> that
> > > > > MetadataFactory use Reflection and RelMetadataQuery use gened
> bytes code).
> > > > > > 2. We should make the RelMetadataQuery in RelOptCluster
> pluggable.
> > > > > >
> > > > > >
> > > > > > [1]
> > > > >
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> > > > > > [2]
> > > > >
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
> > > > > > [3]
> > > > >
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
> > > > > > [4]
> > > > >
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
> > > > > >
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Danny Chan
> > > > >
> >
>

Reply via email to