That's a good question.
Yes. And the method getMetadataQuery from RelOptRuleCall also retrieves 
MetadataQuery from RelOptCluster.
Any other better places?

- Haisheng

------------------------------------------------------------------
发件人:Julian Hyde<jh...@apache.org>
日 期:2019年10月16日 11:02:33
收件人:dev<dev@calcite.apache.org>
主 题:Re: [DISCUSSION] Extension of Metadata Query

Yes, but why RelOptCluster? Besides the fact that there happened to be a method 
there.


> On Oct 15, 2019, at 7:30 PM, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
> 
> > Is there a good rationale for attaching one to RelOptCluster
> Customized RelMetadataQuery with code generated meta handler for customized 
> metadata, also can provide convenient way to get metadata.
> 
> - Haisheng
> 
> ------------------------------------------------------------------
> 发件人:Julian Hyde<jh...@apache.org>
> 日 期:2019年10月16日 06:52:39
> 收件人:<dev@calcite.apache.org>
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> Having reviewed https://issues.apache.org/jira/browse/CALCITE-3403 
> <https://issues.apache.org/jira/browse/CALCITE-3403> and 
> https://issues.apache.org/jira/browse/CALCITE-2018 
> <https://issues.apache.org/jira/browse/CALCITE-2018> today I am worried about 
> the direction RelMetadataQuery is taking. Is there a good rationale for 
> attaching one to RelOptCluster? (Besides “it seems to work”?)
> 
> The natural lifespan of a RelMetadataQuery is a RelOptCall (because the 
> structure of the tree is constant for the duration of a call).
> 
> As for sub-classing RelMetadataQuery, I really don’t know. It’s probably 
> harmless, possibly disastrous, and it takes a lot of effort to determine 
> which. We have to be really cautious with RelMetadataQuery because it is one 
> of the few components in Calcite that mutate.
> 
> Julian
> 
> 
> 
> 
> > On Oct 15, 2019, at 1:42 PM, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
> > 
> > I would like to bring this discussion up again.
> > 
> > We created our own metadata provider, and set it to RelMetadataQuery by 
> > calling
> > RelMetadataQuery.THREAD_PROVIDERS.set()
> > If we just use all the types of metadata provided by Calcite, it should be 
> > fine. But given that we have created new types of metadata, and we want to 
> > get them during rule transformation, we want to have the rule call return 
> > our own RelMetadataQuery. In order to achieve that, we sub-classed 
> > RelMetadataQuery, and sub-classed RelOptCluster to make metadata query 
> > overridable, and return the instance of subclass of RelMetadataQuery when 
> > the mq is null. Indeed, we can achieve that goal by just call 
> > rel.metadata(), but we want a efficient and convenient way to do that. That 
> > is why we want RelMetadataQuery in RelOptCluster customizable. 
> > Is it a good practice? If not, any other better approach to suggest?
> > Thanks.
> > Haisheng
> > 
> > ------------------------------------------------------------------
> > 发件人:Stamatis Zampetakis<zabe...@gmail.com>
> > 日 期:2019年06月15日 15:47:58
> > 收件人:<dev@calcite.apache.org>
> > 主 题:Re: [DISCUSSION] Extension of Metadata Query
> > 
> > I had a look on CALCITE-1147 (which added support for custom providers and
> > metadata) and based on the added tests [1], it seems that the main
> > alternative to RelMetadataQuery is indeed through the MetadataFactory. One
> > other alternative I can think of, is to pass a custom RelMetadataQuery (or
> > a query factory) through the context of the planner [2].
> > 
> > Given that we have already methods to obtain a RelMetadataQuery in
> > RelOptRuleCall and RelOptCluster it may make sense to allow the type of
> > query to be pluggable. However, I have not really used this mechanism in
> > the past so I cannot be sure what's the best way to proceed.
> > 
> > Best,
> > Stamatis
> > 
> > [1]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
> > [2]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256
> > 
> > On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen <yuzhao....@gmail.com> wrote:
> > 
> >> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> >> which I believe is the most common routine to query metadata during
> >> planning, I’m a little confused for your saying
> >> 
> >>> The methods in RelMetadataQuery are for convenience only.
> >> 
> >> If these methods in RelMetadataQuery are only for convenience, what is the
> >> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> >> ? (Users would never expect to use this because it’s based on Java
> >> Reflection and has performance problem)
> >> 
> >> [1]
> >> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
> >> 
> >> Best,
> >> Danny Chan
> >> 在 2019年6月12日 +0800 AM6:27,Julian Hyde <jh...@apache.org>,写道:
> >>> The goal is that it should be possible to add a new kind of metadata.
> >> The methods in RelMetadataQuery are for convenience only. So you should be
> >> able to use your new kind of metadata, and existing ones, without modifying
> >> RelMetadataQuery.
> >>> 
> >>> If that is not possible, it’s a bug, and you should log a JIRA case.
> >>> 
> >>> Given how the methods re-generate handlers (by catching exceptions and
> >> modifying mutable private members) I can see that new kinds of metadata
> >> might be difficult to add.
> >>> 
> >>> Julian
> >>> 
> >>> 
> >>>> On Jun 9, 2019, at 7:54 PM, Yuzhao Chen <yuzhao....@gmail.com> wrote:
> >>>> 
> >>>> Thanks, Stamatis
> >>>> 
> >>>> You are right, this discussion came up due to CALCITE-2885, it is not
> >> about the performance problem, it is the extension of RelMetadataQuery,
> >> because we add all kinds of top interfaces in RelMetadataQuery, e.g. the
> >> row count query [1].
> >>>> 
> >>>> When we add a new RelMetadataProvider, a corresponding
> >> interface/method should be added into RelMetadataQuery, but for current
> >> RelOpeCluster impl, we could not do that(with a always default instance)
> >>>> 
> >>>> As for the methods in RelMetadataProvider, i only saw the usage of
> >> #handlers in RelMetadataQuery, and saw the reference you pase, it seems not
> >> that relevant with this topic. What do you think ?
> >>>> 
> >>>> [1]
> >> https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> >>>> 
> >>>> Best,
> >>>> Danny Chan
> >>>> 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis <zabe...@gmail.com>,写道:
> >>>>> Thanks for bringing this up Danny.
> >>>>> 
> >>>>> I guess the discussion came up due to CALCITE-2885 [1]. Looking back
> >> into
> >>>>> it, I am not sure that the intention there is to make the
> >> RelMetadataQuery
> >>>>> pluggable. We could possibly solve the performance problem without
> >>>>> extending the RelMetadataQuery. We have to look again into this case.
> >>>>> 
> >>>>> For more details regarding the existence of the two methods in
> >>>>> RelMetadataProvider have a look in CALCITE-604 [2]. More general for
> >> the
> >>>>> design of RelMetadataProvider you may find useful the description in
> >> [3].
> >>>>> 
> >>>>> Best,
> >>>>> Stamatis
> >>>>> 
> >>>>> [1] https://issues.apache.org/jira/browse/CALCITE-2885
> >>>>> [2] https://issues.apache.org/jira/browse/CALCITE-604
> >>>>> [3]
> >>>>> 
> >> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
> >>>>> 
> >>>>> On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde <jh...@apache.org> wrote:
> >>>>> 
> >>>>>>> 1. Why we have 2 methods in RelMetadataProvider?
> >>>>>> 
> >>>>>> The metadata system is complicated. We need to allow multiple
> >> handlers
> >>>>>> for any given call. So, making a metadata call involves multiple
> >>>>>> dispatch [1] based on the metadata method being called, the type of
> >>>>>> RelNode, and the handlers that are registered. Also it needs to
> >> cache
> >>>>>> results, and detect cycles. And the dispatch needs to be
> >> efficient, so
> >>>>>> we generate janino code to do the dispatch, and re-generate when
> >> new
> >>>>>> handlers or sub-classes of RelNode are added.
> >>>>>> 
> >>>>>> I forget details, the two methods are basically required to allow
> >> us
> >>>>>> to generate code to do the dispatch.
> >>>>>> 
> >>>>>>> 2. We should make the RelMetadataQuery in RelOptCluster
> >> pluggable.
> >>>>>> 
> >>>>>> I disagree. RelMetadataQuery is not an extension point. Its sole
> >>>>>> purpose is to to keep state (the cache and cycle-checking).
> >>>>>> RelMetadataProvider is the extension point. (By analogy, if you are
> >>>>>> un-parsing an AST, you let each AST sub-class handle unparsing
> >> itself,
> >>>>>> but the unparsed text goes to a simple StringBuilder.
> >> RelMetadataQuery
> >>>>>> is in the role of the StringBuilder. In a complex system, it is
> >> nice
> >>>>>> to keep some of the components simple, or at least keep them to
> >>>>>> prescribed roles.)
> >>>>>> 
> >>>>>> Julian
> >>>>>> 
> >>>>>> [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> >>>>>> 
> >>>>>> On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen <yuzhao....@gmail.com>
> >> wrote:
> >>>>>>> 
> >>>>>>> Currently we provide answer to metadata query through
> >>>>>> RelMetadataProvider [1], there are some sub-classes of it:
> >>>>>>> 
> >>>>>>> RelMetadataProvider
> >>>>>>> |
> >>>>>>> |- VolcanoRelMetadataProvider
> >>>>>>> |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> >>>>>>> |- HepRelMetadataProvider
> >>>>>>> |- CachingRelMetadataProvider
> >>>>>>> |- ReflectiveRelMetadataProvider
> >>>>>>> |- JaninoRelMetadataProvider
> >>>>>>> 
> >>>>>>> The RelMetadataProvider has two methods: #apply and #handlers,
> >> the
> >>>>>> #apply method seems a programming interface and there is a demo
> >> code how we
> >>>>>> can use it:
> >>>>>>> 
> >>>>>>> RelMetadataProvider provider;
> >>>>>>> LogicalFilter filter;
> >>>>>>> RexNode predicate;
> >>>>>>> Function<RelNode, Metadata> function =
> >>>>>>> provider.apply(LogicalFilter.class, Selectivity.class};
> >>>>>>> Selectivity selectivity = function.apply(filter);
> >>>>>>> Double d = selectivity.selectivity(predicate);
> >>>>>>> 
> >>>>>>> But let's see our RelOptCluster's member variables[2], there are
> >>>>>> MetadataFactory and RelMetadataQuery which all can be used to
> >> query the
> >>>>>> metadata, for MetadataFactory, there is a default impl named
> >>>>>> MetadataFactoryImpl which will invoke RelMetadataProvider#apply
> >> internally,
> >>>>>> for RelMetadataQuery, it will invoke RelMetadataProvider#handlers
> >> (finally
> >>>>>> composed and codeden by JaninoRelMetadataProvider).
> >>>>>>> 
> >>>>>>> In our planning phrase, we can invoke
> >> RelOptRuleCall#getMetadataQuery to
> >>>>>> get the MQ and query the metadata.
> >>>>>>> 
> >>>>>>> For extension of metadata handlers, we can set our customized
> >>>>>> RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery,
> >> we have
> >>>>>> no way to extend it now, because the RelOptCluster always has a
> >> singleton
> >>>>>> instance [4] which is only the default implementation.
> >>>>>>> 
> >>>>>>> 
> >>>>>>> My question is as follows:
> >>>>>>> 
> >>>>>>> 1. Why we have 2 methods in RelMetadataProvider, and why we need
> >> the
> >>>>>> MetadataFactory and RelMetadataProvider#apply ? It seems that it's
> >> function
> >>>>>> is already been overriden by RelMetadataQuery(The difference is
> >> that
> >>>>>> MetadataFactory use Reflection and RelMetadataQuery use gened
> >> bytes code).
> >>>>>>> 2. We should make the RelMetadataQuery in RelOptCluster
> >> pluggable.
> >>>>>>> 
> >>>>>>> 
> >>>>>>> [1]
> >>>>>> 
> >> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> >>>>>>> [2]
> >>>>>> 
> >> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
> >>>>>>> [3]
> >>>>>> 
> >> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
> >>>>>>> [4]
> >>>>>> 
> >> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
> >>>>>>> 
> >>>>>>> 
> >>>>>>> 
> >>>>>>> Best,
> >>>>>>> Danny Chan
> >>>>>> 
> >>> 
> >> 
> > 
> 
> 


Reply via email to