Thanks for raising this, Danny. Actually I have the same question too. > RelMetadataQuery is not an extension point. Its sole purpose is to to keep state (the cache and cycle-checking) I think users may extend the RelMetadataQuery if wanting to query more metadata, such as TopK values.
Best, Chunwei On Wed, Jun 5, 2019 at 6:44 AM Stamatis Zampetakis <zabe...@gmail.com> wrote: > Thanks for bringing this up Danny. > > I guess the discussion came up due to CALCITE-2885 [1]. Looking back into > it, I am not sure that the intention there is to make the RelMetadataQuery > pluggable. We could possibly solve the performance problem without > extending the RelMetadataQuery. We have to look again into this case. > > For more details regarding the existence of the two methods in > RelMetadataProvider have a look in CALCITE-604 [2]. More general for the > design of RelMetadataProvider you may find useful the description in [3]. > > Best, > Stamatis > > [1] https://issues.apache.org/jira/browse/CALCITE-2885 > [2] https://issues.apache.org/jira/browse/CALCITE-604 > [3] > > https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/ > > On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde <jh...@apache.org> wrote: > > > > 1. Why we have 2 methods in RelMetadataProvider? > > > > The metadata system is complicated. We need to allow multiple handlers > > for any given call. So, making a metadata call involves multiple > > dispatch [1] based on the metadata method being called, the type of > > RelNode, and the handlers that are registered. Also it needs to cache > > results, and detect cycles. And the dispatch needs to be efficient, so > > we generate janino code to do the dispatch, and re-generate when new > > handlers or sub-classes of RelNode are added. > > > > I forget details, the two methods are basically required to allow us > > to generate code to do the dispatch. > > > > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable. > > > > I disagree. RelMetadataQuery is not an extension point. Its sole > > purpose is to to keep state (the cache and cycle-checking). > > RelMetadataProvider is the extension point. (By analogy, if you are > > un-parsing an AST, you let each AST sub-class handle unparsing itself, > > but the unparsed text goes to a simple StringBuilder. RelMetadataQuery > > is in the role of the StringBuilder. In a complex system, it is nice > > to keep some of the components simple, or at least keep them to > > prescribed roles.) > > > > Julian > > > > [1] https://en.wikipedia.org/wiki/Multiple_dispatch > > > > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen <yuzhao....@gmail.com> > wrote: > > > > > > Currently we provide answer to metadata query through > > RelMetadataProvider [1], there are some sub-classes of it: > > > > > > RelMetadataProvider > > > | > > > |- VolcanoRelMetadataProvider > > > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider > > > |- HepRelMetadataProvider > > > |- CachingRelMetadataProvider > > > |- ReflectiveRelMetadataProvider > > > |- JaninoRelMetadataProvider > > > > > > The RelMetadataProvider has two methods: #apply and #handlers, the > > #apply method seems a programming interface and there is a demo code how > we > > can use it: > > > > > > RelMetadataProvider provider; > > > LogicalFilter filter; > > > RexNode predicate; > > > Function<RelNode, Metadata> function = > > > provider.apply(LogicalFilter.class, Selectivity.class}; > > > Selectivity selectivity = function.apply(filter); > > > Double d = selectivity.selectivity(predicate); > > > > > > But let's see our RelOptCluster's member variables[2], there are > > MetadataFactory and RelMetadataQuery which all can be used to query the > > metadata, for MetadataFactory, there is a default impl named > > MetadataFactoryImpl which will invoke RelMetadataProvider#apply > internally, > > for RelMetadataQuery, it will invoke RelMetadataProvider#handlers > (finally > > composed and codeden by JaninoRelMetadataProvider). > > > > > > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery > to > > get the MQ and query the metadata. > > > > > > For extension of metadata handlers, we can set our customized > > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we > have > > no way to extend it now, because the RelOptCluster always has a singleton > > instance [4] which is only the default implementation. > > > > > > > > > My question is as follows: > > > > > > 1. Why we have 2 methods in RelMetadataProvider, and why we need the > > MetadataFactory and RelMetadataProvider#apply ? It seems that it's > function > > is already been overriden by RelMetadataQuery(The difference is that > > MetadataFactory use Reflection and RelMetadataQuery use gened bytes > code). > > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable. > > > > > > > > > [1] > > > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38 > > > [2] > > > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49 > > > [3] > > > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135 > > > [4] > > > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151 > > > > > > > > > > > > Best, > > > Danny Chan > > >