Re: [DISCUSSION] Extension of Metadata Query

2019-10-27 Thread Danny Chan
Can someone review this patch for me 
https://github.com/apache/calcite/pull/1533 ? Thanks so much for that ~

Best,
Danny Chan
在 2019年10月25日 +0800 AM9:02,Julian Hyde ,写道:
> Sure.
>
> Let’s have a test that creates such a sub-class. Then people can use it as an 
> example.
>
> Also, let’s make sure that a sub-sub-class of RelMetadataQuery also works.
>
> Perhaps split RelMetadataQuery into a base class (containing the essential 
> mechanism) and a derived class (containing the handlers and methods for each 
> kind of built-in metadata). It will be interesting to see which code needs 
> the derived class and how which code needs only the base class.
>
> Julian
>
>
> > On Oct 24, 2019, at 4:56 PM, Haisheng Yuan  wrote:
> >
> > Sounds like we have reached a consensus, I guess.
> > Can we create a JIRA to allow users to create their own sub-class of 
> > RelMetadataQuery?
> >
> > - Haisheng
> >
> > --------------
> > 发件人:Julian Hyde
> > 日 期:2019年10月22日 09:58:34
> > 收件人:dev
> > 主 题:Re: [DISCUSSION] Extension of Metadata Query
> >
> > Yes, the division of labor between MetadataFactory and RelMetadataQuery is 
> > a good one, and we should keep them intact. One provides the raw metadata, 
> > and the other provides a typed interface and transactions/caching.
> >
> > It might be allowable for a user to create their own sub-class of 
> > RelMetadataQuery that adds only handler fields and metadata methods, 
> > provided that it follows the existing pattern. We could add a new 
> > thread-local in RelMetadataQuery.instance() that is a factory to create the 
> > required sub-class of RelMetadataQuery.
> >
> > The process by which metadata factories re-generate themselves is delicate 
> > and subtle:
> >
> > public Double getMaxRowCount(RelNode rel) {
> > for (;;) {
> > try {
> > return maxRowCountHandler.getMaxRowCount(rel, this);
> > } catch (JaninoRelMetadataProvider.NoHandler e) {
> > maxRowCountHandler =
> > revise(e.relClass, BuiltInMetadata.MaxRowCount.DEF);
> > }
> > }
> > }
> >
> > (Note that “revise” generates a new class, creating bytecode via janino, 
> > and the generated code throws “NoHandler" when it needs to extend itself.) 
> > I don’t trust the typical Calcite user to be able to write a sub-class that 
> > works correctly and efficiently.
> >
> > Julian
> >
> >
> >
> > > On Oct 18, 2019, at 3:55 AM, XING JIN  wrote:
> > >
> > > +1 on Danny's comment.
> > > If we use MedataFactory to customize and use RelMetadataQuery for
> > > convenience, that will make user confused.
> > >
> > > Danny Chan  于2019年10月18日周五 下午12:33写道:
> > >
> > > > That is the point, we should supply a way to extend the RelMetadataQuery
> > > > conveniently for Calcite, because in most of the RelOptRules, user would
> > > > use the code like:
> > > >
> > > > RelOptRuleCall.getMetadataQuery
> > > >
> > > > To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
> > > > MedataFactory.
> > > >
> > > > We should at lest unity the metadata query entrance/interfaces, or it
> > > > would confuse a lot.
> > > >
> > > > Best,
> > > > Danny Chan
> > > > 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
> > > > > At least in our project (Apache Ignite) we use
> > > > AbstractRelNode.metadata().
> > > > >
> > > > > But it so because there is no way to put our metadata type into
> > > > > RelMetadataQuery without changes in Calcite.
> > > > >
> > > > > Regards,
> > > > > Igor
> > > > >
> > > > > чт, 17 окт. 2019 г., 19:16 Xiening Dai :
> > > > >
> > > > > > MetadataFactory is still useful. It provides a way to access 
> > > > > > Metadata
> > > > > > directly. If someone creates a new type of Metadata class, it can be
> > > > > > accessed through AbstractRelNode.metadata(). This way you don’t 
> > > > > > need to
> > > > > > update RelMetadataQuery interface to include the getter for this new
> > > > meta.
> > > > > > Although I don’t see this pattern being used often, but I do think 
> > > > > > it
> > > > is
> > > > > > still useful and shouldn’t be removed.
> &g

Re: [DISCUSSION] Extension of Metadata Query

2019-10-24 Thread Danny Chan
I have created a JIRA issue https://issues.apache.org/jira/browse/CALCITE-3446 
and start working ~

Best,
Danny Chan
在 2019年10月25日 +0800 AM9:02,Julian Hyde ,写道:
> Sure.
>
> Let’s have a test that creates such a sub-class. Then people can use it as an 
> example.
>
> Also, let’s make sure that a sub-sub-class of RelMetadataQuery also works.
>
> Perhaps split RelMetadataQuery into a base class (containing the essential 
> mechanism) and a derived class (containing the handlers and methods for each 
> kind of built-in metadata). It will be interesting to see which code needs 
> the derived class and how which code needs only the base class.
>
> Julian
>
>
> > On Oct 24, 2019, at 4:56 PM, Haisheng Yuan  wrote:
> >
> > Sounds like we have reached a consensus, I guess.
> > Can we create a JIRA to allow users to create their own sub-class of 
> > RelMetadataQuery?
> >
> > - Haisheng
> >
> > --------------
> > 发件人:Julian Hyde
> > 日 期:2019年10月22日 09:58:34
> > 收件人:dev
> > 主 题:Re: [DISCUSSION] Extension of Metadata Query
> >
> > Yes, the division of labor between MetadataFactory and RelMetadataQuery is 
> > a good one, and we should keep them intact. One provides the raw metadata, 
> > and the other provides a typed interface and transactions/caching.
> >
> > It might be allowable for a user to create their own sub-class of 
> > RelMetadataQuery that adds only handler fields and metadata methods, 
> > provided that it follows the existing pattern. We could add a new 
> > thread-local in RelMetadataQuery.instance() that is a factory to create the 
> > required sub-class of RelMetadataQuery.
> >
> > The process by which metadata factories re-generate themselves is delicate 
> > and subtle:
> >
> > public Double getMaxRowCount(RelNode rel) {
> > for (;;) {
> > try {
> > return maxRowCountHandler.getMaxRowCount(rel, this);
> > } catch (JaninoRelMetadataProvider.NoHandler e) {
> > maxRowCountHandler =
> > revise(e.relClass, BuiltInMetadata.MaxRowCount.DEF);
> > }
> > }
> > }
> >
> > (Note that “revise” generates a new class, creating bytecode via janino, 
> > and the generated code throws “NoHandler" when it needs to extend itself.) 
> > I don’t trust the typical Calcite user to be able to write a sub-class that 
> > works correctly and efficiently.
> >
> > Julian
> >
> >
> >
> > > On Oct 18, 2019, at 3:55 AM, XING JIN  wrote:
> > >
> > > +1 on Danny's comment.
> > > If we use MedataFactory to customize and use RelMetadataQuery for
> > > convenience, that will make user confused.
> > >
> > > Danny Chan  于2019年10月18日周五 下午12:33写道:
> > >
> > > > That is the point, we should supply a way to extend the RelMetadataQuery
> > > > conveniently for Calcite, because in most of the RelOptRules, user would
> > > > use the code like:
> > > >
> > > > RelOptRuleCall.getMetadataQuery
> > > >
> > > > To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
> > > > MedataFactory.
> > > >
> > > > We should at lest unity the metadata query entrance/interfaces, or it
> > > > would confuse a lot.
> > > >
> > > > Best,
> > > > Danny Chan
> > > > 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
> > > > > At least in our project (Apache Ignite) we use
> > > > AbstractRelNode.metadata().
> > > > >
> > > > > But it so because there is no way to put our metadata type into
> > > > > RelMetadataQuery without changes in Calcite.
> > > > >
> > > > > Regards,
> > > > > Igor
> > > > >
> > > > > чт, 17 окт. 2019 г., 19:16 Xiening Dai :
> > > > >
> > > > > > MetadataFactory is still useful. It provides a way to access 
> > > > > > Metadata
> > > > > > directly. If someone creates a new type of Metadata class, it can be
> > > > > > accessed through AbstractRelNode.metadata(). This way you don’t 
> > > > > > need to
> > > > > > update RelMetadataQuery interface to include the getter for this new
> > > > meta.
> > > > > > Although I don’t see this pattern being used often, but I do think 
> > > > > > it
> > > > is
> > > > > > still useful and shouldn’t be removed.
> &g

Re: [DISCUSSION] Extension of Metadata Query

2019-10-24 Thread Julian Hyde
Sure.

Let’s have a test that creates such a sub-class. Then people can use it as an 
example. 

Also, let’s make sure that a sub-sub-class of RelMetadataQuery also works.

Perhaps split RelMetadataQuery into a base class (containing the essential 
mechanism) and a derived class (containing the handlers and methods for each 
kind of built-in metadata). It will be interesting to see which code needs the 
derived class and how which code needs only the base class.

Julian


> On Oct 24, 2019, at 4:56 PM, Haisheng Yuan  wrote:
> 
> Sounds like we have reached a consensus, I guess.
> Can we create a JIRA to allow users to create their own sub-class of 
> RelMetadataQuery?
> 
> - Haisheng
> 
> --
> 发件人:Julian Hyde
> 日 期:2019年10月22日 09:58:34
> 收件人:dev
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> Yes, the division of labor between MetadataFactory and RelMetadataQuery is a 
> good one, and we should keep them intact. One provides the raw metadata, and 
> the other provides a typed interface and transactions/caching.
> 
> It might be allowable for a user to create their own sub-class of 
> RelMetadataQuery that adds only handler fields and metadata methods, provided 
> that it follows the existing pattern. We could add a new thread-local in 
> RelMetadataQuery.instance() that is a factory to create the required 
> sub-class of RelMetadataQuery.
> 
> The process by which metadata factories re-generate themselves is delicate 
> and subtle:
> 
> public Double getMaxRowCount(RelNode rel) {
> for (;;) {
> try {
> return maxRowCountHandler.getMaxRowCount(rel, this);
> } catch (JaninoRelMetadataProvider.NoHandler e) {
> maxRowCountHandler =
> revise(e.relClass, BuiltInMetadata.MaxRowCount.DEF);
> }
> }
> }
> 
> (Note that “revise” generates a new class, creating bytecode via janino, and 
> the generated code throws “NoHandler" when it needs to extend itself.) I 
> don’t trust the typical Calcite user to be able to write a sub-class that 
> works correctly and efficiently.
> 
> Julian
> 
> 
> 
> > On Oct 18, 2019, at 3:55 AM, XING JIN  wrote:
> > 
> > +1 on Danny's comment.
> > If we use MedataFactory to customize and use RelMetadataQuery for
> > convenience, that will make user confused.
> > 
> > Danny Chan  于2019年10月18日周五 下午12:33写道:
> > 
> >> That is the point, we should supply a way to extend the RelMetadataQuery
> >> conveniently for Calcite, because in most of the RelOptRules, user would
> >> use the code like:
> >> 
> >> RelOptRuleCall.getMetadataQuery
> >> 
> >> To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
> >> MedataFactory.
> >> 
> >> We should at lest unity the metadata query entrance/interfaces, or it
> >> would confuse a lot.
> >> 
> >> Best,
> >> Danny Chan
> >> 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
> >>> At least in our project (Apache Ignite) we use
> >> AbstractRelNode.metadata().
> >>> 
> >>> But it so because there is no way to put our metadata type into
> >>> RelMetadataQuery without changes in Calcite.
> >>> 
> >>> Regards,
> >>> Igor
> >>> 
> >>> чт, 17 окт. 2019 г., 19:16 Xiening Dai :
> >>> 
> >>>> MetadataFactory is still useful. It provides a way to access Metadata
> >>>> directly. If someone creates a new type of Metadata class, it can be
> >>>> accessed through AbstractRelNode.metadata(). This way you don’t need to
> >>>> update RelMetadataQuery interface to include the getter for this new
> >> meta.
> >>>> Although I don’t see this pattern being used often, but I do think it
> >> is
> >>>> still useful and shouldn’t be removed.
> >>>> 
> >>>> 
> >>>> For your second point, I think you would still need a way to keep
> >>>> RelMetadataQuery object during a rule call. If you choose to create new
> >>>> instance, you will have to pass it around while applying the rule. That
> >>>> actually complicates things a lot.
> >>>> 
> >>>> 
> >>>>> On Oct 17, 2019, at 12:49 AM, XING JIN 
> >> wrote:
> >>>>> 
> >>>>> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> >>>> should
> >>>>> we keep/maintain both of them ? shall we just deprecate
> >> MetadataFactory.
> >>>> I
>

Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-24 Thread Haisheng Yuan
Sounds like we have reached a consensus, I guess.
Can we create a JIRA to allow users to create their own sub-class of 
RelMetadataQuery?

- Haisheng

--
发件人:Julian Hyde
日 期:2019年10月22日 09:58:34
收件人:dev
主 题:Re: [DISCUSSION] Extension of Metadata Query

Yes, the division of labor between MetadataFactory and RelMetadataQuery is a 
good one, and we should keep them intact. One provides the raw metadata, and 
the other provides a typed interface and transactions/caching.

It might be allowable for a user to create their own sub-class of 
RelMetadataQuery that adds only handler fields and metadata methods, provided 
that it follows the existing pattern. We could add a new thread-local in 
RelMetadataQuery.instance() that is a factory to create the required sub-class 
of RelMetadataQuery.

The process by which metadata factories re-generate themselves is delicate and 
subtle:

 public Double getMaxRowCount(RelNode rel) {
 for (;;) {
 try {
 return maxRowCountHandler.getMaxRowCount(rel, this);
 } catch (JaninoRelMetadataProvider.NoHandler e) {
 maxRowCountHandler =
 revise(e.relClass, BuiltInMetadata.MaxRowCount.DEF);
 }
 }
 }

(Note that “revise” generates a new class, creating bytecode via janino, and 
the generated code throws “NoHandler" when it needs to extend itself.) I don’t 
trust the typical Calcite user to be able to write a sub-class that works 
correctly and efficiently.

Julian



> On Oct 18, 2019, at 3:55 AM, XING JIN  wrote:
> 
> +1 on Danny's comment.
> If we use MedataFactory to customize and use RelMetadataQuery for
> convenience, that will make user confused.
> 
> Danny Chan  于2019年10月18日周五 下午12:33写道:
> 
>> That is the point, we should supply a way to extend the RelMetadataQuery
>> conveniently for Calcite, because in most of the RelOptRules, user would
>> use the code like:
>> 
>> RelOptRuleCall.getMetadataQuery
>> 
>> To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
>> MedataFactory.
>> 
>> We should at lest unity the metadata query entrance/interfaces, or it
>> would confuse a lot.
>> 
>> Best,
>> Danny Chan
>> 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
>>> At least in our project (Apache Ignite) we use
>> AbstractRelNode.metadata().
>>> 
>>> But it so because there is no way to put our metadata type into
>>> RelMetadataQuery without changes in Calcite.
>>> 
>>> Regards,
>>> Igor
>>> 
>>> чт, 17 окт. 2019 г., 19:16 Xiening Dai :
>>> 
>>>> MetadataFactory is still useful. It provides a way to access Metadata
>>>> directly. If someone creates a new type of Metadata class, it can be
>>>> accessed through AbstractRelNode.metadata(). This way you don’t need to
>>>> update RelMetadataQuery interface to include the getter for this new
>> meta.
>>>> Although I don’t see this pattern being used often, but I do think it
>> is
>>>> still useful and shouldn’t be removed.
>>>> 
>>>> 
>>>> For your second point, I think you would still need a way to keep
>>>> RelMetadataQuery object during a rule call. If you choose to create new
>>>> instance, you will have to pass it around while applying the rule. That
>>>> actually complicates things a lot.
>>>> 
>>>> 
>>>>> On Oct 17, 2019, at 12:49 AM, XING JIN 
>> wrote:
>>>>> 
>>>>> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
>>>> should
>>>>> we keep/maintain both of them ? shall we just deprecate
>> MetadataFactory.
>>>> I
>>>>> see MetadataFactory is rarely used in current code. Also I
>>>>> think MetadataFactory is not good place to offering customized
>> metadata,
>>>>> which will make user confused for the difference between
>> RelMetadataQuery
>>>>> and MetadataFactory.
>>>>> 
>>>>>> Customized RelMetadataQuery with code generated meta handler for
>>>>> customized metadata, also can provide convenient way to get metadata.
>>>>> It makes sense for me.
>>>>> 
>>>>> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall,
>> shall
>>>> we
>>>>> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
>>>>> metadata but without a RelOptCall, he/she will need to create a new
>>>>> instance of RelMetadataQuery.
>>>>> 
>>>>> Xiening Dai  于2019年10月17日周四 上午2:27写道:
>>&g

Re: [DISCUSSION] Extension of Metadata Query

2019-10-22 Thread Chunwei Lei
I agree with Danny. We should believe that uses can handle it well.

Best,
Chunwei


On Tue, Oct 22, 2019 at 3:26 PM Danny Chan  wrote:

> >  I don’t trust the typical Calcite user to be able to write a sub-class
> that works correctly and efficiently.
>
> People who sub-class RelMetadataQuery are almost those who has deep
> customization of the calcite-core, for example, the MaxCompute or the
> Apache Flink.
>
> We may assume that it is not that easy to implement a perfect extension
> of RelMetadataQuery, but let’s open this chance to them, give them the
> freedom to extension, we should have the confidence.
>
> Best,
> Danny Chan
> 在 2019年10月22日 +0800 AM9:58,dev@calcite.apache.org,写道:
> >
> > I don’t trust the typical Calcite user to be able to write a sub-class
> that works correctly and efficiently.
>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-22 Thread Danny Chan
>  I don’t trust the typical Calcite user to be able to write a sub-class that 
>works correctly and efficiently.

People who sub-class RelMetadataQuery are almost those who has deep 
customization of the calcite-core, for example, the MaxCompute or the Apache 
Flink.

We may assume that it is not that easy to implement a perfect extension of 
RelMetadataQuery, but let’s open this chance to them, give them the freedom to 
extension, we should have the confidence.

Best,
Danny Chan
在 2019年10月22日 +0800 AM9:58,dev@calcite.apache.org,写道:
>
> I don’t trust the typical Calcite user to be able to write a sub-class that 
> works correctly and efficiently.


Re: [DISCUSSION] Extension of Metadata Query

2019-10-21 Thread Julian Hyde
Yes, the division of labor between MetadataFactory and RelMetadataQuery is a 
good one, and we should keep them intact. One provides the raw metadata, and 
the other provides a typed interface and transactions/caching.

It might be allowable for a user to create their own sub-class of 
RelMetadataQuery that adds only handler fields and metadata methods, provided 
that it follows the existing pattern. We could add a new thread-local in 
RelMetadataQuery.instance() that is a factory to create the required sub-class 
of RelMetadataQuery.

The process by which metadata factories re-generate themselves is delicate and 
subtle:

  public Double getMaxRowCount(RelNode rel) {
for (;;) {
  try {
return maxRowCountHandler.getMaxRowCount(rel, this);
  } catch (JaninoRelMetadataProvider.NoHandler e) {
maxRowCountHandler =
revise(e.relClass, BuiltInMetadata.MaxRowCount.DEF);
  }
}
  }

(Note that “revise” generates a new class, creating bytecode via janino, and 
the generated code throws “NoHandler" when it needs to extend itself.) I don’t 
trust the typical Calcite user to be able to write a sub-class that works 
correctly and efficiently.

Julian

 

> On Oct 18, 2019, at 3:55 AM, XING JIN  wrote:
> 
> +1 on Danny's comment.
> If we use MedataFactory to customize and use RelMetadataQuery for
> convenience, that will make user confused.
> 
> Danny Chan  于2019年10月18日周五 下午12:33写道:
> 
>> That is the point, we should supply a way to extend the RelMetadataQuery
>> conveniently for Calcite, because in most of the RelOptRules, user would
>> use the code like:
>> 
>> RelOptRuleCall.getMetadataQuery
>> 
>> To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
>> MedataFactory.
>> 
>> We should at lest unity the metadata query entrance/interfaces, or it
>> would confuse a lot.
>> 
>> Best,
>> Danny Chan
>> 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
>>> At least in our project (Apache Ignite) we use
>> AbstractRelNode.metadata().
>>> 
>>> But it so because there is no way to put our metadata type into
>>> RelMetadataQuery without changes in Calcite.
>>> 
>>> Regards,
>>> Igor
>>> 
>>> чт, 17 окт. 2019 г., 19:16 Xiening Dai :
>>> 
 MetadataFactory is still useful. It provides a way to access Metadata
 directly. If someone creates a new type of Metadata class, it can be
 accessed through AbstractRelNode.metadata(). This way you don’t need to
 update RelMetadataQuery interface to include the getter for this new
>> meta.
 Although I don’t see this pattern being used often, but I do think it
>> is
 still useful and shouldn’t be removed.
 
 
 For your second point, I think you would still need a way to keep
 RelMetadataQuery object during a rule call. If you choose to create new
 instance, you will have to pass it around while applying the rule. That
 actually complicates things a lot.
 
 
> On Oct 17, 2019, at 12:49 AM, XING JIN 
>> wrote:
> 
> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
 should
> we keep/maintain both of them ? shall we just deprecate
>> MetadataFactory.
 I
> see MetadataFactory is rarely used in current code. Also I
> think MetadataFactory is not good place to offering customized
>> metadata,
> which will make user confused for the difference between
>> RelMetadataQuery
> and MetadataFactory.
> 
>> Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
> 
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall,
>> shall
 we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
> 
> Xiening Dai  于2019年10月17日周四 上午2:27写道:
> 
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is
>> used.
 And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>> 
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the
>> end
 of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery
>> is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is
>> safe.
>> 
>> 
>>> On Oct 16, 2019, at 1:53 AM, Danny Chan 
>> wrote:
>>> 
>>> This is the reason I was struggling for the discussion.
>>> 
>>> Best,
>>> Danny Chan
>>> 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
 
 RelMetadataQuery
>> 
>> 
 
 
>> 



Re: [DISCUSSION] Extension of Metadata Query

2019-10-18 Thread XING JIN
+1 on Danny's comment.
If we use MedataFactory to customize and use RelMetadataQuery for
convenience, that will make user confused.

Danny Chan  于2019年10月18日周五 下午12:33写道:

> That is the point, we should supply a way to extend the RelMetadataQuery
> conveniently for Calcite, because in most of the RelOptRules, user would
> use the code like:
>
> RelOptRuleCall.getMetadataQuery
>
> To get a RMQ instead of using AbstractRelNode.metadata() to fetch a
> MedataFactory.
>
> We should at lest unity the metadata query entrance/interfaces, or it
> would confuse a lot.
>
> Best,
> Danny Chan
> 在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
> > At least in our project (Apache Ignite) we use
> AbstractRelNode.metadata().
> >
> > But it so because there is no way to put our metadata type into
> > RelMetadataQuery without changes in Calcite.
> >
> > Regards,
> > Igor
> >
> > чт, 17 окт. 2019 г., 19:16 Xiening Dai :
> >
> > > MetadataFactory is still useful. It provides a way to access Metadata
> > > directly. If someone creates a new type of Metadata class, it can be
> > > accessed through AbstractRelNode.metadata(). This way you don’t need to
> > > update RelMetadataQuery interface to include the getter for this new
> meta.
> > > Although I don’t see this pattern being used often, but I do think it
> is
> > > still useful and shouldn’t be removed.
> > >
> > >
> > > For your second point, I think you would still need a way to keep
> > > RelMetadataQuery object during a rule call. If you choose to create new
> > > instance, you will have to pass it around while applying the rule. That
> > > actually complicates things a lot.
> > >
> > >
> > > > On Oct 17, 2019, at 12:49 AM, XING JIN 
> wrote:
> > > >
> > > > 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> > > should
> > > > we keep/maintain both of them ? shall we just deprecate
> MetadataFactory.
> > > I
> > > > see MetadataFactory is rarely used in current code. Also I
> > > > think MetadataFactory is not good place to offering customized
> metadata,
> > > > which will make user confused for the difference between
> RelMetadataQuery
> > > > and MetadataFactory.
> > > >
> > > > > Customized RelMetadataQuery with code generated meta handler for
> > > > customized metadata, also can provide convenient way to get metadata.
> > > > It makes sense for me.
> > > >
> > > > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall,
> shall
> > > we
> > > > deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> > > > metadata but without a RelOptCall, he/she will need to create a new
> > > > instance of RelMetadataQuery.
> > > >
> > > > Xiening Dai  于2019年10月17日周四 上午2:27写道:
> > > >
> > > > > I have seen both patterns in current code base. In most places, for
> > > > > example SubQueryRemoveRule, AggregateUnionTrasposeRule
> > > > > SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is
> used.
> > > And
> > > > > there are a few other places where new RelMetadataQuery instance is
> > > > > created, which Haisheng attempts to fix.
> > > > >
> > > > > Currently RelOptCluster.invalidateMetadataQuery() is called at the
> end
> > > of
> > > > > RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery
> is
> > > > > guaranteed to be within a RelOptCall. I think Haisheng’s fix is
> safe.
> > > > >
> > > > >
> > > > > > On Oct 16, 2019, at 1:53 AM, Danny Chan 
> wrote:
> > > > > >
> > > > > > This is the reason I was struggling for the discussion.
> > > > > >
> > > > > > Best,
> > > > > > Danny Chan
> > > > > > 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
> > > > > > >
> > > > > > > RelMetadataQuery
> > > > >
> > > > >
> > >
> > >
>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Danny Chan
That is the point, we should supply a way to extend the RelMetadataQuery 
conveniently for Calcite, because in most of the RelOptRules, user would use 
the code like:

RelOptRuleCall.getMetadataQuery

To get a RMQ instead of using AbstractRelNode.metadata() to fetch a 
MedataFactory.

We should at lest unity the metadata query entrance/interfaces, or it would 
confuse a lot.

Best,
Danny Chan
在 2019年10月18日 +0800 AM12:23,Seliverstov Igor ,写道:
> At least in our project (Apache Ignite) we use AbstractRelNode.metadata().
>
> But it so because there is no way to put our metadata type into
> RelMetadataQuery without changes in Calcite.
>
> Regards,
> Igor
>
> чт, 17 окт. 2019 г., 19:16 Xiening Dai :
>
> > MetadataFactory is still useful. It provides a way to access Metadata
> > directly. If someone creates a new type of Metadata class, it can be
> > accessed through AbstractRelNode.metadata(). This way you don’t need to
> > update RelMetadataQuery interface to include the getter for this new meta.
> > Although I don’t see this pattern being used often, but I do think it is
> > still useful and shouldn’t be removed.
> >
> >
> > For your second point, I think you would still need a way to keep
> > RelMetadataQuery object during a rule call. If you choose to create new
> > instance, you will have to pass it around while applying the rule. That
> > actually complicates things a lot.
> >
> >
> > > On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> > >
> > > 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> > should
> > > we keep/maintain both of them ? shall we just deprecate MetadataFactory.
> > I
> > > see MetadataFactory is rarely used in current code. Also I
> > > think MetadataFactory is not good place to offering customized metadata,
> > > which will make user confused for the difference between RelMetadataQuery
> > > and MetadataFactory.
> > >
> > > > Customized RelMetadataQuery with code generated meta handler for
> > > customized metadata, also can provide convenient way to get metadata.
> > > It makes sense for me.
> > >
> > > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall
> > we
> > > deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> > > metadata but without a RelOptCall, he/she will need to create a new
> > > instance of RelMetadataQuery.
> > >
> > > Xiening Dai  于2019年10月17日周四 上午2:27写道:
> > >
> > > > I have seen both patterns in current code base. In most places, for
> > > > example SubQueryRemoveRule, AggregateUnionTrasposeRule
> > > > SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used.
> > And
> > > > there are a few other places where new RelMetadataQuery instance is
> > > > created, which Haisheng attempts to fix.
> > > >
> > > > Currently RelOptCluster.invalidateMetadataQuery() is called at the end
> > of
> > > > RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> > > > guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
> > > >
> > > >
> > > > > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> > > > >
> > > > > This is the reason I was struggling for the discussion.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
> > > > > >
> > > > > > RelMetadataQuery
> > > >
> > > >
> >
> >


Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Haisheng Yuan
MetadataFactory's implementation provides a unified, reflective approach to 
retrieve metadata, no matter the metadata is builtin or extended type. 
RelMetadataQuery provides convenient methods (in codegen) to get builtin 
metadata, but not for costomized metadata. Unless we recommend sub-classing 
RelMetadataQuery as the way to add costomized metadata, deprecating 
MetadataFactory is not a wise choice.

- Haisheng

--
发件人:XING JIN
日 期:2019年10月17日 15:55:31
收件人:
主 题:Re: [DISCUSSION] Extension of Metadata Query

BTW, I think one JIRA number discussed in the thread would be
https://issues.apache.org/jira/browse/CALCITE-2855 not CALCITE-2885

Best,
Jin

XING JIN  于2019年10月17日周四 下午3:49写道:

> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> should we keep/maintain both of them ? shall we just
> deprecate MetadataFactory. I see MetadataFactory is rarely used in current
> code. Also I think MetadataFactory is not good place to offering customized
> metadata, which will make user confused for the difference
> between RelMetadataQuery and MetadataFactory.
>
> > Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
>
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
>
> Xiening Dai  于2019年10月17日周四 上午2:27写道:
>
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>>
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>>
>>
>> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>> >
>> > This is the reason I was struggling for the discussion.
>> >
>> > Best,
>> > Danny Chan
>> > 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
>> >>
>> >> RelMetadataQuery
>>
>>



Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Seliverstov Igor
At least in our project (Apache Ignite) we use AbstractRelNode.metadata().

But it so because there is no way to put our metadata type into
RelMetadataQuery without changes in Calcite.

Regards,
Igor

чт, 17 окт. 2019 г., 19:16 Xiening Dai :

> MetadataFactory is still useful. It provides a way to access Metadata
> directly. If someone creates a new type of Metadata class, it can be
> accessed through AbstractRelNode.metadata(). This way you don’t need to
> update RelMetadataQuery interface to include the getter for this new meta.
> Although I don’t see this pattern being used often, but I do think it is
> still useful and shouldn’t be removed.
>
>
> For your second point, I think you would still need a way to keep
> RelMetadataQuery object during a rule call. If you choose to create new
> instance, you will have to pass it around while applying the rule. That
> actually complicates things a lot.
>
>
> > On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> >
> > 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> should
> > we keep/maintain both of them ? shall we just deprecate MetadataFactory.
> I
> > see MetadataFactory is rarely used in current code. Also I
> > think MetadataFactory is not good place to offering customized metadata,
> > which will make user confused for the difference between RelMetadataQuery
> > and MetadataFactory.
> >
> >> Customized RelMetadataQuery with code generated meta handler for
> > customized metadata, also can provide convenient way to get metadata.
> > It makes sense for me.
> >
> > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall
> we
> > deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> > metadata but without a RelOptCall, he/she will need to create a new
> > instance of RelMetadataQuery.
> >
> > Xiening Dai  于2019年10月17日周四 上午2:27写道:
> >
> >> I have seen both patterns in current code base. In most places, for
> >> example SubQueryRemoveRule, AggregateUnionTrasposeRule
> >> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used.
> And
> >> there are a few other places where new RelMetadataQuery instance is
> >> created, which Haisheng attempts to fix.
> >>
> >> Currently RelOptCluster.invalidateMetadataQuery() is called at the end
> of
> >> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> >> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
> >>
> >>
> >>> On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> >>>
> >>> This is the reason I was struggling for the discussion.
> >>>
> >>> Best,
> >>> Danny Chan
> >>> 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
> 
>  RelMetadataQuery
> >>
> >>
>
>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Xiening Dai
MetadataFactory is still useful. It provides a way to access Metadata directly. 
If someone creates a new type of Metadata class, it can be accessed through 
AbstractRelNode.metadata(). This way you don’t need to update RelMetadataQuery 
interface to include the getter for this new meta. Although I don’t see this 
pattern being used often, but I do think it is still useful and shouldn’t be 
removed.


For your second point, I think you would still need a way to keep 
RelMetadataQuery object during a rule call. If you choose to create new 
instance, you will have to pass it around while applying the rule. That 
actually complicates things a lot. 


> On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> 
> 1. RelMetadataQuery covers the functionality of MetadataFactory, why should
> we keep/maintain both of them ? shall we just deprecate MetadataFactory. I
> see MetadataFactory is rarely used in current code. Also I
> think MetadataFactory is not good place to offering customized metadata,
> which will make user confused for the difference between RelMetadataQuery
> and MetadataFactory.
> 
>> Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
> 
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
> 
> Xiening Dai  于2019年10月17日周四 上午2:27写道:
> 
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>> 
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>> 
>> 
>>> On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>>> 
>>> This is the reason I was struggling for the discussion.
>>> 
>>> Best,
>>> Danny Chan
>>> 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
 
 RelMetadataQuery
>> 
>> 



Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread XING JIN
BTW, I think one JIRA number discussed in the thread would be
https://issues.apache.org/jira/browse/CALCITE-2855 not CALCITE-2885

Best,
Jin

XING JIN  于2019年10月17日周四 下午3:49写道:

> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> should we keep/maintain both of them ? shall we just
> deprecate MetadataFactory. I see MetadataFactory is rarely used in current
> code. Also I think MetadataFactory is not good place to offering customized
> metadata, which will make user confused for the difference
> between RelMetadataQuery and MetadataFactory.
>
> > Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
>
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
>
> Xiening Dai  于2019年10月17日周四 上午2:27写道:
>
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>>
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>>
>>
>> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>> >
>> > This is the reason I was struggling for the discussion.
>> >
>> > Best,
>> > Danny Chan
>> > 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
>> >>
>> >> RelMetadataQuery
>>
>>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread XING JIN
1. RelMetadataQuery covers the functionality of MetadataFactory, why should
we keep/maintain both of them ? shall we just deprecate MetadataFactory. I
see MetadataFactory is rarely used in current code. Also I
think MetadataFactory is not good place to offering customized metadata,
which will make user confused for the difference between RelMetadataQuery
and MetadataFactory.

> Customized RelMetadataQuery with code generated meta handler for
customized metadata, also can provide convenient way to get metadata.
It makes sense for me.

2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
metadata but without a RelOptCall, he/she will need to create a new
instance of RelMetadataQuery.

Xiening Dai  于2019年10月17日周四 上午2:27写道:

> I have seen both patterns in current code base. In most places, for
> example SubQueryRemoveRule, AggregateUnionTrasposeRule
> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
> there are a few other places where new RelMetadataQuery instance is
> created, which Haisheng attempts to fix.
>
> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>
>
> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> >
> > This is the reason I was struggling for the discussion.
> >
> > Best,
> > Danny Chan
> > 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
> >>
> >> RelMetadataQuery
>
>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-16 Thread Xiening Dai
I have seen both patterns in current code base. In most places, for example 
SubQueryRemoveRule, AggregateUnionTrasposeRule SortJoinTransposeRule, etc., 
RelOptCluster.getMetadataQuery() is used. And there are a few other places 
where new RelMetadataQuery instance is created, which Haisheng attempts to fix.

Currently RelOptCluster.invalidateMetadataQuery() is called at the end of 
RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is guaranteed 
to be within a RelOptCall. I think Haisheng’s fix is safe.


> On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> 
> This is the reason I was struggling for the discussion.
> 
> Best,
> Danny Chan
> 在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
>> 
>> RelMetadataQuery



Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-16 Thread Danny Chan
This is the reason I was struggling for the discussion.

Best,
Danny Chan
在 2019年10月16日 +0800 AM11:23,dev@calcite.apache.org,写道:
>
> RelMetadataQuery


Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-15 Thread Haisheng Yuan
That's a good question.
Yes. And the method getMetadataQuery from RelOptRuleCall also retrieves 
MetadataQuery from RelOptCluster.
Any other better places?

- Haisheng

--
发件人:Julian Hyde
日 期:2019年10月16日 11:02:33
收件人:dev
主 题:Re: [DISCUSSION] Extension of Metadata Query

Yes, but why RelOptCluster? Besides the fact that there happened to be a method 
there.


> On Oct 15, 2019, at 7:30 PM, Haisheng Yuan  wrote:
> 
> > Is there a good rationale for attaching one to RelOptCluster
> Customized RelMetadataQuery with code generated meta handler for customized 
> metadata, also can provide convenient way to get metadata.
> 
> - Haisheng
> 
> --
> 发件人:Julian Hyde
> 日 期:2019年10月16日 06:52:39
> 收件人:
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> Having reviewed https://issues.apache.org/jira/browse/CALCITE-3403 
> <https://issues.apache.org/jira/browse/CALCITE-3403> and 
> https://issues.apache.org/jira/browse/CALCITE-2018 
> <https://issues.apache.org/jira/browse/CALCITE-2018> today I am worried about 
> the direction RelMetadataQuery is taking. Is there a good rationale for 
> attaching one to RelOptCluster? (Besides “it seems to work”?)
> 
> The natural lifespan of a RelMetadataQuery is a RelOptCall (because the 
> structure of the tree is constant for the duration of a call).
> 
> As for sub-classing RelMetadataQuery, I really don’t know. It’s probably 
> harmless, possibly disastrous, and it takes a lot of effort to determine 
> which. We have to be really cautious with RelMetadataQuery because it is one 
> of the few components in Calcite that mutate.
> 
> Julian
> 
> 
> 
> 
> > On Oct 15, 2019, at 1:42 PM, Haisheng Yuan  wrote:
> > 
> > I would like to bring this discussion up again.
> > 
> > We created our own metadata provider, and set it to RelMetadataQuery by 
> > calling
> > RelMetadataQuery.THREAD_PROVIDERS.set()
> > If we just use all the types of metadata provided by Calcite, it should be 
> > fine. But given that we have created new types of metadata, and we want to 
> > get them during rule transformation, we want to have the rule call return 
> > our own RelMetadataQuery. In order to achieve that, we sub-classed 
> > RelMetadataQuery, and sub-classed RelOptCluster to make metadata query 
> > overridable, and return the instance of subclass of RelMetadataQuery when 
> > the mq is null. Indeed, we can achieve that goal by just call 
> > rel.metadata(), but we want a efficient and convenient way to do that. That 
> > is why we want RelMetadataQuery in RelOptCluster customizable. 
> > Is it a good practice? If not, any other better approach to suggest?
> > Thanks.
> > Haisheng
> > 
> > --
> > 发件人:Stamatis Zampetakis
> > 日 期:2019年06月15日 15:47:58
> > 收件人:
> > 主 题:Re: [DISCUSSION] Extension of Metadata Query
> > 
> > I had a look on CALCITE-1147 (which added support for custom providers and
> > metadata) and based on the added tests [1], it seems that the main
> > alternative to RelMetadataQuery is indeed through the MetadataFactory. One
> > other alternative I can think of, is to pass a custom RelMetadataQuery (or
> > a query factory) through the context of the planner [2].
> > 
> > Given that we have already methods to obtain a RelMetadataQuery in
> > RelOptRuleCall and RelOptCluster it may make sense to allow the type of
> > query to be pluggable. However, I have not really used this mechanism in
> > the past so I cannot be sure what's the best way to proceed.
> > 
> > Best,
> > Stamatis
> > 
> > [1]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
> > [2]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256
> > 
> > On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:
> > 
> >> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> >> which I believe is the most common routine to query metadata during
> >> planning, I’m a little confused for your saying
> >> 
> >>> The methods in RelMetadataQuery are for convenience only.
> >> 
> >> If these methods in RelMetadataQuery are only for convenience, what is the
> >> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> >> ? (Users w

Re: [DISCUSSION] Extension of Metadata Query

2019-10-15 Thread Julian Hyde
Yes, but why RelOptCluster? Besides the fact that there happened to be a method 
there.


> On Oct 15, 2019, at 7:30 PM, Haisheng Yuan  wrote:
> 
> > Is there a good rationale for attaching one to RelOptCluster
> Customized RelMetadataQuery with code generated meta handler for customized 
> metadata, also can provide convenient way to get metadata.
> 
> - Haisheng
> 
> --
> 发件人:Julian Hyde
> 日 期:2019年10月16日 06:52:39
> 收件人:
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> Having reviewed https://issues.apache.org/jira/browse/CALCITE-3403 
> <https://issues.apache.org/jira/browse/CALCITE-3403> and 
> https://issues.apache.org/jira/browse/CALCITE-2018 
> <https://issues.apache.org/jira/browse/CALCITE-2018> today I am worried about 
> the direction RelMetadataQuery is taking. Is there a good rationale for 
> attaching one to RelOptCluster? (Besides “it seems to work”?)
> 
> The natural lifespan of a RelMetadataQuery is a RelOptCall (because the 
> structure of the tree is constant for the duration of a call).
> 
> As for sub-classing RelMetadataQuery, I really don’t know. It’s probably 
> harmless, possibly disastrous, and it takes a lot of effort to determine 
> which. We have to be really cautious with RelMetadataQuery because it is one 
> of the few components in Calcite that mutate.
> 
> Julian
> 
> 
> 
> 
> > On Oct 15, 2019, at 1:42 PM, Haisheng Yuan  wrote:
> > 
> > I would like to bring this discussion up again.
> > 
> > We created our own metadata provider, and set it to RelMetadataQuery by 
> > calling
> > RelMetadataQuery.THREAD_PROVIDERS.set()
> > If we just use all the types of metadata provided by Calcite, it should be 
> > fine. But given that we have created new types of metadata, and we want to 
> > get them during rule transformation, we want to have the rule call return 
> > our own RelMetadataQuery. In order to achieve that, we sub-classed 
> > RelMetadataQuery, and sub-classed RelOptCluster to make metadata query 
> > overridable, and return the instance of subclass of RelMetadataQuery when 
> > the mq is null. Indeed, we can achieve that goal by just call 
> > rel.metadata(), but we want a efficient and convenient way to do that. That 
> > is why we want RelMetadataQuery in RelOptCluster customizable. 
> > Is it a good practice? If not, any other better approach to suggest?
> > Thanks.
> > Haisheng
> > 
> > --
> > 发件人:Stamatis Zampetakis
> > 日 期:2019年06月15日 15:47:58
> > 收件人:
> > 主 题:Re: [DISCUSSION] Extension of Metadata Query
> > 
> > I had a look on CALCITE-1147 (which added support for custom providers and
> > metadata) and based on the added tests [1], it seems that the main
> > alternative to RelMetadataQuery is indeed through the MetadataFactory. One
> > other alternative I can think of, is to pass a custom RelMetadataQuery (or
> > a query factory) through the context of the planner [2].
> > 
> > Given that we have already methods to obtain a RelMetadataQuery in
> > RelOptRuleCall and RelOptCluster it may make sense to allow the type of
> > query to be pluggable. However, I have not really used this mechanism in
> > the past so I cannot be sure what's the best way to proceed.
> > 
> > Best,
> > Stamatis
> > 
> > [1]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
> > [2]
> > https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256
> > 
> > On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:
> > 
> >> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> >> which I believe is the most common routine to query metadata during
> >> planning, I’m a little confused for your saying
> >> 
> >>> The methods in RelMetadataQuery are for convenience only.
> >> 
> >> If these methods in RelMetadataQuery are only for convenience, what is the
> >> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> >> ? (Users would never expect to use this because it’s based on Java
> >> Reflection and has performance problem)
> >> 
> >> [1]
> >> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
> >> 
> >> 

Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-15 Thread Haisheng Yuan
> Is there a good rationale for attaching one to RelOptCluster
Customized RelMetadataQuery with code generated meta handler for customized 
metadata, also can provide convenient way to get metadata.

- Haisheng

--
发件人:Julian Hyde
日 期:2019年10月16日 06:52:39
收件人:
主 题:Re: [DISCUSSION] Extension of Metadata Query

Having reviewed https://issues.apache.org/jira/browse/CALCITE-3403 
<https://issues.apache.org/jira/browse/CALCITE-3403> and 
https://issues.apache.org/jira/browse/CALCITE-2018 
<https://issues.apache.org/jira/browse/CALCITE-2018> today I am worried about 
the direction RelMetadataQuery is taking. Is there a good rationale for 
attaching one to RelOptCluster? (Besides “it seems to work”?)

The natural lifespan of a RelMetadataQuery is a RelOptCall (because the 
structure of the tree is constant for the duration of a call).

As for sub-classing RelMetadataQuery, I really don’t know. It’s probably 
harmless, possibly disastrous, and it takes a lot of effort to determine which. 
We have to be really cautious with RelMetadataQuery because it is one of the 
few components in Calcite that mutate.

Julian




> On Oct 15, 2019, at 1:42 PM, Haisheng Yuan  wrote:
> 
> I would like to bring this discussion up again.
> 
> We created our own metadata provider, and set it to RelMetadataQuery by 
> calling
> RelMetadataQuery.THREAD_PROVIDERS.set()
> If we just use all the types of metadata provided by Calcite, it should be 
> fine. But given that we have created new types of metadata, and we want to 
> get them during rule transformation, we want to have the rule call return our 
> own RelMetadataQuery. In order to achieve that, we sub-classed 
> RelMetadataQuery, and sub-classed RelOptCluster to make metadata query 
> overridable, and return the instance of subclass of RelMetadataQuery when the 
> mq is null. Indeed, we can achieve that goal by just call rel.metadata(), but 
> we want a efficient and convenient way to do that. That is why we want 
> RelMetadataQuery in RelOptCluster customizable. 
> Is it a good practice? If not, any other better approach to suggest?
> Thanks.
> Haisheng
> 
> ----------
> 发件人:Stamatis Zampetakis
> 日 期:2019年06月15日 15:47:58
> 收件人:
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> I had a look on CALCITE-1147 (which added support for custom providers and
> metadata) and based on the added tests [1], it seems that the main
> alternative to RelMetadataQuery is indeed through the MetadataFactory. One
> other alternative I can think of, is to pass a custom RelMetadataQuery (or
> a query factory) through the context of the planner [2].
> 
> Given that we have already methods to obtain a RelMetadataQuery in
> RelOptRuleCall and RelOptCluster it may make sense to allow the type of
> query to be pluggable. However, I have not really used this mechanism in
> the past so I cannot be sure what's the best way to proceed.
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
> [2]
> https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256
> 
> On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:
> 
>> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
>> which I believe is the most common routine to query metadata during
>> planning, I’m a little confused for your saying
>> 
>>> The methods in RelMetadataQuery are for convenience only.
>> 
>> If these methods in RelMetadataQuery are only for convenience, what is the
>> other way I can query a metadata ? The MedataFactory from the RelOptCluster
>> ? (Users would never expect to use this because it’s based on Java
>> Reflection and has performance problem)
>> 
>> [1]
>> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
>> 
>> Best,
>> Danny Chan
>> 在 2019年6月12日 +0800 AM6:27,Julian Hyde ,写道:
>>> The goal is that it should be possible to add a new kind of metadata.
>> The methods in RelMetadataQuery are for convenience only. So you should be
>> able to use your new kind of metadata, and existing ones, without modifying
>> RelMetadataQuery.
>>> 
>>> If that is not possible, it’s a bug, and you should log a JIRA case.
>>> 
>>> Given how the methods re-generate handlers (by catching exceptions and
>> modifying mutable private members) I can see that new kinds of

Re: [DISCUSSION] Extension of Metadata Query

2019-10-15 Thread Julian Hyde
Having reviewed https://issues.apache.org/jira/browse/CALCITE-3403 
<https://issues.apache.org/jira/browse/CALCITE-3403> and 
https://issues.apache.org/jira/browse/CALCITE-2018 
<https://issues.apache.org/jira/browse/CALCITE-2018> today I am worried about 
the direction RelMetadataQuery is taking. Is there a good rationale for 
attaching one to RelOptCluster? (Besides “it seems to work”?)

The natural lifespan of a RelMetadataQuery is a RelOptCall (because the 
structure of the tree is constant for the duration of a call).

As for sub-classing RelMetadataQuery, I really don’t know. It’s probably 
harmless, possibly disastrous, and it takes a lot of effort to determine which. 
We have to be really cautious with RelMetadataQuery because it is one of the 
few components in Calcite that mutate.

Julian




> On Oct 15, 2019, at 1:42 PM, Haisheng Yuan  wrote:
> 
> I would like to bring this discussion up again.
> 
> We created our own metadata provider, and set it to RelMetadataQuery by 
> calling
> RelMetadataQuery.THREAD_PROVIDERS.set()
> If we just use all the types of metadata provided by Calcite, it should be 
> fine. But given that we have created new types of metadata, and we want to 
> get them during rule transformation, we want to have the rule call return our 
> own RelMetadataQuery. In order to achieve that, we sub-classed 
> RelMetadataQuery, and sub-classed RelOptCluster to make metadata query 
> overridable, and return the instance of subclass of RelMetadataQuery when the 
> mq is null.  Indeed, we can achieve that goal by just call rel.metadata(), 
> but we want a efficient and convenient way to do that. That is why we want 
> RelMetadataQuery in RelOptCluster customizable. 
> Is it a good practice? If not, any other better approach to suggest?
> Thanks.
> Haisheng
> 
> ----------
> 发件人:Stamatis Zampetakis
> 日 期:2019年06月15日 15:47:58
> 收件人:
> 主 题:Re: [DISCUSSION] Extension of Metadata Query
> 
> I had a look on CALCITE-1147 (which added support for custom providers and
> metadata) and based on the added tests [1], it seems that the main
> alternative to RelMetadataQuery is indeed through the MetadataFactory. One
> other alternative I can think of, is to pass a custom RelMetadataQuery (or
> a query factory) through the context of the planner [2].
> 
> Given that we have already methods to obtain a RelMetadataQuery in
> RelOptRuleCall and RelOptCluster it may make sense to allow the type of
> query to be pluggable. However, I have not really used this mechanism in
> the past so I cannot be sure what's the best way to proceed.
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
> [2]
> https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256
> 
> On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:
> 
>> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
>> which I believe is the most common routine to query metadata during
>> planning, I’m a little confused for your saying
>> 
>>> The methods in RelMetadataQuery are for convenience only.
>> 
>> If these methods in RelMetadataQuery are only for convenience, what is the
>> other way I can query a metadata ? The MedataFactory from the RelOptCluster
>> ? (Users would never expect to use this because it’s based on Java
>> Reflection and has performance problem)
>> 
>> [1]
>> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
>> 
>> Best,
>> Danny Chan
>> 在 2019年6月12日 +0800 AM6:27,Julian Hyde ,写道:
>>> The goal is that it should be possible to add a new kind of metadata.
>> The methods in RelMetadataQuery are for convenience only. So you should be
>> able to use your new kind of metadata, and existing ones, without modifying
>> RelMetadataQuery.
>>> 
>>> If that is not possible, it’s a bug, and you should log a JIRA case.
>>> 
>>> Given how the methods re-generate handlers (by catching exceptions and
>> modifying mutable private members) I can see that new kinds of metadata
>> might be difficult to add.
>>> 
>>> Julian
>>> 
>>> 
>>>> On Jun 9, 2019, at 7:54 PM, Yuzhao Chen  wrote:
>>>> 
>>>> Thanks, Stamatis
>>>> 
>>>> You are right, this discussion came up due to CALCITE-2885, it is not
>> about the performance proble

Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-15 Thread Haisheng Yuan
I would like to bring this discussion up again.

We created our own metadata provider, and set it to RelMetadataQuery by calling
RelMetadataQuery.THREAD_PROVIDERS.set()
If we just use all the types of metadata provided by Calcite, it should be 
fine. But given that we have created new types of metadata, and we want to get 
them during rule transformation, we want to have the rule call return our own 
RelMetadataQuery. In order to achieve that, we sub-classed RelMetadataQuery, 
and sub-classed RelOptCluster to make metadata query overridable, and return 
the instance of subclass of RelMetadataQuery when the mq is null.  Indeed, we 
can achieve that goal by just call rel.metadata(), but we want a efficient and 
convenient way to do that. That is why we want RelMetadataQuery in 
RelOptCluster customizable. 
Is it a good practice? If not, any other better approach to suggest?
Thanks.
Haisheng

--
发件人:Stamatis Zampetakis
日 期:2019年06月15日 15:47:58
收件人:
主 题:Re: [DISCUSSION] Extension of Metadata Query

I had a look on CALCITE-1147 (which added support for custom providers and
metadata) and based on the added tests [1], it seems that the main
alternative to RelMetadataQuery is indeed through the MetadataFactory. One
other alternative I can think of, is to pass a custom RelMetadataQuery (or
a query factory) through the context of the planner [2].

Given that we have already methods to obtain a RelMetadataQuery in
RelOptRuleCall and RelOptCluster it may make sense to allow the type of
query to be pluggable. However, I have not really used this mechanism in
the past so I cannot be sure what's the best way to proceed.

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
[2]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256

On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:

> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> which I believe is the most common routine to query metadata during
> planning, I’m a little confused for your saying
>
> >The methods in RelMetadataQuery are for convenience only.
>
> If these methods in RelMetadataQuery are only for convenience, what is the
> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> ? (Users would never expect to use this because it’s based on Java
> Reflection and has performance problem)
>
> [1]
> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
>
> Best,
> Danny Chan
> 在 2019年6月12日 +0800 AM6:27,Julian Hyde ,写道:
> > The goal is that it should be possible to add a new kind of metadata.
> The methods in RelMetadataQuery are for convenience only. So you should be
> able to use your new kind of metadata, and existing ones, without modifying
> RelMetadataQuery.
> >
> > If that is not possible, it’s a bug, and you should log a JIRA case.
> >
> > Given how the methods re-generate handlers (by catching exceptions and
> modifying mutable private members) I can see that new kinds of metadata
> might be difficult to add.
> >
> > Julian
> >
> >
> > > On Jun 9, 2019, at 7:54 PM, Yuzhao Chen  wrote:
> > >
> > > Thanks, Stamatis
> > >
> > > You are right, this discussion came up due to CALCITE-2885, it is not
> about the performance problem, it is the extension of RelMetadataQuery,
> because we add all kinds of top interfaces in RelMetadataQuery, e.g. the
> row count query [1].
> > >
> > > When we add a new RelMetadataProvider, a corresponding
> interface/method should be added into RelMetadataQuery, but for current
> RelOpeCluster impl, we could not do that(with a always default instance)
> > >
> > > As for the methods in RelMetadataProvider, i only saw the usage of
> #handlers in RelMetadataQuery, and saw the reference you pase, it seems not
> that relevant with this topic. What do you think ?
> > >
> > > [1]
> https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> > >
> > > Best,
> > > Danny Chan
> > > 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis ,写道:
> > > > Thanks for bringing this up Danny.
> > > >
> > > > I guess the discussion came up due to CALCITE-2885 [1]. Looking back
> into
> > > > it, I am not sure that the intention there is to make the
> RelMetadataQuery
> > > > pluggable. We co

Re: [DISCUSSION] Extension of Metadata Query

2019-06-15 Thread Stamatis Zampetakis
I had a look on CALCITE-1147 (which added support for custom providers and
metadata) and based on the added tests [1], it seems that the main
alternative to RelMetadataQuery is indeed through the MetadataFactory. One
other alternative I can think of, is to pass a custom RelMetadataQuery (or
a query factory) through the context of the planner [2].

Given that we have already methods to obtain a RelMetadataQuery in
RelOptRuleCall and RelOptCluster it may make sense to allow the type of
query to be pluggable. However, I have not really used this mechanism in
the past so I cannot be sure what's the best way to proceed.

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L1001
[2]
https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L256

On Wed, Jun 12, 2019 at 6:07 AM Yuzhao Chen  wrote:

> We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1],
> which I believe is the most common routine to query metadata during
> planning, I’m a little confused for your saying
>
> >The methods in RelMetadataQuery are for convenience only.
>
> If these methods in RelMetadataQuery are only for convenience, what is the
> other way I can query a metadata ? The MedataFactory from the RelOptCluster
> ? (Users would never expect to use this because it’s based on Java
> Reflection and has performance problem)
>
> [1]
> https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200
>
> Best,
> Danny Chan
> 在 2019年6月12日 +0800 AM6:27,Julian Hyde ,写道:
> > The goal is that it should be possible to add a new kind of metadata.
> The methods in RelMetadataQuery are for convenience only. So you should be
> able to use your new kind of metadata, and existing ones, without modifying
> RelMetadataQuery.
> >
> > If that is not possible, it’s a bug, and you should log a JIRA case.
> >
> > Given how the methods re-generate handlers (by catching exceptions and
> modifying mutable private members) I can see that new kinds of metadata
> might be difficult to add.
> >
> > Julian
> >
> >
> > > On Jun 9, 2019, at 7:54 PM, Yuzhao Chen  wrote:
> > >
> > > Thanks, Stamatis
> > >
> > > You are right, this discussion came up due to CALCITE-2885, it is not
> about the performance problem, it is the extension of RelMetadataQuery,
> because we add all kinds of top interfaces in RelMetadataQuery, e.g. the
> row count query [1].
> > >
> > > When we add a new RelMetadataProvider, a corresponding
> interface/method should be added into RelMetadataQuery, but for current
> RelOpeCluster impl, we could not do that(with a always default instance)
> > >
> > > As for the methods in RelMetadataProvider, i only saw the usage of
> #handlers in RelMetadataQuery, and saw the reference you pase, it seems not
> that relevant with this topic. What do you think ?
> > >
> > > [1]
> https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> > >
> > > Best,
> > > Danny Chan
> > > 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis ,写道:
> > > > Thanks for bringing this up Danny.
> > > >
> > > > I guess the discussion came up due to CALCITE-2885 [1]. Looking back
> into
> > > > it, I am not sure that the intention there is to make the
> RelMetadataQuery
> > > > pluggable. We could possibly solve the performance problem without
> > > > extending the RelMetadataQuery. We have to look again into this case.
> > > >
> > > > For more details regarding the existence of the two methods in
> > > > RelMetadataProvider have a look in CALCITE-604 [2]. More general for
> the
> > > > design of RelMetadataProvider you may find useful the description in
> [3].
> > > >
> > > > Best,
> > > > Stamatis
> > > >
> > > > [1] https://issues.apache.org/jira/browse/CALCITE-2885
> > > > [2] https://issues.apache.org/jira/browse/CALCITE-604
> > > > [3]
> > > >
> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
> > > >
> > > > On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:
> > > >
> > > > > > 1. Why we have 2 methods in RelMetadataProvider?
> > > > >
> > > > > The metadata system is complicated. We need to allow multiple
> handlers
> > > > > for any given call. So, making a metadata call involves multiple
> > > > > dispatch [1] based on the metadata method being called, the type of
> > > > > RelNode, and the handlers that are registered. Also it needs to
> cache
> > > > > results, and detect cycles. And the dispatch needs to be
> efficient, so
> > > > > we generate janino code to do the dispatch, and re-generate when
> new
> > > > > handlers or sub-classes of RelNode are added.
> > > > >
> > > > > I forget details, the two methods are basically required to 

Re: [DISCUSSION] Extension of Metadata Query

2019-06-11 Thread Yuzhao Chen
We have an interface to fetch the RelMetadataQuery in RelOptRuleCall [1], which 
I believe is the most common routine to query metadata during planning, I’m a 
little confused for your saying

>The methods in RelMetadataQuery are for convenience only.

If these methods in RelMetadataQuery are only for convenience, what is the 
other way I can query a metadata ? The MedataFactory from the RelOptCluster ? 
(Users would never expect to use this because it’s based on Java Reflection and 
has performance problem)

[1] 
https://github.com/apache/calcite/blob/a3c56be7bccc58859524ba39e5b30b7078f97d00/core/src/main/java/org/apache/calcite/plan/RelOptRuleCall.java#L200

Best,
Danny Chan
在 2019年6月12日 +0800 AM6:27,Julian Hyde ,写道:
> The goal is that it should be possible to add a new kind of metadata. The 
> methods in RelMetadataQuery are for convenience only. So you should be able 
> to use your new kind of metadata, and existing ones, without modifying 
> RelMetadataQuery.
>
> If that is not possible, it’s a bug, and you should log a JIRA case.
>
> Given how the methods re-generate handlers (by catching exceptions and 
> modifying mutable private members) I can see that new kinds of metadata might 
> be difficult to add.
>
> Julian
>
>
> > On Jun 9, 2019, at 7:54 PM, Yuzhao Chen  wrote:
> >
> > Thanks, Stamatis
> >
> > You are right, this discussion came up due to CALCITE-2885, it is not about 
> > the performance problem, it is the extension of RelMetadataQuery, because 
> > we add all kinds of top interfaces in RelMetadataQuery, e.g. the row count 
> > query [1].
> >
> > When we add a new RelMetadataProvider, a corresponding interface/method 
> > should be added into RelMetadataQuery, but for current RelOpeCluster impl, 
> > we could not do that(with a always default instance)
> >
> > As for the methods in RelMetadataProvider, i only saw the usage of 
> > #handlers in RelMetadataQuery, and saw the reference you pase, it seems not 
> > that relevant with this topic. What do you think ?
> >
> > [1] 
> > https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> >
> > Best,
> > Danny Chan
> > 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis ,写道:
> > > Thanks for bringing this up Danny.
> > >
> > > I guess the discussion came up due to CALCITE-2885 [1]. Looking back into
> > > it, I am not sure that the intention there is to make the RelMetadataQuery
> > > pluggable. We could possibly solve the performance problem without
> > > extending the RelMetadataQuery. We have to look again into this case.
> > >
> > > For more details regarding the existence of the two methods in
> > > RelMetadataProvider have a look in CALCITE-604 [2]. More general for the
> > > design of RelMetadataProvider you may find useful the description in [3].
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1] https://issues.apache.org/jira/browse/CALCITE-2885
> > > [2] https://issues.apache.org/jira/browse/CALCITE-604
> > > [3]
> > > https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
> > >
> > > On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:
> > >
> > > > > 1. Why we have 2 methods in RelMetadataProvider?
> > > >
> > > > The metadata system is complicated. We need to allow multiple handlers
> > > > for any given call. So, making a metadata call involves multiple
> > > > dispatch [1] based on the metadata method being called, the type of
> > > > RelNode, and the handlers that are registered. Also it needs to cache
> > > > results, and detect cycles. And the dispatch needs to be efficient, so
> > > > we generate janino code to do the dispatch, and re-generate when new
> > > > handlers or sub-classes of RelNode are added.
> > > >
> > > > I forget details, the two methods are basically required to allow us
> > > > to generate code to do the dispatch.
> > > >
> > > > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> > > >
> > > > I disagree. RelMetadataQuery is not an extension point. Its sole
> > > > purpose is to to keep state (the cache and cycle-checking).
> > > > RelMetadataProvider is the extension point. (By analogy, if you are
> > > > un-parsing an AST, you let each AST sub-class handle unparsing itself,
> > > > but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> > > > is in the role of the StringBuilder. In a complex system, it is nice
> > > > to keep some of the components simple, or at least keep them to
> > > > prescribed roles.)
> > > >
> > > > Julian
> > > >
> > > > [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> > > >
> > > > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  
> > > > wrote:
> > > > >
> > > > > Currently we provide answer to metadata query through
> > > > RelMetadataProvider [1], there are some sub-classes of it:
> > > > >
> > > > > RelMetadataProvider
> > > > > |
> > > > > |- VolcanoRelMetadataProvider
> > > > > |- 

Re: [DISCUSSION] Extension of Metadata Query

2019-06-11 Thread Julian Hyde
The goal is that it should be possible to add a new kind of metadata. The 
methods in RelMetadataQuery are for convenience only. So you should be able to 
use your new kind of metadata, and existing ones, without modifying 
RelMetadataQuery.

If that is not possible, it’s a bug, and you should log a JIRA case.

Given how the methods re-generate handlers (by catching exceptions and 
modifying mutable private members) I can see that new kinds of metadata might 
be difficult to add.

Julian
 

> On Jun 9, 2019, at 7:54 PM, Yuzhao Chen  wrote:
> 
> Thanks, Stamatis
> 
> You are right, this discussion came up due to CALCITE-2885, it is not about 
> the performance problem, it is the extension of RelMetadataQuery, because we 
> add all kinds of top interfaces in RelMetadataQuery, e.g. the row count query 
> [1].
> 
> When we add a new RelMetadataProvider, a corresponding interface/method 
> should be added into RelMetadataQuery, but for current RelOpeCluster impl, we 
> could not do that(with a always default instance)
> 
> As for the  methods in RelMetadataProvider, i only saw the usage of #handlers 
> in RelMetadataQuery, and saw the reference you pase, it seems not that 
> relevant with this topic. What do you think ?
> 
> [1] 
> https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222
> 
> Best,
> Danny Chan
> 在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis ,写道:
>> Thanks for bringing this up Danny.
>> 
>> I guess the discussion came up due to CALCITE-2885 [1]. Looking back into
>> it, I am not sure that the intention there is to make the RelMetadataQuery
>> pluggable. We could possibly solve the performance problem without
>> extending the RelMetadataQuery. We have to look again into this case.
>> 
>> For more details regarding the existence of the two methods in
>> RelMetadataProvider have a look in CALCITE-604 [2]. More general for the
>> design of RelMetadataProvider you may find useful the description in [3].
>> 
>> Best,
>> Stamatis
>> 
>> [1] https://issues.apache.org/jira/browse/CALCITE-2885
>> [2] https://issues.apache.org/jira/browse/CALCITE-604
>> [3]
>> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
>> 
>> On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:
>> 
 1. Why we have 2 methods in RelMetadataProvider?
>>> 
>>> The metadata system is complicated. We need to allow multiple handlers
>>> for any given call. So, making a metadata call involves multiple
>>> dispatch [1] based on the metadata method being called, the type of
>>> RelNode, and the handlers that are registered. Also it needs to cache
>>> results, and detect cycles. And the dispatch needs to be efficient, so
>>> we generate janino code to do the dispatch, and re-generate when new
>>> handlers or sub-classes of RelNode are added.
>>> 
>>> I forget details, the two methods are basically required to allow us
>>> to generate code to do the dispatch.
>>> 
 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>>> 
>>> I disagree. RelMetadataQuery is not an extension point. Its sole
>>> purpose is to to keep state (the cache and cycle-checking).
>>> RelMetadataProvider is the extension point. (By analogy, if you are
>>> un-parsing an AST, you let each AST sub-class handle unparsing itself,
>>> but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
>>> is in the role of the StringBuilder. In a complex system, it is nice
>>> to keep some of the components simple, or at least keep them to
>>> prescribed roles.)
>>> 
>>> Julian
>>> 
>>> [1] https://en.wikipedia.org/wiki/Multiple_dispatch
>>> 
>>> On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
 
 Currently we provide answer to metadata query through
>>> RelMetadataProvider [1], there are some sub-classes of it:
 
 RelMetadataProvider
 |
 |- VolcanoRelMetadataProvider
 |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
 |- HepRelMetadataProvider
 |- CachingRelMetadataProvider
 |- ReflectiveRelMetadataProvider
 |- JaninoRelMetadataProvider
 
 The RelMetadataProvider has two methods: #apply and #handlers, the
>>> #apply method seems a programming interface and there is a demo code how we
>>> can use it:
 
 RelMetadataProvider provider;
 LogicalFilter filter;
 RexNode predicate;
 Function function =
 provider.apply(LogicalFilter.class, Selectivity.class};
 Selectivity selectivity = function.apply(filter);
 Double d = selectivity.selectivity(predicate);
 
 But let's see our RelOptCluster's member variables[2], there are
>>> MetadataFactory and RelMetadataQuery which all can be used to query the
>>> metadata, for MetadataFactory, there is a default impl named
>>> MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally,
>>> for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally

Re: [DISCUSSION] Extension of Metadata Query

2019-06-09 Thread Yuzhao Chen
Thanks, Stamatis

You are right, this discussion came up due to CALCITE-2885, it is not about the 
performance problem, it is the extension of RelMetadataQuery, because we add 
all kinds of top interfaces in RelMetadataQuery, e.g. the row count query [1].

When we add a new RelMetadataProvider, a corresponding interface/method should 
be added into RelMetadataQuery, but for current RelOpeCluster impl, we could 
not do that(with a always default instance)

As for the  methods in RelMetadataProvider, i only saw the usage of #handlers 
in RelMetadataQuery, and saw the reference you pase, it seems not that relevant 
with this topic. What do you think ?

[1] 
https://github.com/apache/calcite/blob/941cd4e9540e3ef9b7c15daee42831a0c63da8b9/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L222

Best,
Danny Chan
在 2019年6月5日 +0800 AM6:44,Stamatis Zampetakis ,写道:
> Thanks for bringing this up Danny.
>
> I guess the discussion came up due to CALCITE-2885 [1]. Looking back into
> it, I am not sure that the intention there is to make the RelMetadataQuery
> pluggable. We could possibly solve the performance problem without
> extending the RelMetadataQuery. We have to look again into this case.
>
> For more details regarding the existence of the two methods in
> RelMetadataProvider have a look in CALCITE-604 [2]. More general for the
> design of RelMetadataProvider you may find useful the description in [3].
>
> Best,
> Stamatis
>
> [1] https://issues.apache.org/jira/browse/CALCITE-2885
> [2] https://issues.apache.org/jira/browse/CALCITE-604
> [3]
> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
>
> On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:
>
> > > 1. Why we have 2 methods in RelMetadataProvider?
> >
> > The metadata system is complicated. We need to allow multiple handlers
> > for any given call. So, making a metadata call involves multiple
> > dispatch [1] based on the metadata method being called, the type of
> > RelNode, and the handlers that are registered. Also it needs to cache
> > results, and detect cycles. And the dispatch needs to be efficient, so
> > we generate janino code to do the dispatch, and re-generate when new
> > handlers or sub-classes of RelNode are added.
> >
> > I forget details, the two methods are basically required to allow us
> > to generate code to do the dispatch.
> >
> > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> >
> > I disagree. RelMetadataQuery is not an extension point. Its sole
> > purpose is to to keep state (the cache and cycle-checking).
> > RelMetadataProvider is the extension point. (By analogy, if you are
> > un-parsing an AST, you let each AST sub-class handle unparsing itself,
> > but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> > is in the role of the StringBuilder. In a complex system, it is nice
> > to keep some of the components simple, or at least keep them to
> > prescribed roles.)
> >
> > Julian
> >
> > [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> >
> > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
> > >
> > > Currently we provide answer to metadata query through
> > RelMetadataProvider [1], there are some sub-classes of it:
> > >
> > > RelMetadataProvider
> > > |
> > > |- VolcanoRelMetadataProvider
> > > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > > |- HepRelMetadataProvider
> > > |- CachingRelMetadataProvider
> > > |- ReflectiveRelMetadataProvider
> > > |- JaninoRelMetadataProvider
> > >
> > > The RelMetadataProvider has two methods: #apply and #handlers, the
> > #apply method seems a programming interface and there is a demo code how we
> > can use it:
> > >
> > > RelMetadataProvider provider;
> > > LogicalFilter filter;
> > > RexNode predicate;
> > > Function function =
> > > provider.apply(LogicalFilter.class, Selectivity.class};
> > > Selectivity selectivity = function.apply(filter);
> > > Double d = selectivity.selectivity(predicate);
> > >
> > > But let's see our RelOptCluster's member variables[2], there are
> > MetadataFactory and RelMetadataQuery which all can be used to query the
> > metadata, for MetadataFactory, there is a default impl named
> > MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally,
> > for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally
> > composed and codeden by JaninoRelMetadataProvider).
> > >
> > > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to
> > get the MQ and query the metadata.
> > >
> > > For extension of metadata handlers, we can set our customized
> > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we have
> > no way to extend it now, because the RelOptCluster always has a singleton
> > instance [4] which is only the default implementation.
> > >
> > >
> > > My question is as follows:
> > >
> > > 1. Why we have 2 methods in RelMetadataProvider, and why we need the
> > 

Re: [DISCUSSION] Extension of Metadata Query

2019-06-05 Thread Yuzhao Chen
In order to add some top-level query interfaces like columnInterval to topK, 
one way to extend the RelMetadataQuery is “sub-class” it,  but a 
subclass-instance of RelMetadataQuery can not be set into the 
RelOptCluster(with a always default singleton RMQ instance), which is a key 
factor to limit the extension of RelMetadataQuery.

Best,
Danny Chan
在 2019年6月5日 +0800 PM11:49,Julian Hyde ,写道:
> By “extend” do you mean “sub-class”?
>
> > On Jun 4, 2019, at 10:19 PM, Yuzhao Chen  wrote:
> >
> > Thanks Julian for your detail reply,
> >
> >
> > 1. I checked the Janino gened-code and the 
> > RelMetadataQuery/RelMetadataProvidor and almost can make sure MetadataQuery 
> > only use the RelMetadataProvidor#handlers method for multiple dispatch, the 
> > RelMetadataProvidor#apply is only used for MetadataFactory.
> > 2. I agree that we should provide new RelMetadataProvider for extension, 
> > but the RelMetadataQuery has top level interfaces for metadata query, these 
> > top level interfaces should be extendable and sync with our metadata 
> > handlers.
> >
> >
> > Best,
> > Danny Chan
> > 在 2019年6月5日 +0800 AM1:48,Julian Hyde ,写道:
> > > > 1. Why we have 2 methods in RelMetadataProvider?
> > >
> > > The metadata system is complicated. We need to allow multiple handlers
> > > for any given call. So, making a metadata call involves multiple
> > > dispatch [1] based on the metadata method being called, the type of
> > > RelNode, and the handlers that are registered. Also it needs to cache
> > > results, and detect cycles. And the dispatch needs to be efficient, so
> > > we generate janino code to do the dispatch, and re-generate when new
> > > handlers or sub-classes of RelNode are added.
> > >
> > > I forget details, the two methods are basically required to allow us
> > > to generate code to do the dispatch.
> > >
> > > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> > >
> > > I disagree. RelMetadataQuery is not an extension point. Its sole
> > > purpose is to to keep state (the cache and cycle-checking).
> > > RelMetadataProvider is the extension point. (By analogy, if you are
> > > un-parsing an AST, you let each AST sub-class handle unparsing itself,
> > > but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> > > is in the role of the StringBuilder. In a complex system, it is nice
> > > to keep some of the components simple, or at least keep them to
> > > prescribed roles.)
> > >
> > > Julian
> > >
> > > [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> > >
> > > > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  
> > > > wrote:
> > > >
> > > > Currently we provide answer to metadata query through 
> > > > RelMetadataProvider [1], there are some sub-classes of it:
> > > >
> > > > RelMetadataProvider
> > > > |
> > > > |- VolcanoRelMetadataProvider
> > > > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > > > |- HepRelMetadataProvider
> > > > |- CachingRelMetadataProvider
> > > > |- ReflectiveRelMetadataProvider
> > > > |- JaninoRelMetadataProvider
> > > >
> > > > The RelMetadataProvider has two methods: #apply and #handlers, the 
> > > > #apply method seems a programming interface and there is a demo code 
> > > > how we can use it:
> > > >
> > > > RelMetadataProvider provider;
> > > > LogicalFilter filter;
> > > > RexNode predicate;
> > > > Function function =
> > > > provider.apply(LogicalFilter.class, Selectivity.class};
> > > > Selectivity selectivity = function.apply(filter);
> > > > Double d = selectivity.selectivity(predicate);
> > > >
> > > > But let's see our RelOptCluster's member variables[2], there are 
> > > > MetadataFactory and RelMetadataQuery which all can be used to query the 
> > > > metadata, for MetadataFactory, there is a default impl named 
> > > > MetadataFactoryImpl which will invoke RelMetadataProvider#apply 
> > > > internally, for RelMetadataQuery, it will invoke 
> > > > RelMetadataProvider#handlers (finally composed and codeden by 
> > > > JaninoRelMetadataProvider).
> > > >
> > > > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery 
> > > > to get the MQ and query the metadata.
> > > >
> > > > For extension of metadata handlers, we can set our customized 
> > > > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we 
> > > > have no way to extend it now, because the RelOptCluster always has a 
> > > > singleton instance [4] which is only the default implementation.
> > > >
> > > >
> > > > My question is as follows:
> > > >
> > > > 1. Why we have 2 methods in RelMetadataProvider, and why we need the 
> > > > MetadataFactory and RelMetadataProvider#apply ? It seems that it's 
> > > > function is already been overriden by RelMetadataQuery(The difference 
> > > > is that MetadataFactory use Reflection and RelMetadataQuery use gened 
> > > > bytes code).
> > > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> > > >
> > > >
> > > > [1] 
> > > > 

Re: [DISCUSSION] Extension of Metadata Query

2019-06-05 Thread Julian Hyde
By “extend” do you mean “sub-class”?

> On Jun 4, 2019, at 10:19 PM, Yuzhao Chen  wrote:
> 
> Thanks Julian for your detail reply,
> 
> 
> 1. I checked the Janino gened-code and the 
> RelMetadataQuery/RelMetadataProvidor and almost can make sure MetadataQuery 
> only use the RelMetadataProvidor#handlers method for multiple dispatch, the 
> RelMetadataProvidor#apply is only used for MetadataFactory.
> 2. I agree that we should provide new RelMetadataProvider for extension, but 
> the RelMetadataQuery has top level interfaces for metadata query, these top 
> level interfaces should be extendable and sync with our metadata handlers.
> 
> 
> Best,
> Danny Chan
> 在 2019年6月5日 +0800 AM1:48,Julian Hyde ,写道:
>>> 1. Why we have 2 methods in RelMetadataProvider?
>> 
>> The metadata system is complicated. We need to allow multiple handlers
>> for any given call. So, making a metadata call involves multiple
>> dispatch [1] based on the metadata method being called, the type of
>> RelNode, and the handlers that are registered. Also it needs to cache
>> results, and detect cycles. And the dispatch needs to be efficient, so
>> we generate janino code to do the dispatch, and re-generate when new
>> handlers or sub-classes of RelNode are added.
>> 
>> I forget details, the two methods are basically required to allow us
>> to generate code to do the dispatch.
>> 
>>> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>> 
>> I disagree. RelMetadataQuery is not an extension point. Its sole
>> purpose is to to keep state (the cache and cycle-checking).
>> RelMetadataProvider is the extension point. (By analogy, if you are
>> un-parsing an AST, you let each AST sub-class handle unparsing itself,
>> but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
>> is in the role of the StringBuilder. In a complex system, it is nice
>> to keep some of the components simple, or at least keep them to
>> prescribed roles.)
>> 
>> Julian
>> 
>> [1] https://en.wikipedia.org/wiki/Multiple_dispatch
>> 
>>> On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
>>> 
>>> Currently we provide answer to metadata query through RelMetadataProvider 
>>> [1], there are some sub-classes of it:
>>> 
>>> RelMetadataProvider
>>> |
>>> |- VolcanoRelMetadataProvider
>>> |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
>>> |- HepRelMetadataProvider
>>> |- CachingRelMetadataProvider
>>> |- ReflectiveRelMetadataProvider
>>> |- JaninoRelMetadataProvider
>>> 
>>> The RelMetadataProvider has two methods: #apply and #handlers, the #apply 
>>> method seems a programming interface and there is a demo code how we can 
>>> use it:
>>> 
>>> RelMetadataProvider provider;
>>> LogicalFilter filter;
>>> RexNode predicate;
>>> Function function =
>>> provider.apply(LogicalFilter.class, Selectivity.class};
>>> Selectivity selectivity = function.apply(filter);
>>> Double d = selectivity.selectivity(predicate);
>>> 
>>> But let's see our RelOptCluster's member variables[2], there are 
>>> MetadataFactory and RelMetadataQuery which all can be used to query the 
>>> metadata, for MetadataFactory, there is a default impl named 
>>> MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally, 
>>> for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally 
>>> composed and codeden by JaninoRelMetadataProvider).
>>> 
>>> In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to 
>>> get the MQ and query the metadata.
>>> 
>>> For extension of metadata handlers, we can set our customized 
>>> RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we have 
>>> no way to extend it now, because the RelOptCluster always has a singleton 
>>> instance [4] which is only the default implementation.
>>> 
>>> 
>>> My question is as follows:
>>> 
>>> 1. Why we have 2 methods in RelMetadataProvider, and why we need the 
>>> MetadataFactory and RelMetadataProvider#apply ? It seems that it's function 
>>> is already been overriden by RelMetadataQuery(The difference is that 
>>> MetadataFactory use Reflection and RelMetadataQuery use gened bytes code).
>>> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>>> 
>>> 
>>> [1] 
>>> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
>>> [2] 
>>> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
>>> [3] 
>>> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
>>> [4] 
>>> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
>>> 
>>> 
>>> 
>>> Best,
>>> Danny Chan


Re: [DISCUSSION] Extension of Metadata Query

2019-06-04 Thread Yuzhao Chen
Thanks Julian for your detail reply,


1. I checked the Janino gened-code and the RelMetadataQuery/RelMetadataProvidor 
and almost can make sure MetadataQuery only use the 
RelMetadataProvidor#handlers method for multiple dispatch, the 
RelMetadataProvidor#apply is only used for MetadataFactory.
2. I agree that we should provide new RelMetadataProvider for extension, but 
the RelMetadataQuery has top level interfaces for metadata query, these top 
level interfaces should be extendable and sync with our metadata handlers.


Best,
Danny Chan
在 2019年6月5日 +0800 AM1:48,Julian Hyde ,写道:
> > 1. Why we have 2 methods in RelMetadataProvider?
>
> The metadata system is complicated. We need to allow multiple handlers
> for any given call. So, making a metadata call involves multiple
> dispatch [1] based on the metadata method being called, the type of
> RelNode, and the handlers that are registered. Also it needs to cache
> results, and detect cycles. And the dispatch needs to be efficient, so
> we generate janino code to do the dispatch, and re-generate when new
> handlers or sub-classes of RelNode are added.
>
> I forget details, the two methods are basically required to allow us
> to generate code to do the dispatch.
>
> > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>
> I disagree. RelMetadataQuery is not an extension point. Its sole
> purpose is to to keep state (the cache and cycle-checking).
> RelMetadataProvider is the extension point. (By analogy, if you are
> un-parsing an AST, you let each AST sub-class handle unparsing itself,
> but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> is in the role of the StringBuilder. In a complex system, it is nice
> to keep some of the components simple, or at least keep them to
> prescribed roles.)
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Multiple_dispatch
>
> On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
> >
> > Currently we provide answer to metadata query through RelMetadataProvider 
> > [1], there are some sub-classes of it:
> >
> > RelMetadataProvider
> > |
> > |- VolcanoRelMetadataProvider
> > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > |- HepRelMetadataProvider
> > |- CachingRelMetadataProvider
> > |- ReflectiveRelMetadataProvider
> > |- JaninoRelMetadataProvider
> >
> > The RelMetadataProvider has two methods: #apply and #handlers, the #apply 
> > method seems a programming interface and there is a demo code how we can 
> > use it:
> >
> > RelMetadataProvider provider;
> > LogicalFilter filter;
> > RexNode predicate;
> > Function function =
> > provider.apply(LogicalFilter.class, Selectivity.class};
> > Selectivity selectivity = function.apply(filter);
> > Double d = selectivity.selectivity(predicate);
> >
> > But let's see our RelOptCluster's member variables[2], there are 
> > MetadataFactory and RelMetadataQuery which all can be used to query the 
> > metadata, for MetadataFactory, there is a default impl named 
> > MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally, 
> > for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally 
> > composed and codeden by JaninoRelMetadataProvider).
> >
> > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to 
> > get the MQ and query the metadata.
> >
> > For extension of metadata handlers, we can set our customized 
> > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we have 
> > no way to extend it now, because the RelOptCluster always has a singleton 
> > instance [4] which is only the default implementation.
> >
> >
> > My question is as follows:
> >
> > 1. Why we have 2 methods in RelMetadataProvider, and why we need the 
> > MetadataFactory and RelMetadataProvider#apply ? It seems that it's function 
> > is already been overriden by RelMetadataQuery(The difference is that 
> > MetadataFactory use Reflection and RelMetadataQuery use gened bytes code).
> > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> >
> >
> > [1] 
> > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> > [2] 
> > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
> > [3] 
> > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
> > [4] 
> > https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
> >
> >
> >
> > Best,
> > Danny Chan


Re: [DISCUSSION] Extension of Metadata Query

2019-06-04 Thread Chunwei Lei
Thanks for raising this, Danny.  Actually I have the same question too.

> RelMetadataQuery is not an extension point. Its sole purpose is to to
keep state (the cache and cycle-checking)
I think users may extend the RelMetadataQuery if wanting to query more
metadata, such as TopK values.


Best,
Chunwei


On Wed, Jun 5, 2019 at 6:44 AM Stamatis Zampetakis 
wrote:

> Thanks for bringing this up Danny.
>
> I guess the discussion came up due to CALCITE-2885 [1]. Looking back into
> it, I am not sure that the intention there is to make the RelMetadataQuery
> pluggable. We could possibly solve the performance problem without
> extending the RelMetadataQuery. We have to look again into this case.
>
> For more details regarding the existence of the two methods in
> RelMetadataProvider have a look in CALCITE-604 [2]. More general for the
> design of RelMetadataProvider you may find useful the description in [3].
>
> Best,
> Stamatis
>
> [1] https://issues.apache.org/jira/browse/CALCITE-2885
> [2] https://issues.apache.org/jira/browse/CALCITE-604
> [3]
>
> https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/
>
> On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:
>
> > > 1. Why we have 2 methods in RelMetadataProvider?
> >
> > The metadata system is complicated. We need to allow multiple handlers
> > for any given call. So, making a metadata call involves multiple
> > dispatch [1] based on the metadata method being called, the type of
> > RelNode, and the handlers that are registered. Also it needs to cache
> > results, and detect cycles. And the dispatch needs to be efficient, so
> > we generate janino code to do the dispatch, and re-generate when new
> > handlers or sub-classes of RelNode are added.
> >
> > I forget details, the two methods are basically required to allow us
> > to generate code to do the dispatch.
> >
> > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> >
> > I disagree. RelMetadataQuery is not an extension point. Its sole
> > purpose is to to keep state (the cache and cycle-checking).
> > RelMetadataProvider is the extension point. (By analogy, if you are
> > un-parsing an AST, you let each AST sub-class handle unparsing itself,
> > but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> > is in the role of the StringBuilder. In a complex system, it is nice
> > to keep some of the components simple, or at least keep them to
> > prescribed roles.)
> >
> > Julian
> >
> > [1] https://en.wikipedia.org/wiki/Multiple_dispatch
> >
> > On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen 
> wrote:
> > >
> > > Currently we provide answer to metadata query through
> > RelMetadataProvider [1], there are some sub-classes of it:
> > >
> > > RelMetadataProvider
> > > |
> > > |- VolcanoRelMetadataProvider
> > > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > > |- HepRelMetadataProvider
> > > |- CachingRelMetadataProvider
> > > |- ReflectiveRelMetadataProvider
> > > |- JaninoRelMetadataProvider
> > >
> > > The RelMetadataProvider has two methods: #apply and #handlers, the
> > #apply method seems a programming interface and there is a demo code how
> we
> > can use it:
> > >
> > > RelMetadataProvider provider;
> > > LogicalFilter filter;
> > > RexNode predicate;
> > > Function function =
> > > provider.apply(LogicalFilter.class, Selectivity.class};
> > > Selectivity selectivity = function.apply(filter);
> > > Double d = selectivity.selectivity(predicate);
> > >
> > > But let's see our RelOptCluster's member variables[2], there are
> > MetadataFactory and RelMetadataQuery which all can be used to query the
> > metadata, for MetadataFactory, there is a default impl named
> > MetadataFactoryImpl which will invoke RelMetadataProvider#apply
> internally,
> > for RelMetadataQuery, it will invoke RelMetadataProvider#handlers
> (finally
> > composed and codeden by JaninoRelMetadataProvider).
> > >
> > > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery
> to
> > get the MQ and query the metadata.
> > >
> > > For extension of metadata handlers, we can set our customized
> > RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we
> have
> > no way to extend it now, because the RelOptCluster always has a singleton
> > instance [4] which is only the default implementation.
> > >
> > >
> > > My question is as follows:
> > >
> > > 1. Why we have 2 methods in RelMetadataProvider, and why we need the
> > MetadataFactory and RelMetadataProvider#apply ? It seems that it's
> function
> > is already been overriden by RelMetadataQuery(The difference is that
> > MetadataFactory use Reflection and RelMetadataQuery use gened bytes
> code).
> > > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> > >
> > >
> > > [1]
> >
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> > > [2]
> >
> 

Re: [DISCUSSION] Extension of Metadata Query

2019-06-04 Thread Stamatis Zampetakis
Thanks for bringing this up Danny.

I guess the discussion came up due to CALCITE-2885 [1]. Looking back into
it, I am not sure that the intention there is to make the RelMetadataQuery
pluggable. We could possibly solve the performance problem without
extending the RelMetadataQuery. We have to look again into this case.

For more details regarding the existence of the two methods in
RelMetadataProvider have a look in CALCITE-604 [2]. More general for the
design of RelMetadataProvider you may find useful the description in [3].

Best,
Stamatis

[1] https://issues.apache.org/jira/browse/CALCITE-2885
[2] https://issues.apache.org/jira/browse/CALCITE-604
[3]
https://web.archive.org/web/20140624040836/www.hydromatic.net/wiki/RelationalExpressionMetadata/

On Tue, Jun 4, 2019 at 7:48 PM Julian Hyde  wrote:

> > 1. Why we have 2 methods in RelMetadataProvider?
>
> The metadata system is complicated. We need to allow multiple handlers
> for any given call. So, making a metadata call involves multiple
> dispatch [1] based on the metadata method being called, the type of
> RelNode, and the handlers that are registered. Also it needs to cache
> results, and detect cycles. And the dispatch needs to be efficient, so
> we generate janino code to do the dispatch, and re-generate when new
> handlers or sub-classes of RelNode are added.
>
> I forget details, the two methods are basically required to allow us
> to generate code to do the dispatch.
>
> > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>
> I disagree. RelMetadataQuery is not an extension point. Its sole
> purpose is to to keep state (the cache and cycle-checking).
> RelMetadataProvider is the extension point. (By analogy, if you are
> un-parsing an AST, you let each AST sub-class handle unparsing itself,
> but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
> is in the role of the StringBuilder. In a complex system, it is nice
> to keep some of the components simple, or at least keep them to
> prescribed roles.)
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Multiple_dispatch
>
> On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
> >
> > Currently we provide answer to metadata query through
> RelMetadataProvider [1], there are some sub-classes of it:
> >
> > RelMetadataProvider
> > |
> > |- VolcanoRelMetadataProvider
> > |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> > |- HepRelMetadataProvider
> > |- CachingRelMetadataProvider
> > |- ReflectiveRelMetadataProvider
> > |- JaninoRelMetadataProvider
> >
> > The RelMetadataProvider has two methods: #apply and #handlers, the
> #apply method seems a programming interface and there is a demo code how we
> can use it:
> >
> > RelMetadataProvider provider;
> > LogicalFilter filter;
> > RexNode predicate;
> > Function function =
> > provider.apply(LogicalFilter.class, Selectivity.class};
> > Selectivity selectivity = function.apply(filter);
> > Double d = selectivity.selectivity(predicate);
> >
> > But let's see our RelOptCluster's member variables[2], there are
> MetadataFactory and RelMetadataQuery which all can be used to query the
> metadata, for MetadataFactory, there is a default impl named
> MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally,
> for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally
> composed and codeden by JaninoRelMetadataProvider).
> >
> > In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to
> get the MQ and query the metadata.
> >
> > For extension of metadata handlers, we can set our customized
> RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we have
> no way to extend it now, because the RelOptCluster always has a singleton
> instance [4] which is only the default implementation.
> >
> >
> > My question is as follows:
> >
> > 1. Why we have 2 methods in RelMetadataProvider, and why we need the
> MetadataFactory and RelMetadataProvider#apply ? It seems that it's function
> is already been overriden by RelMetadataQuery(The difference is that
> MetadataFactory use Reflection and RelMetadataQuery use gened bytes code).
> > 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> >
> >
> > [1]
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> > [2]
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
> > [3]
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
> > [4]
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
> >
> >
> >
> > Best,
> > Danny Chan
>


Re: [DISCUSSION] Extension of Metadata Query

2019-06-04 Thread Julian Hyde
> 1. Why we have 2 methods in RelMetadataProvider?

The metadata system is complicated. We need to allow multiple handlers
for any given call. So, making a metadata call involves multiple
dispatch [1] based on the metadata method being called, the type of
RelNode, and the handlers that are registered. Also it needs to cache
results, and detect cycles. And the dispatch needs to be efficient, so
we generate janino code to do the dispatch, and re-generate when new
handlers or sub-classes of RelNode are added.

I forget details, the two methods are basically required to allow us
to generate code to do the dispatch.

> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.

I disagree. RelMetadataQuery is not an extension point. Its sole
purpose is to to keep state (the cache and cycle-checking).
RelMetadataProvider is the extension point. (By analogy, if you are
un-parsing an AST, you let each AST sub-class handle unparsing itself,
but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
is in the role of the StringBuilder. In a complex system, it is nice
to keep some of the components simple, or at least keep them to
prescribed roles.)

Julian

[1] https://en.wikipedia.org/wiki/Multiple_dispatch

On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen  wrote:
>
> Currently we provide answer to metadata query through RelMetadataProvider 
> [1], there are some sub-classes of it:
>
> RelMetadataProvider
> |
> |- VolcanoRelMetadataProvider
> |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> |- HepRelMetadataProvider
> |- CachingRelMetadataProvider
> |- ReflectiveRelMetadataProvider
> |- JaninoRelMetadataProvider
>
> The RelMetadataProvider has two methods: #apply and #handlers, the #apply 
> method seems a programming interface and there is a demo code how we can use 
> it:
>
> RelMetadataProvider provider;
> LogicalFilter filter;
> RexNode predicate;
> Function function =
> provider.apply(LogicalFilter.class, Selectivity.class};
> Selectivity selectivity = function.apply(filter);
> Double d = selectivity.selectivity(predicate);
>
> But let's see our RelOptCluster's member variables[2], there are 
> MetadataFactory and RelMetadataQuery which all can be used to query the 
> metadata, for MetadataFactory, there is a default impl named 
> MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally, 
> for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally 
> composed and codeden by JaninoRelMetadataProvider).
>
> In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to get 
> the MQ and query the metadata.
>
> For extension of metadata handlers, we can set our customized 
> RelMetadataProvider in RelOptCluster[3]. But for RelMetadataQuery, we have no 
> way to extend it now, because the RelOptCluster always has a singleton 
> instance [4] which is only the default implementation.
>
>
> My question is as follows:
>
> 1. Why we have 2 methods in RelMetadataProvider, and why we need the 
> MetadataFactory and RelMetadataProvider#apply ? It seems that it's function 
> is already been overriden by RelMetadataQuery(The difference is that 
> MetadataFactory use Reflection and RelMetadataQuery use gened bytes code).
> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
>
>
> [1] 
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataProvider.java#L38
> [2] 
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L49
> [3] 
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
> [4] 
> https://github.com/apache/calcite/blob/b0e83c469ff57257c1ea621ff943ca76f626a9b7/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L151
>
>
>
> Best,
> Danny Chan