Re: Using indexes rather than table scans with Calcite

2020-06-01 Thread Xiening Dai
Hi Roman, The example you mentioned is an advanced scenario. Note that there are different types of indexes, such as clustered indexes, secondary indexes, and covered and non-covered indexes. In your case, a typical OLTP/OLAP optimizer would create an index-based join on top of the range table scan (or

[jira] [Created] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-05 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3972: Summary: Allow RelBuilder to create RelNode with convention and use it for trait convert Key: CALCITE-3972 URL: https://issues.apache.org/jira/browse/CALCITE-3972

Re: [ANNOUNCE] New committer: Jin Xing

2020-05-01 Thread Xiening Dai
Congrats Jin Xing. Well deserved! > On Apr 30, 2020, at 7:54 PM, XING JIN wrote: > > Thanks a lot, Julian ~ > I'm not from MaxCompute team, but from big data platform in Alibaba Ant > Financial Group. > Actually we cooperate a lot with MaxCompute, it's our sister team. > > Jin > > Julian Hyde

[jira] [Created] (CALCITE-3963) Maintains logical properties at RelSet (equivalent group) instead of RelNode

2020-04-29 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3963: Summary: Maintains logical properties at RelSet (equivalent group) instead of RelNode Key: CALCITE-3963 URL: https://issues.apache.org/jira/browse/CALCITE-3963

Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-28 Thread Xiening Dai
g spread out >> over, and duplicated in, many RelOptRule instances. >> >> We should be ambitious, and aim to make RelBuilder useful for creating >> most physical nodes. Perhaps make modest extensions to RelBuilder and >> RelFactory APIs to achieve this. If we fail, people can

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
e on everyone. The multi-convention case is > still crucial for planning hybrid queries (e.g. joining MySQL to MongoDB). > > Julian > > >> On Apr 27, 2020, at 4:28 PM, Xiening Dai wrote: >> >> Hi Julian, >> >> In my view, separating logic and phys

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
f >>>>>>>>> VolcanoPlanner during >>>>>>>>> the initial stages of implementing the new planner is absolutely >>>>>>>> CRITICAL. >>>>>>>>> As a lot of users of Calcite do various customizat

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
t;>>>>>>>> which can be left untouched. >>>>>>>>> >>>>>>>>> Haisheng >>>>>>>>> >>>>>>>>> On 2020/04/21 06:33:08, Андрей Цвелодуб >>>> wrote: >>>

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Xiening Dai
ntation is amount of >>> tricks that we need to get desired behaviour. It would be great if some of >>> issues (or all of them) were solved in the new approach. >>> >>> Regards, >>> Igor >>> >>> Mon, Apr 20, 2020, 7:02 Xie

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Xiening Dai
Hi Igor, Your comment - "because actual cost may be calculated correctly using physical operators only. So won't be able to implement Branch and Bound Space Pruning." is actually not true. In the Cascades lower bound / upper bound pruning algorithm, you can get the cost lower bound of input RelNode
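
As a rough illustration of the lower bound / upper bound pruning idea mentioned in this message, here is a minimal, self-contained Java sketch. It is not Calcite or Cascades code: the `Alternative` class, its fields, and the `cheapest` method are hypothetical names, and costs are plain doubles. An alternative is skipped when its own cost plus a lower bound on its input's cost already reaches the best complete cost found so far (the upper bound), so its input never has to be fully costed.

```java
import java.util.Arrays;
import java.util.List;

public class BoundPruning {
  /** Hypothetical plan alternative; not a Calcite class. */
  static final class Alternative {
    final String name;
    final double selfCost;        // cost of the operator itself
    final double inputLowerBound; // cheap estimate: the input cannot cost less than this
    final double inputActualCost; // stands in for the expensive full costing of the input

    Alternative(String name, double selfCost, double inputLowerBound, double inputActualCost) {
      this.name = name;
      this.selfCost = selfCost;
      this.inputLowerBound = inputLowerBound;
      this.inputActualCost = inputActualCost;
    }
  }

  /** Number of alternatives whose input was fully costed in the last call to cheapest(). */
  static int fullyCosted;

  /** Returns the name of the cheapest alternative, pruning with lower bounds. */
  static String cheapest(List<Alternative> alternatives) {
    fullyCosted = 0;
    double upperBound = Double.POSITIVE_INFINITY; // best complete cost so far
    String best = null;
    for (Alternative a : alternatives) {
      // Even in the best case this alternative cannot beat the current best: prune it.
      if (a.selfCost + a.inputLowerBound >= upperBound) {
        continue;
      }
      fullyCosted++;
      double total = a.selfCost + a.inputActualCost;
      if (total < upperBound) {
        upperBound = total;
        best = a.name;
      }
    }
    return best;
  }

  /** Made-up sample data for the demonstration. */
  static List<Alternative> sampleAlternatives() {
    return Arrays.asList(
        new Alternative("hashJoin", 10.0, 5.0, 8.0),     // total 18, becomes the upper bound
        new Alternative("mergeJoin", 15.0, 5.0, 6.0),    // 15 + 5 >= 18, pruned
        new Alternative("nestedLoop", 4.0, 5.0, 20.0));  // 4 + 5 < 18, costed, total 24
  }

  public static void main(String[] args) {
    System.out.println(cheapest(sampleAlternatives()) + " after costing " + fullyCosted + " inputs");
  }
}
```

The point of the sketch is only that the pruning test needs a lower bound, not the exact cost, of the input.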

[jira] [Created] (CALCITE-3915) Add rule listener to report rule attempts and time at DEBUG log level

2020-04-10 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3915: Summary: Add rule listener to report rule attempts and time at DEBUG log level Key: CALCITE-3915 URL: https://issues.apache.org/jira/browse/CALCITE-3915 Project

Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-08 Thread Xiening Dai
> > The worst case is that only physical filter, project, sort can be created > using the physical RelBuilder. > > - Haisheng > > ---------- > From: Xiening Dai > Date: 2020-04-08 07:36:43 > To: > Subject

[DICUSS] Support building physical RelNode in Calcite

2020-04-07 Thread Xiening Dai
Hi all, In light of CALCITE-2970, I’d like to initiate a discussion. Currently the framework itself does not have a way to create a physical RelNode (a RelNode with a particular convention). We completely rely on adapter rules to convert logical nodes into physical ones. There are a few major

[jira] [Created] (CALCITE-3821) RelOptUtil::containsMultisetOrWindowedAgg doesn't really check multiset

2020-02-25 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3821: Summary: RelOptUtil::containsMultisetOrWindowedAgg doesn't really check multiset Key: CALCITE-3821 URL: https://issues.apache.org/jira/browse/CALCITE-3821 Project

Re: [DISCUSS] Commit messages, again

2020-02-21 Thread Xiening Dai
email thread, but my opinion is that 'it depends'. We should not apply > the same rule (CTR, RTC) for every commit. Committers should have > discretion. > > Julian > > On Fri, Feb 21, 2020 at 12:51 PM Xiening Dai wrote: >> >> I also no

Re: [DISCUSS] Commit messages, again

2020-02-21 Thread Xiening Dai
I also notice that this particular change (https://github.com/apache/calcite/commit/938614cca8c30ed9ff48996ff0ae42e1ed4f1706) was committed without going through code review. Do we have any process in place for merging a change? With the size and complexity of Calcite project, I would expect

[jira] [Created] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3807: Summary: checkForSatisfiedConverters() is unnecessary Key: CALCITE-3807 URL: https://issues.apache.org/jira/browse/CALCITE-3807 Project: Calcite Issue Type

Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest

2020-02-16 Thread Xiening Dai
I think CALCITE-3713 is overall a positive change. We have seen the number of rule applications reduced in our tests (though not significantly). After fixing the performance issue, CALCITE-3713 is fine and IMO it should stay. Having said that, I understand the pain of updating test baselines. So

[jira] [Created] (CALCITE-3785) HepPlanner.belongToDag() doesn't have to use mapDigestToVertex

2020-02-11 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3785: Summary: HepPlanner.belongToDag() doesn't have to use mapDigestToVertex Key: CALCITE-3785 URL: https://issues.apache.org/jira/browse/CALCITE-3785 Project: Calcite

[jira] [Created] (CALCITE-3768) VolcanoPlanner.changeTraitsUsingConverters() has parameter that's never used

2020-02-04 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3768: Summary: VolcanoPlanner.changeTraitsUsingConverters() has parameter that's never used Key: CALCITE-3768 URL: https://issues.apache.org/jira/browse/CALCITE-3768

Re: [DISCUSS] propagateCostImprovements vs incremental bestCost maintenance vs metadata

2020-01-08 Thread Xiening Dai
That’s why I say it’s hard to solve under the current framework design. The example query you provided can be, and should be, optimized during the logical transformation phase. At that moment, there shouldn’t be any cost calculation since all we are doing is exploring equivalences. Once the

Re: [DISCUSS] Proposal to add API to force rules matching specific rels

2020-01-08 Thread Xiening Dai
I see. But that’s unrelated to join ordering. > On Jan 7, 2020, at 11:29 PM, Danny Chan wrote: > > Internally we have a multi-input merge join; for each input there may be collation permutations. > > Best, > Danny Chan > On Jan 8, 2020 at 1:20 AM +0800, Xiening Dai wrote:

Re: [DISCUSS] propagateCostImprovements vs incremental bestCost maintenance vs metadata

2020-01-08 Thread Xiening Dai
This is similar to CALCITE-2166 where a RelNode’s best cost could increase after its input RelSubset cardinality is changed. Unfortunately there’s no easy way to fix this with the current framework design. In theory, the cardinality and uniqueness of a RelSubset should never change, by definition

Re: [DISCUSS] Proposal to add API to force rules matching specific rels

2020-01-07 Thread Xiening Dai
>>> of physical traits. In fact, both your and my strategy propagate traits >>>>>> bottom-up, but I do this through rules, which also fire bottom-up, >>> while >>>>> in >>>>>> your case only the traits are propagated bottom-up, w

Re: CALCITE-2905: Maven -> Gradle: any thoughts

2019-11-21 Thread Xiening Dai
In case you didn’t know, JetBrains offers a free subscription to all Apache committers. https://blog.jetbrains.com/blog/2019/05/30/jetbrains-supports-the-apache-software-foundation/ > On Nov 21,

Re: CALCITE-2905: Maven -> Gradle: any thoughts

2019-11-19 Thread Xiening Dai
job/hd165mjr65gj0xal> > On Nov 19, 2019, at 4:07 PM, Xiening Dai wrote: > > First, I want to thank you for working on this. I tried it today, and overall it > looks great. > > I also noticed this output at the end; is this a known issue? > > Deprecated Gradle features w

Re: CALCITE-2905: Maven -> Gradle: any thoughts

2019-11-19 Thread Xiening Dai
First, I want to thank you for working on this. I tried it today, and overall it looks great. I also noticed this output at the end; is this a known issue? Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0. Use '--warning-mode all' to show the individual

Re: Optimizer: All the inputs have relevant nodes, however the cost is still infinite.

2019-11-13 Thread Xiening Dai
No, I didn’t get a chance. Thanks for diagnosing. > On Nov 13, 2019, at 7:55 AM, Vladimir Sitnikov > wrote: > > Xiening, > > The bug is "obvious" when you open a Graphviz diagram. > Just wondering: have you tried? > > The issue is EnumerableTableFunctionScan consumes Subset#0.NONE.[0] as an

[jira] [Created] (CALCITE-3491) VolcanoPlanner.completeConversion() is bypassed by "if (true)"

2019-11-11 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3491: Summary: VolcanoPlanner.completeConversion() is bypassed by "if (true)" Key: CALCITE-3491 URL: https://issues.apache.org/jira/browse/CALCITE-3491 Projec

Re: [ANNOUNCE] Haisheng Yuan joins Calcite PMC

2019-11-10 Thread Xiening Dai
Congrats Haisheng! Well deserved. > On Nov 10, 2019, at 7:45 PM, Francis Chuang wrote: > > I'm pleased to announce that Haisheng has accepted an invitation to > join the Calcite PMC. Haisheng has been a consistent and helpful > figure in the Calcite community for which we are very grateful. We

Re: Optimizer: All the inputs have relevant nodes, however the cost is still infinite.

2019-11-08 Thread Xiening Dai
Are you able to provide a test case that reproduces this on the current code base? From the error message, the given subset’s best RelNode is not available, i.e., its cost is infinite. There could be multiple cases leading to it. > On Nov 8, 2019, at 11:28 AM, Rui Wang wrote: > > My current guess is it could

[jira] [Created] (CALCITE-3487) Should not hard code RelMetadataQuery class in VolcanoPlanner.isValid()

2019-11-08 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3487: Summary: Should not hard code RelMetadataQuery class in VolcanoPlanner.isValid() Key: CALCITE-3487 URL: https://issues.apache.org/jira/browse/CALCITE-3487 Project

Re: [jira] [Commented] (CALCITE-2970) Performance issue when enabling abstract converter for EnumerableConvention

2019-11-06 Thread Xiening Dai
Yes, I agree withConvention or push/popConvention is better. But I don’t understand the last comment. Why do we need to worry about copy()? > On Nov 6, 2019, at 11:40 AM, Julian Hyde (Jira) wrote: > > >[ >

Re: [DISCUSS] Proposal to add API to force rules matching specific rels

2019-11-01 Thread Xiening Dai
uch concepts. The original >> paper has a classification of "transformation rules" which operates on >> logical relation expression and "implementation rules" which provides >> the mapping to physical operators. >> >> >> 1. https://paperhub.s3.am

Re: [DISCUSS] On-demand traitset request

2019-11-01 Thread Xiening Dai
nner, I feel the root issue > of long planning time is not to explore all possible satisfying trait. > It is actually the unnecessary of AbstractConverter, added to the > equivalence class. > > > On Fri, Oct 18, 2019 at 10:39 PM Xiening Dai wrote: >> >> Thanks f

Re: [DISCUSS] Proposal to add API to force rules matching specific rels

2019-10-31 Thread Xiening Dai
Actually we solved this problem in our setup using a mechanism called “Pull-Up Traits”, which explores the possible trait set of children’s input to decide parent’s physical properties. In order to determine child input trait, you would have to look at child’s children, and all the way to the

Re: [DISCUSS] Proposal to add API to force rules matching specific rels

2019-10-31 Thread Xiening Dai
Hi Vladimir, I think for the short/mid term, option #2 (using AbstractConverter) should work after we fix CALCITE-2970. We already understand the root cause, and are now looking at the best way to fix it. If you cannot wait, you can also create your own converter rule so it won’t generate logical node,

Re: [ANNOUNCE] Danny Chan joins Calcite PMC

2019-10-30 Thread Xiening Dai
Congratulations, Danny! > On Oct 30, 2019, at 2:22 PM, Francis Chuang wrote: > > I'm pleased to announce that Danny has accepted an invitation to > join the Calcite PMC. Danny has been a consistent and helpful > figure in the Calcite community for which we are very grateful. We > look forward

Re: Same rules fired for logical and physical nodes

2019-10-28 Thread Xiening Dai
For #1, can you please give an example? For #2, I ran the Calcite unit tests, and they all passed. Are the Druid tests part of the unit tests? If not, how do I run them? Thanks. > On Oct 28, 2019, at 7:57 PM, Danny Chan wrote: > > Thanks Xiening Dai for bringing up this topic ~ > > I agre

Re: Same rules fired for logical and physical nodes

2019-10-28 Thread Xiening Dai
I opened a PR - https://github.com/apache/calcite/pull/1543 > On Oct 28, 2019, at 3:00 PM, Xiening Dai wrote: > > Thanks for your input. > > I think this doesn’t relate to the Volcano theory. I am not trying to separate > the logical transformation from physical implementati

Re: Same rules fired for logical and physical nodes

2019-10-28 Thread Xiening Dai
ules not only match physical operators, but still generate >> new logical operators. >> >> IMHO, rules should match logical operators only. >> >> [1] https://github.com/apache/calcite/pull/1130 >> >> On 2019/10/28 16:47:55, Xiening Dai wrote: >>

[jira] [Created] (CALCITE-3455) Redundant rule firing for both logical and physical nodes

2019-10-28 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3455: Summary: Redundant rule firing for both logical and physical nodes Key: CALCITE-3455 URL: https://issues.apache.org/jira/browse/CALCITE-3455 Project: Calcite

Same rules fired for logical and physical nodes

2019-10-28 Thread Xiening Dai
Hi all, While I was looking at CALCITE-2970, I noticed that some of the rules are fired for both logical and physical nodes. For example, ProjectMergeRule matches Project.class, so it’s fired for LogicalProject. But then after LogicalProject is converted into EnumerableProject, the same rule
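
To make the matching behavior described in this message concrete, here is a small plain-Java sketch. It does not use Calcite: the `Project`, `LogicalProject`, and `EnumerableProject` classes below are empty hypothetical stand-ins, and `matches` is an illustrative helper, not a planner API. It only shows that a rule whose operand is the base class matches every subclass instance, logical or physical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RuleMatchDemo {
  // Hypothetical stand-ins for the Project hierarchy; not the real Calcite classes.
  static class Project {}
  static class LogicalProject extends Project {}
  static class EnumerableProject extends Project {}

  /** Returns the simple class names of the nodes that an operand of the given class matches. */
  static List<String> matches(Class<? extends Project> operandClass, List<Project> nodes) {
    List<String> matched = new ArrayList<>();
    for (Project node : nodes) {
      // isInstance is true for the class itself and for all of its subclasses.
      if (operandClass.isInstance(node)) {
        matched.add(node.getClass().getSimpleName());
      }
    }
    return matched;
  }

  /** One logical node and the physical node it was converted into. */
  static List<Project> sampleNodes() {
    return Arrays.asList(new LogicalProject(), new EnumerableProject());
  }

  public static void main(String[] args) {
    // Operand on the base class: fires for both nodes.
    System.out.println(matches(Project.class, sampleNodes()));
    // prints [LogicalProject, EnumerableProject]

    // Operand narrowed to the logical class: fires once.
    System.out.println(matches(LogicalProject.class, sampleNodes()));
    // prints [LogicalProject]
  }
}
```

Narrowing the operand to the logical class is one way to avoid the second firing; whether that is the approach taken in the thread is not shown in this snippet.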

Re: [DISCUSS] State of the project 2019

2019-10-22 Thread Xiening Dai
+1 on the improvement of pull request turnaround time. I also see a few new committers who are really active. They help review PRs and answer mailing list questions, which really benefits the community and people who’d like to contribute code. I also feel like Calcite has a big code base

Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Xiening Dai
Thanks for sharing. I like the way you model this problem, Jinfeng. There’s one minor issue with your example. Let’s say R and S don’t have sorting properties at all. In your case, we would end up adding enforcers for LHS and RHS to get collation (a, b, c). Then we would need another

Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Xiening Dai
ptCall, he/she will need to create a new > instance of RelMetadataQuery. > > On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote: > >> I have seen both patterns in the current code base. In most places, for >> example SubQueryRemoveRule, AggregateUnionTransposeRule, >> SortJoinTransposeR

Re: [DISCUSSION] Extension of Metadata Query

2019-10-16 Thread Xiening Dai
I have seen both patterns in the current code base. In most places, for example SubQueryRemoveRule, AggregateUnionTransposeRule, SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And there are a few other places where a new RelMetadataQuery instance is created, which Haisheng

Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

2019-10-08 Thread Xiening Dai
In my opinion, we will need both - supporting IN operation (either through an operator or an internal function) and adding support for building a balanced tree. It’s always good to be resilient and capable of handling edge cases. The IN support might require more work. Haisheng’s proposal is a

Re: CALCITE-1048 Make metadata more robust

2019-09-07 Thread Xiening Dai
. > On Sep 5, 2019, at 4:29 PM, Xiening Dai wrote: > > Hi all, > > I came across this JIRA and am not sure what its status is. The original > proposal in the bug looks reasonable to me. We currently propagate importance > improvement through a depth-first model, which

[jira] [Created] (CALCITE-3330) propagateCostImprovements() could result in stack overflow

2019-09-07 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3330: Summary: propagateCostImprovements() could result in stack overflow Key: CALCITE-3330 URL: https://issues.apache.org/jira/browse/CALCITE-3330 Project: Calcite

CALCITE-1048 Make metadata more robust

2019-09-05 Thread Xiening Dai
Hi all, I came across this JIRA and am not sure what its status is. The original proposal in the bug looks reasonable to me. We currently propagate importance improvement through a depth-first model, which could result in stack overflow if the memo is very big and very deep. Change it into a
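
The snippet above is cut off before it says what the depth-first propagation should be changed into. As one common way to avoid deep recursion, and purely as an assumption about the kind of change meant, here is a self-contained Java sketch that propagates a cost improvement to ancestors with an explicit queue instead of recursive calls. The `Node` class and the `propagate` method are hypothetical and only loosely modeled on a memo; they are not VolcanoPlanner code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class WorklistPropagation {
  /** Hypothetical memo node; not a Calcite class. */
  static final class Node {
    final double selfCost;
    double bestCost = Double.POSITIVE_INFINITY;
    final List<Node> parents = new ArrayList<>();

    Node(double selfCost) {
      this.selfCost = selfCost;
    }
  }

  /** Propagates an improved cost from start to its ancestors without recursion. */
  static void propagate(Node start, double newBest) {
    if (newBest >= start.bestCost) {
      return; // not an improvement, nothing to propagate
    }
    start.bestCost = newBest;
    Deque<Node> queue = new ArrayDeque<>();
    queue.add(start);
    while (!queue.isEmpty()) {
      Node node = queue.poll();
      for (Node parent : node.parents) {
        double candidate = parent.selfCost + node.bestCost;
        // Re-enqueue a parent only when its best cost actually improved.
        if (candidate < parent.bestCost) {
          parent.bestCost = candidate;
          queue.add(parent);
        }
      }
    }
  }

  /** Builds a chain of the given depth (each node costs 1.0) and returns the root's best cost. */
  static double rootCostForChain(int depth) {
    Node leaf = new Node(1.0);
    Node current = leaf;
    for (int i = 1; i < depth; i++) {
      Node parent = new Node(1.0);
      current.parents.add(parent);
      current = parent;
    }
    propagate(leaf, 1.0);
    return current.bestCost;
  }

  public static void main(String[] args) {
    // The call stack stays flat no matter how deep the chain is.
    System.out.println(rootCostForChain(200_000));
  }
}
```

Because the pending work lives in a heap-allocated queue, the depth of the memo affects memory use but not call-stack depth.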

Re: Apply for contributor

2019-08-22 Thread Xiening Dai
My user name on Jira is xndai. Thanks. > On Aug 22, 2019, at 2:36 PM, Xiening Dai wrote: > > Hi all, > > May I have the contributor permission for the Calcite project? I’d like to work > on some issues related to VolcanoPlanner. Thanks.

Apply for contributor

2019-08-22 Thread Xiening Dai
Hi all, May I have the contributor permission for the Calcite project? I’d like to work on some issues related to VolcanoPlanner. Thanks.

[jira] [Created] (CALCITE-3283) RelSubSet's best is not existed in the set

2019-08-22 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3283: Summary: RelSubSet's best is not existed in the set Key: CALCITE-3283 URL: https://issues.apache.org/jira/browse/CALCITE-3283 Project: Calcite Issue Type

[jira] [Created] (CALCITE-3257) Need to clear RelMetaDataQuery cache after dumping RelNodes and Graphviz

2019-08-16 Thread Xiening Dai (JIRA)
Xiening Dai created CALCITE-3257: Summary: Need to clear RelMetaDataQuery cache after dumping RelNodes and Graphviz Key: CALCITE-3257 URL: https://issues.apache.org/jira/browse/CALCITE-3257 Project

Re: What is the exactly definition as an equi join ?

2019-04-15 Thread Xiening Dai
I think Calcite always pushes down equi-join conditions. In SqlToRelConverter.createJoin(), before the method returns, it calls RelOptUtil.pushDownJoinConditions(). So in your example, the cast expression will be pushed down and it will still be an equi-join. > On Apr 15, 2019, at 5:40 PM,

Re: Gandiva

2018-06-22 Thread Xiening Dai
I attended a talk on Gandiva yesterday. Impressive work! But I am not sure why Calcite would want to integrate with it. To me, Gandiva is on the execution side. In which scenarios would a query planner need an Arrow engine? I read the original Jira about implementing file enumerator, but the

Re: RLE v1 versus RLE v2

2018-01-31 Thread Xiening Dai
g > > > > 2018-01-31 19:31 GMT-05:00 Xiening Dai <xndai@live.com>: > >> Hi, >> >> I am evaluating whether we should make it a high priority to add RLEv2 support >> in the ORC C++ writer. I wonder if anyone has any performance data to share >> regardi