[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899478#comment-16899478 ] Ruben Quesada Lopez commented on CALCITE-2624: -- [~zabetak], I think we can push it to 1.21, as you suggested, just adding the rule and its tests, without including to the default ruleset of Calcite. I'll take another look and the PR and merge it in the following days. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Fix For: 1.21.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899471#comment-16899471 ] Stamatis Zampetakis commented on CALCITE-2624: -- The current PR does help with certain optimizations and in the absence of a better alternative I would propose to include it in the next release. [~rubenql] I've seen that you did a review on this. How do you feel for getting this in for 1.21.0 ? Can you push this forward? > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Fix For: 1.21.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898630#comment-16898630 ] Haisheng Yuan commented on CALCITE-2624: I am neutral on this approach. +0. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Fix For: 1.21.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898004#comment-16898004 ] Stamatis Zampetakis commented on CALCITE-2624: -- As far as I can see the PR is in relatively good shape and could possibly make it for 1.21.0. As [~hyuan] highlighted there are certain disadvantages with the general approach but the existing rule can still be useful especially till we handle properly the propagation of traits. I would suggest the following: * add a bit of Javadoc in the rule to outline the possible impact on performance; * do not include the rule in the default ruleset of Calcite; * finalize and merge the current PR; * explore the trait based approach suggested by [~hyuan] in other JIRAs. How do you feel with this plan? > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Fix For: 1.21.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869451#comment-16869451 ] Stamatis Zampetakis commented on CALCITE-2624: -- [~hyuan] I think I see your point. Concretely, what you are proposing is instead of adding the generic rule in this PR to rather modify Enumerable*Join*Rule(s) to change the traits of the inputs when the join has some collation trait associated with it (and when that is possible). Please correct me if I am wrong. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869083#comment-16869083 ] Haisheng Yuan commented on CALCITE-2624: Nah, physical properties should only be requested by physical operators, not logical operators. The patch is even worse with 10-way left deep tree or right deep tree. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867657#comment-16867657 ] Stamatis Zampetakis commented on CALCITE-2624: -- That's interesting, thanks [~khawlamhb]! That means that this rule based approach may be better in terms of performance than the property based approach and the use of AbstractConverters at least for now. What do you think [~hyuan]? > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866689#comment-16866689 ] Khawla Mouhoubi commented on CALCITE-2624: -- [~zabetak] I have added SortJoinCopyRule to the default rule set and all the tests pass. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843454#comment-16843454 ] Stamatis Zampetakis commented on CALCITE-2624: -- Currently (see CALCITE-2970) and in the past we had many performance problems due to AbstractConverters. I am wondering if the current rule also suffers from the same problem. [~khawlamhb] have you tried incorporating the rule in the default rule set of Calcite? Do all the tests pass? > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841097#comment-16841097 ] Ruben Quesada Lopez commented on CALCITE-2624: -- Thanks for the explanation [~hyuan], I also see your point. The idea behind this ticket was to try to create a new rule to generalize the already existing one SortJoinTransposeRule (which works just for LEFT / RIGHT join, but also pushing limit and offset), i.e. to have a less powerful but more likely applicable rule to push a Sort into a Join. Having said that, perhaps your approach is more correct, and we should rethink the solution more in terms of physical property enforcement. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840696#comment-16840696 ] Haisheng Yuan commented on CALCITE-2624: Thanks for the explanation, [~rubenql]. I see you point, the ElasticScan in your example looks like IndexScan or IndexOnlyScan physical operator in Postgres. But I still think it is not appropriate to apply the optimization on logical operators. It might be ok to put it on physical level, but better do it via physical property enforcement, say AbstractConverter, because we are requesting some physical properties from child operators. This is the process I am expecting: {code:java} Sort(b.title, limit: 10) Join(condition: b.authorId=a.id) Scan(book) Scan(author) {code} Through physical implementation rules, we get {code:java} Sort(b.title, limit: 10) MergeJoin(condition: b.authorId=a.id) Sort (authorId) Scan(book) Sort (id) Scan(author) {code} {code:java} Sort(b.title, limit: 10) HashJoin(condition: b.authorId=a.id) Scan(book) Scan(author) {code} {code:java} Sort(b.title, limit: 10) NestedLoopJoin(condition: b.authorId=a.id) Scan(book) Scan(author) {code} Then we do property enforcement by passing down ordering or distribution requirement. For mergejoin, we can't pass the ordering requirement down to child operators, but for hash join and nestedloop join, we can pass down the request to outer child relation (suppose this is not distributed system), if the scan can't : {code:java} Limit(10) HashJoin(condition: b.authorId=a.id) Sort (Title) Scan(book)==> ElasticScan(table: book, sort: b.title) Scan(author) {code} {code:java} Limit(10) NestedLoopJoin(condition: b.authorId=a.id) Sort (Title) Scan(book)==> ElasticScan(table: book, sort: b.title) Scan(author) {code} But I am not -1 on this. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840126#comment-16840126 ] Ruben Quesada Lopez commented on CALCITE-2624: -- Thanks for your comment, [~hyuan]. I agree that, generally, this rule will not generate a more optimized plan. But there are some scenarios where it can help. I will try to summarize one: {code} -- select * from books b join authors a on b.authorId = a.id order by b.title limit 10 Sort(b.title, limit: 10) Join(condition: b.authorId=a.id) Scan(book) Scan(author) {code} The idea of this rule is to copy the Sort into the join (in this case into the left part), without the limit/offset attributes to not impact the final result: {code} -- SortJoinCopyRule applied Sort(b.title, limit: 10) Join(condition: b.authorId=a.id) Sort(b.title) Scan(book) Scan(author) {code} Now, let us say that we can combine Sort+Scan into an "ElasticScan", that allows us to scan a table already in a given order with a reasonably good performance. Also, the EnumerableLimitRule will split the original Sort (which contained a limit) into Limit+Sort: {code} -- ElasticScanRule + EnumerableLimitRule applied Limit(10) Sort(b.title) Join(condition: b.authorId=a.id) ElasticScan(table=book, sort=b.title) Scan(author) {code} And finally, the SortRemoveRule will detect that the sort input (i.e. the Join) is already sorted by b.title, so the Sort will be removed, leaving us with a final plan (much more optimized that the original one, where the Sort had to be made after the Join): {code} -- SortRemoveRule applied Limit(10) Join(condition: b.authorId=a.id) ElasticScan(table=book, sort=b.title) Scan(author) {code} > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839978#comment-16839978 ] Haisheng Yuan commented on CALCITE-2624: The PR is mixing physical property enforcement with logical plan exploration. The physical property should only be requested by physical operator by enforcing expected sort order. The patch is creating more useless plan alternatives than necessary. It only makes sense to push a top limit into the outer relation of an outer join, other than that, I don't think it is the right way to go. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2624) Add a rule to copy a sort below a join operator
[ https://issues.apache.org/jira/browse/CALCITE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824125#comment-16824125 ] Stamatis Zampetakis commented on CALCITE-2624: -- I changed a bit the summary and description of the issue to focus on joins. As Julian said, if it makes sense we can try to generalize the rule afterwards. > Add a rule to copy a sort below a join operator > --- > > Key: CALCITE-2624 > URL: https://issues.apache.org/jira/browse/CALCITE-2624 > Project: Calcite > Issue Type: Improvement > Components: core >Affects Versions: 1.17.0 >Reporter: Stamatis Zampetakis >Assignee: Khawla Mouhoubi >Priority: Major > > Currently, the only rule that allows a sort to traverse a binary operator is > the SortJoinTransposeRule. The rule was introduced mainly to push limits in > the case of left and right outer joins (see CALCITE-831). > I assume that the main reason that we don't have more rules is that sorts > with limits and offsets cannot be pushed safely below many types of join > operators. However, in many cases, it is possible and beneficial for > optimization purposes to just push the sort without the limit and offset. > Since we do not know in advance if the join operator preserves the order we > cannot remove (that is why I am saying copy and not transpose) the sort > operator on top of the join. The latter is not really a problem since the > SortRemoveRule can detect such cases and remove the sort if it is redundant. > A few concrete examples where this optimization makes sense are outlined > below: > * allow the sort to be later absorbed by an index scan and disappear from > the plan (Sort + Tablescan => IndexScan with RelCollation); > * allow operators that require sorted inputs to be exploited more easily > (e.g., merge join); > * allow the sort to be performed on a possibly smaller result (assuming that > the physical binary operator that is going to be used preserves the order of > left/right input and the top sort operator can be removed entirely). > I propose to add a new rule (e.g., SortCopyBelowJoinRule, > SortJoinCopyBelowRule) which allows a sort to be copied to the left or right > (or to both if it is rather easy to decompose the sort) of a join operator > (excluding the limit and offset attributes) if the respective inputs are not > already sorted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)