from:"Botong Huang \(Jira\)"

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2022-07-14 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567115#comment-17567115
 ] 

Botong Huang commented on CALCITE-4568:
---

Hi [~julianhyde], a gentle reminder for this thread. Thx!

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Comment Edited] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2022-01-19 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478613#comment-17478613
 ] 

Botong Huang edited comment on CALCITE-4568 at 1/19/22, 11:53 AM:
--

[~Leo Zhou] yes, it will probably take some time to digest and review 
considering the amount of code here. See 
https://github.com/hbtoo/calcite/commits/botong
[~julianhyde] please let us know if there's anything we can help. The first 
commit CALCITE-4737 is in already, thanks! 


was (Author: botong):
[~Leo Zhou] yes, it will probably take some time to digest and review 
considering the amount of code here. 
[~julianhyde] please let us know if there's anything we can help. The first 
commit CALCITE-4737 is in already, thanks! 

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2022-01-19 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478613#comment-17478613
 ] 

Botong Huang commented on CALCITE-4568:
---

[~Leo Zhou] yes, it will probably take some time to digest and review 
considering the amount of code here. 
[~julianhyde] please let us know if there's anything we can help. The first 
commit CALCITE-4737 is in already, thanks! 

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (CALCITE-4737) Add Volcano visualizer for debugging

2021-09-17 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417002#comment-17417002
 ] 

Botong Huang commented on CALCITE-4737:
---

[~zuozhiw] and [~thomas.rebele], any updates? Please let me know if there's 
anything I can help. Thanks!

> Add Volcano visualizer for debugging
> 
>
> Key: CALCITE-4737
> URL: https://issues.apache.org/jira/browse/CALCITE-4737
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Add Volcano visualizer for debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4737) Add Volcano visualizer for debugging

2021-08-14 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399232#comment-17399232
 ] 

Botong Huang commented on CALCITE-4737:
---

Sounds good. We will work with [~thomas.rebele] to come up with something. 

> Add Volcano visualizer for debugging
> 
>
> Key: CALCITE-4737
> URL: https://issues.apache.org/jira/browse/CALCITE-4737
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>
> Add Volcano visualizer for debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-4737) Add Volcano visualizer for debugging

2021-08-13 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398858#comment-17398858
 ] 

Botong Huang edited comment on CALCITE-4737 at 8/13/21, 6:43 PM:
-

Yes, the current visualizer provides full support for tvr, and it compiles with 
the following two commits. So do we want to wait for these two or do we add a 
stripped version of the visualizer now? 

[59342a654|https://github.com/hbtoo/calcite/commit/59342a6547b5357ce1a83335360a81f042b77507]
 2. Tempura core memo structure, rule engine, and interfaces. Newly added files 
only.
[0d310841d|https://github.com/hbtoo/calcite/commit/0d310841de431e1971ff7330bc3928364d06a362]
 1. Tempura core memo structure, rule engine, and interfaces. Modifications to 
existing files only.


was (Author: botong):
[link title|http://example.com]Yes, the current visualizer provides full 
support for tvr, and it compiles with the following two commits. So do we want 
to wait for these two or do we add a stripped version of the visualizer now? 

[59342a654|https://github.com/hbtoo/calcite/commit/59342a6547b5357ce1a83335360a81f042b77507]
 2. Tempura core memo structure, rule engine, and interfaces. Newly added files 
only.
[0d310841d|https://github.com/hbtoo/calcite/commit/0d310841de431e1971ff7330bc3928364d06a362]
 1. Tempura core memo structure, rule engine, and interfaces. Modifications to 
existing files only.

> Add Volcano visualizer for debugging
> 
>
> Key: CALCITE-4737
> URL: https://issues.apache.org/jira/browse/CALCITE-4737
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>
> Add Volcano visualizer for debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4737) Add Volcano visualizer for debugging

2021-08-13 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398858#comment-17398858
 ] 

Botong Huang commented on CALCITE-4737:
---

[link title|http://example.com]Yes, the current visualizer provides full 
support for tvr, and it compiles with the following two commits. So do we want 
to wait for these two or do we add a stripped version of the visualizer now? 

[59342a654|https://github.com/hbtoo/calcite/commit/59342a6547b5357ce1a83335360a81f042b77507]
 2. Tempura core memo structure, rule engine, and interfaces. Newly added files 
only.
[0d310841d|https://github.com/hbtoo/calcite/commit/0d310841de431e1971ff7330bc3928364d06a362]
 1. Tempura core memo structure, rule engine, and interfaces. Modifications to 
existing files only.

> Add Volcano visualizer for debugging
> 
>
> Key: CALCITE-4737
> URL: https://issues.apache.org/jira/browse/CALCITE-4737
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>
> Add Volcano visualizer for debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-08-11 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397799#comment-17397799
 ] 

Botong Huang commented on CALCITE-4568:
---

Sounds good and please take your time. Our team will be participating remotely 
:D

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-08-06 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395112#comment-17395112
 ] 

Botong Huang commented on CALCITE-4568:
---

(copied from dev email)

Hi all, 

Please find our rebased Tempura code on top of a fairly recent version of 
Calcite at:
https://github.com/hbtoo/calcite/tree/botong

There are six new commits in total:
739119282 Hack around the type issue to make tvr unit tests work, see 
CALCITE-4713
c84da7216 4. Full tempura integration: all TVR rules and complete optimization 
procedure. Newly added files only.
d3217f432 3. Full tempura integration: all TVR rules and complete optimization 
procedure. Modifications to existing files only.
59342a654 2. Tempura core memo structure, rule engine, and interfaces. Newly 
added files only.
0d310841d 1. Tempura core memo structure, rule engine, and interfaces. 
Modifications to existing files only.
c1240ca7b 0. Add volcano visualizer for debugging.

The first three (0, 1, 2) is a compilable version with extended core system 
support:
1. Memo extension with TvrMetaSet
2. Rule engine upgrade, capable of matching TvrMetaSet and RelNodes, as well as 
links in between the nodes.

The next two (3 and 4) is a full version with:
All changes in this feature will consist of four parts:
3. A provided set of TvrRules, written using the upgraded rule engine API.
4. TvrVolcanoPlanner that puts everything together end to end.
5. Multi-query optimization, used to find the best incremental plan involving 
multiple time points.

With up to 4, all existing CALCITE unit tests pass.


To demonstrate how Tempura works, we have added the following two example unit 
tests that can be run directly (with the last commit to hack around 
CALCITE-4713):

TvrOptimizationTest.java runs the Tempura optimizer. This program produces a 
progressive physical plan by the Tempura optimizer that runs across several 
time points. The physical plan is printed out to the console in DOT format, 
which can be viewed using an online graphviz tool.

TvrExecutionTest.java uses the Tempura optimizer in an end-to-end query. This 
program generates a progressive physical plan and then uses Calcite's built-in 
executor to run the plan. The output at each time point is printed to the 
console.

Everyone is welcome and encouraged to take a look and play with it. Let's take 
some time and figure out a plan on how to incorporate Tempura into Calcite that 
best suits everyone.  

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-05-17 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346265#comment-17346265
 ] 

Botong Huang commented on CALCITE-4568:
---

Hi Rui, the current version is runnable but we did not take care of all test 
cases. The code can be compiled with: 

mvn clean install -DskipTests -Dmaven.javadoc.skip=true 
-Dforbiddenapis.skip=true -Dcheckstyle.skip=true

Yes, we are rebasing it to a more recent version of gradle based master. 

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-05-14 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344955#comment-17344955
 ] 

Botong Huang edited comment on CALCITE-4568 at 5/15/21, 3:11 AM:
-

Please find the slides introducing Tempura here: 

[https://github.com/alibaba/cost-based-incremental-optimizer/blob/main/Tempura_Calcite_presentation.pdf]


was (Author: botong):
Please find the slides introducing Tempura here: 
[^Tempura_Calcite_presentation.pdf]

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-05-14 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344955#comment-17344955
 ] 

Botong Huang commented on CALCITE-4568:
---

Please find the slides introducing Tempura here: 
[^Tempura_Calcite_presentation.pdf]

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-4568) Tempura: extending Calcite into an incremental query optimizer

2021-04-07 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-4568:
--
Summary: Tempura: extending Calcite into an incremental query optimizer  
(was: Tempura: extending Calcite into a incremental query optimizer)

> Tempura: extending Calcite into an incremental query optimizer
> --
>
> Key: CALCITE-4568
> URL: https://issues.apache.org/jira/browse/CALCITE-4568
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Botong Huang
>Priority: Major
>
> As discussed in the email thread, this is an attempt to extend the Calcite 
> optimizer into a general incremental query optimizer, based on our research 
> paper published in VLDB 2021:
> Tempura: a general cost-based optimizer framework for incremental data 
> processing
> To our best knowledge, this is the first general cost-based incremental 
> optimizer that can find the best plan across multiple families of incremental 
> computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
> the paper) shows that the generated best plan is consistently much better 
> than the plans from each individual method alone.
> In general, incremental query planning is central to database view 
> maintenance and stream processing systems, and are being adopted in active 
> databases, resumable query execution, approximate query processing, etc. We 
> are hoping that this feature can help widening the spectrum of Calcite, 
> solicit more use cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-4568) Tempura: extending Calcite into a incremental query optimizer

2021-04-07 Thread Botong Huang (Jira)

Botong Huang created CALCITE-4568:
-

 Summary: Tempura: extending Calcite into a incremental query 
optimizer
 Key: CALCITE-4568
 URL: https://issues.apache.org/jira/browse/CALCITE-4568
 Project: Calcite
  Issue Type: New Feature
Reporter: Botong Huang


As discussed in the email thread, this is an attempt to extend the Calcite 
optimizer into a general incremental query optimizer, based on our research 
paper published in VLDB 2021:
Tempura: a general cost-based optimizer framework for incremental data 
processing

To our best knowledge, this is the first general cost-based incremental 
optimizer that can find the best plan across multiple families of incremental 
computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in 
the paper) shows that the generated best plan is consistently much better than 
the plans from each individual method alone.

In general, incremental query planning is central to database view maintenance 
and stream processing systems, and are being adopted in active databases, 
resumable query execution, approximate query processing, etc. We are hoping 
that this feature can help widening the spectrum of Calcite, solicit more use 
cases and adoption of Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-03-01 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293313#comment-17293313
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi [~danny0405], openning a new Jira for it might be an overkill. Can you fix 
the typo for us and push? 

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-03-01 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293313#comment-17293313
 ] 

Botong Huang edited comment on CALCITE-4514 at 3/2/21, 3:12 AM:


Hi [~danny0405], openning a new Jira for it might be an overkill. Can you fix 
the typo for us and push? Thx! 


was (Author: botong):
Hi [~danny0405], openning a new Jira for it might be an overkill. Can you fix 
the typo for us and push? 

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-28 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292624#comment-17292624
 ] 

Botong Huang commented on CALCITE-4514:
---

Looks good to me, thanks!

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-28 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292537#comment-17292537
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi [~julianhyde], I just noticed some minor issues. For instance in 
_isSmaller()_, when _set1.parents.size() > set2.parents.size()_, we should 
return false rather than fall back to relset size and id. 

We need this additional patch: 
[https://github.com/apache/calcite/compare/master...hbtoo:CALCITE-4514]

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-28 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292465#comment-17292465
 ] 

Botong Huang commented on CALCITE-4514:
---

Sounds good to me, thx!

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-27 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292302#comment-17292302
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi [~julianhyde], thanks! Your patch is almost the same as mine. I've 
simplified and integrated it in the latest patch. 

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-27 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292270#comment-17292270
 ] 

Botong Huang commented on CALCITE-4514:
---

Oh I see, updated patch back to the compute from scratch. 

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-4514) [CALCITE-4514] Fine tune the merge order of two RelSets

2021-02-27 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-4514:
--
Summary: [CALCITE-4514] Fine tune the merge order of two RelSets  (was: 
Fine tune the merge order of two RelSets, cache RelSet's childSet computation)

> [CALCITE-4514] Fine tune the merge order of two RelSets
> ---
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) Fine tune the merge order of two RelSets, cache RelSet's childSet computation

2021-02-27 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292265#comment-17292265
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi [~vladimirsitnikov]
{quote}Is it really needed to compute the set?
{quote}
Yes I believe so, detecting the parent child relset relationship is an 
optimization to disable useless cycles in the memo. It is added in 
CALCITE-3819, adding [~hyuan] to verify just in case.
{quote}caching the Set<...> looks like to be "always faster", however, it might 
result in an unintentional memory leak.
{quote}
I did a quick test over some heavy queries. With the patch, there is no perf 
downgrade, at the same time the improvement is barely un-noticeable as well. 
This is expected because the patch does not change the behavior for most merge 
cases. Again, we did find noticeable perf difference in the query that hits the 
special case, which is resolved by this patch.

That said, I agree that adding the cache comes at the cost of possibly pinning 
some stale RelSets in memory. I don't have a strong preference between cache vs 
compute from scratch every time. [~julianhyde] What do you think?

> Fine tune the merge order of two RelSets, cache RelSet's childSet computation
> -
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) Fine tune the merge order of two RelSets, cache RelSet's childSet computation

2021-02-27 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292245#comment-17292245
 ] 

Botong Huang commented on CALCITE-4514:
---

getChildSets is now changed to compute and maintain on demand. It is 
essentially a cache and for most RelSets it will be null. It is at least better 
than compute from scratch every time it is needed. I am not sure if there is a 
better way. 

Size of RelSet is already there RelSet.rels, we do not add anything tracking in 
this patch. 

Performance side, this Jira is originally intended to fix and improve 
performance in a special case (when two relsets are parent sets of each other), 
which the added ut covered. For this special case, the performance improvement 
is straightforward. But for the overall perf, I am not sure if there is an easy 
way for a ut to quantify the improvement, without a comprehensive benchmark of 
queries. 


> Fine tune the merge order of two RelSets, cache RelSet's childSet computation
> -
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) Prefer merge new relset into old relset when they are parent set of each other

2021-02-26 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292007#comment-17292007
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi Julian, thanks for the comments. I believe I've addressed all of them. 
Please take an another look. 

> Prefer merge new relset into old relset when they are parent set of each other
> --
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-4514) Fine tune the merge order of two RelSets, cache RelSet's childSet computation

2021-02-26 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-4514:
--
Summary: Fine tune the merge order of two RelSets, cache RelSet's childSet 
computation  (was: Prefer merge new relset into old relset when they are parent 
set of each other)

> Fine tune the merge order of two RelSets, cache RelSet's childSet computation
> -
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4514) Prefer merge new relset into old relset when they are parent set of each other

2021-02-26 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291869#comment-17291869
 ] 

Botong Huang commented on CALCITE-4514:
---

Hi Vladimir, yes, you are right. Both preferences are already there. Except 
that in the case where both sets are parent of each other, the second 
preference is not considered. 

> Prefer merge new relset into old relset when they are parent set of each other
> --
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent set of each other, we randomly 
> pick a merge order without checking the second condition above. For 
> performance reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-4514) Prefer merge new relset into old relset when they are parent set of each other

2021-02-25 Thread Botong Huang (Jira)

Botong Huang created CALCITE-4514:
-

 Summary: Prefer merge new relset into old relset when they are 
parent set of each other
 Key: CALCITE-4514
 URL: https://issues.apache.org/jira/browse/CALCITE-4514
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


When merging two relsets, we have two preferences: 
1. Merge parent relset into child relset
2. Merge newer relset into older relset
Currently, when the two relsets are parent set of each other, we randomly pick 
a merge order without checking the second condition above. For performance 
reasons, we should, to avoid unnecessary churn. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4302) Improve cost propagation in volcano to avoid re-propagation

2020-11-02 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225047#comment-17225047
 ] 

Botong Huang commented on CALCITE-4302:
---

Yes I agree. We can handle null in Volcano if needed by considering null as 
infinite cost. But I do think CALCITE-4372 is a better way. 

> Improve cost propagation in volcano to avoid re-propagation
> ---
>
> Key: CALCITE-4302
> URL: https://issues.apache.org/jira/browse/CALCITE-4302
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. 
> However, there is still room for improvement. A subset can be updated more 
> than once in a cost propagation process. For instance, A -> D, A -> B -> C -> 
> D. When subset A has an update, using BFS subset D (and thus all subsets 
> above/after D) can be updated twice, first via A -> D and then C -> D. We can 
> further improve the BFS by always popping the relNode with the smallest cost 
> from the queue, similar to the Dijkstra algorithm. So that whenever a relNode 
> is popped from the queue, its current best cannot be further deceased any 
> more. As a result, all subsets will only be propagated at most once. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4302) Improve cost propagation in volcano to avoid re-propagation

2020-11-02 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17224996#comment-17224996
 ] 

Botong Huang commented on CALCITE-4302:
---

Hi [~vladimirsitnikov], thanks for letting us know, this deserves a bit more 
investigation. 

https://github.com/apache/calcite/commit/c7fdae22fb0e6b220152f05a7343ce5569283a83#diff-23f79975c4d8d77b7a09299831fe47c69c4f65fe8639bc4ceb956321a33da634L415

As you can see, before this patch, the code didn't check for null as well. 

I did a quick search, everywhere else using getCost(parent, mq) doesn't seem to 
consider the possibiliity that it can return null. Here the RelNode parent 
should not be a RelSubset, so the only possibility is that 
mq.getNonCumulativeCost(rel) inside getCost(parent, mq) returned null. 
Furthermore, although getNonCumulativeCost's comment says it can return null, 
everywhere using getNonCumulativeCost also assumed that it won't return null.

To sum up, it doen't seem like a new issue, can you double check where does the 
null comes from? If it is always there, it should have triggered a NPE earlier 
already. 

> Improve cost propagation in volcano to avoid re-propagation
> ---
>
> Key: CALCITE-4302
> URL: https://issues.apache.org/jira/browse/CALCITE-4302
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. 
> However, there is still room for improvement. A subset can be updated more 
> than once in a cost propagation process. For instance, A -> D, A -> B -> C -> 
> D. When subset A has an update, using BFS subset D (and thus all subsets 
> above/after D) can be updated twice, first via A -> D and then C -> D. We can 
> further improve the BFS by always popping the relNode with the smallest cost 
> from the queue, similar to the Dijkstra algorithm. So that whenever a relNode 
> is popped from the queue, its current best cannot be further deceased any 
> more. As a result, all subsets will only be propagated at most once. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-4302) Improve cost propagation in volcano to avoid re-propagation

2020-10-02 Thread Botong Huang (Jira)

Botong Huang created CALCITE-4302:
-

 Summary: Improve cost propagation in volcano to avoid 
re-propagation
 Key: CALCITE-4302
 URL: https://issues.apache.org/jira/browse/CALCITE-4302
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. However, 
there is still room for improvement. A subset can be updated more than once in 
a cost propagation process. For instance, A -> D, A -> B -> C -> D. When subset 
A has an update, using BFS subset D (and thus all subsets above/after D) can be 
updated twice, first via A -> D and then C -> D. We can further improve the BFS 
by always popping the relNode with the smallest cost from the queue, similar to 
the Dijkstra algorithm. So that whenever a relNode is popped from the queue, 
its current best cannot be further deceased any more. As a result, all subsets 
will only be propagated at most once. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3991) The required boolean should always be provided in RelSet.getOrCreateSubset()

2020-06-03 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3991:
--
Summary: The required boolean should always be provided in 
RelSet.getOrCreateSubset()  (was: the required boolean should always be 
provided in RelSet.getOrCreateSubset())

> The required boolean should always be provided in RelSet.getOrCreateSubset()
> 
>
> Key: CALCITE-3991
> URL: https://issues.apache.org/jira/browse/CALCITE-3991
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> the required boolean should always be provided in RelSet.getOrCreateSubset(). 
> Deleting the old default as well as other related code cleanup



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-06-03 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125322#comment-17125322
 ] 

Botong Huang commented on CALCITE-3981:
---

Thanks [~julianhyde] and [~hyuan] for the review!

> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a relSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the registering subset itself, register should always 
> use canonize() to find and return the equivalent active subset instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4029) ProjectRemoveRule auto pruning may prevent rules from running if mixed conventions are used in a logical plan

2020-06-02 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124143#comment-17124143
 ] 

Botong Huang commented on CALCITE-4029:
---

Exactly, closing this Jira. 

> ProjectRemoveRule auto pruning may prevent rules from running if mixed 
> conventions are used in a logical plan 
> --
>
> Key: CALCITE-4029
> URL: https://issues.apache.org/jira/browse/CALCITE-4029
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.23.0
>Reporter: Anton Haidai
>Priority: Minor
>
> Preconditions to reproduce the issue:
>  # Logical plan has mixed conventions (for example, a bottom node is a 
> TableScan in a final convention while other nodes are regular logical nodes 
> with NONE convention).
>  # There is a rule that expects a logical node with an input (like a rule 
> matching "operand(LogicalSort.class, operand(RelNode.class, any()))")
>  # A project over the scan is trivial (like SELECT * FROM ...)
> The issue is related to https://issues.apache.org/jira/browse/CALCITE-3939, 
> please see comments for a detailed debugging of a real-life reproducing case.
> h4. Example:
> Logical plan with a leaf nodes in a custom convention:
> {code:java}
> LogicalSort[NONE]
>  LogicalProject[NONE]
>   CustomScan[CUSTOM_CONVENTION]{code}
> A rule configured (RuleX) matches "operand(LogicalSort.class, 
> operand(RelNode.class, any()))".
> *Without ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial an merges it into a 
> single RelSet with CustomScan. 
> RuleX can run on top of this change as far as LogicalProject has a logical 
> node (LogicalProject in RelSubset[NONE]) as an input.
>  
> *With ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial but removes it with 
> it's RelSet so the CustomScan is the only node in it's RelSet, 
> RelSubset[CUSTOM_CONVENTION].
> RuleX can't run on top of this change as far as LogicalProject has an empty 
> input RelSubset[NONE] of the RelSet with the CustomScan.
> h2. Possible workarounds
>  # Disable ProjectRemoveRule auto pruning.
>  # Use only logical nodes in a logical plan, for the example above: use 
> LogicalScan - >  CustomScanRule - > CustomScan instead of direct use of 
> CustomScan.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-4029) ProjectRemoveRule auto pruning may prevent rules from running if mixed conventions are used in a logical plan

2020-05-31 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120768#comment-17120768
 ] 

Botong Huang commented on CALCITE-4029:
---

This hack I proposed indeed won't work. 
{quote}
If you really want it to fire, consider changing the child operand in RuleX to 
match any convention. (btw, when you write operand(LogicalSort.class, 
operand(RelNode.class, any())) earlier, this should already match the 
customScan with a custom convention right?
{quote}

But again, I insist on this: logical project/sort having a non logical 
CustomScan as input is a weird design. This is the fundamental reason you are 
seeing this issue. 

> ProjectRemoveRule auto pruning may prevent rules from running if mixed 
> conventions are used in a logical plan 
> --
>
> Key: CALCITE-4029
> URL: https://issues.apache.org/jira/browse/CALCITE-4029
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.23.0
>Reporter: Anton Haidai
>Priority: Minor
>
> Preconditions to reproduce the issue:
>  # Logical plan has mixed conventions (for example, a bottom node is a 
> TableScan in a final convention while other nodes are regular logical nodes 
> with NONE convention).
>  # There is a rule that expects a logical node with an input (like a rule 
> matching "operand(LogicalSort.class, operand(RelNode.class, any()))")
>  # A project over the scan is trivial (like SELECT * FROM ...)
> The issue is related to https://issues.apache.org/jira/browse/CALCITE-3939, 
> please see comments for a detailed debugging of a real-life reproducing case.
> h4. Example:
> Logical plan with a leaf nodes in a custom convention:
> {code:java}
> LogicalSort[NONE]
>  LogicalProject[NONE]
>   CustomScan[CUSTOM_CONVENTION]{code}
> A rule configured (RuleX) matches "operand(LogicalSort.class, 
> operand(RelNode.class, any()))".
> *Without ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial an merges it into a 
> single RelSet with CustomScan. 
> RuleX can run on top of this change as far as LogicalProject has a logical 
> node (LogicalProject in RelSubset[NONE]) as an input.
>  
> *With ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial but removes it with 
> it's RelSet so the CustomScan is the only node in it's RelSet, 
> RelSubset[CUSTOM_CONVENTION].
> RuleX can't run on top of this change as far as LogicalProject has an empty 
> input RelSubset[NONE] of the RelSet with the CustomScan.
> h2. Possible workarounds
>  # Disable ProjectRemoveRule auto pruning.
>  # Use only logical nodes in a logical plan, for the example above: use 
> LogicalScan - >  CustomScanRule - > CustomScan instead of direct use of 
> CustomScan.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-05-29 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119765#comment-17119765
 ] 

Botong Huang commented on CALCITE-3939:
---

[~anha] Understood (we are on the same page for Step 2). But still I don't 
think this is a bug.


1. Both before and after the RelSet merge, if there's any other logical relNode 
in the same RelSet as the trivial project. It will get a match and the RuleX 
will fire on it.

2. Logical project/sort having a non logical CustomScan input is a weird 
design. This is the fundamental reason you are seeing this issue. If you really 
want it to fire, consider changing the child operand in RuleX to match any 
convention. (btw, when you write operand(LogicalSort.class, 
operand(RelNode.class, any())) earlier, this should already match the 
customScan with a custom convention right?)

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-05-28 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118941#comment-17118941
 ] 

Botong Huang commented on CALCITE-3939:
---

Thanks [~anha] for reporting the issue. However, as [~hyuan] mentioned, 
something is missing in your scenario. 

Let's say the LogicalSort, TrivialProject, CustomScan are in RelSet 1, 2, 3 
respectively. Here's what it should have happened. 
 # RuleX matches LogicalSort and TrivialProject (RuleX.matches called with 
). This RuleMatch1 gets queued in RuleQueue. 
 # ProjectRemoveRule fires, it marks TrivialProject as stale, triggers RelSet 2 
and 3 to merge, say into RelSet 2. Now both TrivialProject and CustomScan are 
in RelSet 2. 
 # The planner.fireRules in RelSet.mergeWith will trigger a new round of 
RuleMatch for all relNodes in RelSet 2. Specifically RuleX should have a new 
match with LogicalSort and CustomScan (RuleX.matches called with ). This RuleMatch2 is also queued in RuleQueue.  
 # RuleMatch1 pops from ruleQueue, and since TrivialProject is stale, this 
match is skipped and onMatch not called. 
 # RuleMatch2 pops from ruleQueue, and RuleX.onMatch called with . 

Please let us know what went wrong here. 

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3991) the required boolean should always be provided in RelSet.getOrCreateSubset()

2020-05-11 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3991:
-

 Summary: the required boolean should always be provided in 
RelSet.getOrCreateSubset()
 Key: CALCITE-3991
 URL: https://issues.apache.org/jira/browse/CALCITE-3991
 Project: Calcite
  Issue Type: Bug
Reporter: Botong Huang


the required boolean should always be provided in RelSet.getOrCreateSubset(). 
Deleting the old default as well as other related code cleanup



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3961) VolcanoPlanner.prunedNodes information is lost when duplicate relNode is discarded

2020-05-09 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103628#comment-17103628
 ] 

Botong Huang commented on CALCITE-3961:
---

Thanks [~hyuan] for the review!

> VolcanoPlanner.prunedNodes information is lost when duplicate relNode is 
> discarded
> --
>
> Key: CALCITE-3961
> URL: https://issues.apache.org/jira/browse/CALCITE-3961
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> VolcanoPlanner.prunedNodes stores the list of relNodes that are marked 
> useless. Whenever the planner see two identical relNode (e.g. when Relsets 
> are merged), one of them are discarded. However, when the preserved node is 
> not in the pruned list while the discarded one is, this pruned information is 
> lost. In general, we should preserve this info whenever duplicate relNodes 
> are discarded. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3981:
--
Description: 
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset (see AbstractRelNode.onRegister()). This 
is a leak because once a relSet/subset is merged into others and becomes stale, 
it should not be used to connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the registering subset itself, register should always use 
canonize() to find the equivalent active subset and return it instead.

  was:
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset (see AbstractRelNode.onRegister()). This 
is a leak because once a relSet/subset is merged into others and becomes stale, 
it should not be used to connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.


> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a relSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the registering subset itself, register should always 
> use canonize() to find the equivalent active subset and return it instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3981:
--
Description: 
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset (see AbstractRelNode.onRegister()). This 
is a leak because once a relSet/subset is merged into others and becomes stale, 
it should not be used to connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the registering subset itself, register should always use 
canonize() to find and return the equivalent active subset instead.

  was:
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset (see AbstractRelNode.onRegister()). This 
is a leak because once a relSet/subset is merged into others and becomes stale, 
it should not be used to connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the registering subset itself, register should always use 
canonize() to find the equivalent active subset and return it instead.


> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a relSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the registering subset itself, register should always 
> use canonize() to find and return the equivalent active subset instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3981:
--
Description: 
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset (see AbstractRelNode.onRegister()). This 
is a leak because once a relSet/subset is merged into others and becomes stale, 
it should not be used to connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.

  was:
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. This is a leak because once a 
relSet/subset is merged into others and becomes stale, it should not be used to 
connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.


> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a relSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the subset as it, register should always use canonize() 
> to find the equivalent active subset and return it instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3981:
--
Description: 
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in ensureRegistered() and registerSubset() 
"merge(set, subset.set)"). As a result, a stale/merged subset might be returned 
from registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. This is a leak because once a 
relSet/subset is merged into others and becomes stale, it should not be used to 
connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.

  was:
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in registerSubset() "merge(set, 
subset.set)"). As a result, a stale/merged subset might be returned from 
registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. This is a leak because once a 
relSet/subset is merged into others and becomes stale, it should not be used to 
connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.


> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset. This is a leak 
> because once a relSet/subset is merged into others and becomes stale, it 
> should not be used to connect new relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the subset as it, register should always use canonize() 
> to find the equivalent active subset and return it instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3981:
--
Description: 
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in registerSubset() "merge(set, 
subset.set)"). As a result, a stale/merged subset might be returned from 
registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. This is a leak because once a 
relSet/subset is merged into others and becomes stale, it should not be used to 
connect new relNodes. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.

  was:
When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in registerSubset() "merge(set, 
subset.set)"). As a result, a stale/merged subset might be returned from 
registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.


> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply returns the subset itself. The problem is that subset can become stale 
> when relSets get merged (for example in registerSubset() "merge(set, 
> subset.set)"). As a result, a stale/merged subset might be returned from 
> registerImpl, and the newly registering subtree might get registered 
> recursively on top of the stale subset. This is a leak because once a 
> relSet/subset is merged into others and becomes stale, it should not be used 
> to connect new relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leak: (1) An active subset gets matched, the 
> RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to 
> relSet merge. (3) The rule match in (1) is popped from queue and fired. (4) 
> In OnMatch the rule gets the stale subset, builds new rels on top of it and 
> regsiter the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the subset as it, register should always use canonize() 
> to find the equivalent active subset and return it instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-08 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3981:
-

 Summary: Volcano.register should not return stale/merged subset
 Key: CALCITE-3981
 URL: https://issues.apache.org/jira/browse/CALCITE-3981
 Project: Calcite
  Issue Type: Bug
Reporter: Botong Huang


When a subset is registered, registerImpl() and registerSubset() currently 
simply returns the subset itself. The problem is that subset can become stale 
when relSets get merged (for example in registerSubset() "merge(set, 
subset.set)"). As a result, a stale/merged subset might be returned from 
registerImpl, and the newly registering subtree might get registered 
recursively on top of the stale subset. 

With CALCITE-3755, subsets can now be directly matched by rules. This opens 
another source of stale subset leak: (1) An active subset gets matched, the 
RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to relSet 
merge. (3) The rule match in (1) is popped from queue and fired. (4) In OnMatch 
the rule gets the stale subset, builds new rels on top of it and regsiter the 
new rels. In this case, the entire new rel subtree will be registered on top of 
the stale subset as is.

Rather than returning the subset as it, register should always use canonize() 
to find the equivalent active subset and return it instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3947) AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match order is deterministic across runs

2020-04-27 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093848#comment-17093848
 ] 

Botong Huang commented on CALCITE-3947:
---

Thanks [~julianhyde] for the detailed comment! Let me elaborate a bit and see 
if it makes sense to you. 

Firstly, the point of this patch is to make sure the behavior of the planner is 
identical across runs for easy debugging. Existing code did try to fulfill this 
purpose, that's why lots of LinkedHashSet/Map are used in the planner already. 
I simply found a leak in AbstractRelOptPlanner.classes, now a HashSet, whose 
iteration order does impact the order rules are matched and fired (see 
description for details), and thus making the behavior underterministic. 

Secondly, yes keeping it sorted, say with TreeSet, serves the purpose as well. 
Considering TreeSet is more costly than LinkedHashSet, and that currently 
there's no additional benefit of keeping it sorted for now, I think 
LinkedHashSet is a better choice.  

> AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match 
> order is deterministic across runs
> ---
>
> Key: CALCITE-3947
> URL: https://issues.apache.org/jira/browse/CALCITE-3947
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> AbstractRelOptPlanner.classes is used by subClasses() to determine things to 
> put into VolcanoPlanner.classOperands, which is then used in 
> VolcanoPlanner.fireRules(). Since AbstractRelOptPlanner.classes is now a 
> HashSet, its iteration order is not deterministic across runs, making 
> debugging hard. It should be LinkedHashSet just like many other fields in the 
> planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3961) VolcanoPlanner.prunedNodes information is lost when duplicate relNode is discarded

2020-04-26 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3961:
-

 Summary: VolcanoPlanner.prunedNodes information is lost when 
duplicate relNode is discarded
 Key: CALCITE-3961
 URL: https://issues.apache.org/jira/browse/CALCITE-3961
 Project: Calcite
  Issue Type: Bug
Reporter: Botong Huang


VolcanoPlanner.prunedNodes stores the list of relNodes that are marked useless. 
Whenever the planner see two identical relNode (e.g. when Relsets are merged), 
one of them are discarded. However, when the preserved node is not in the 
pruned list while the discarded one is, this pruned information is lost. In 
general, we should preserve this info whenever duplicate relNodes are 
discarded. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3948) Improve operand's RelSubset matching handling in VolcanoRuleCall

2020-04-22 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090103#comment-17090103
 ] 

Botong Huang commented on CALCITE-3948:
---

Thanks [~hyuan] for the review!

> Improve operand's RelSubset matching handling in VolcanoRuleCall
> 
>
> Key: CALCITE-3948
> URL: https://issues.apache.org/jira/browse/CALCITE-3948
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> For operands matching for a RelSubset, more handling under various cases are 
> needed to be consistent in VolcanoRuleCall



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3947) AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match order is deterministic across runs

2020-04-21 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089093#comment-17089093
 ] 

Botong Huang commented on CALCITE-3947:
---

Thanks [~hyuan] and [~julianhyde] for the quick review!

> AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match 
> order is deterministic across runs
> ---
>
> Key: CALCITE-3947
> URL: https://issues.apache.org/jira/browse/CALCITE-3947
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
> Fix For: 1.23.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> AbstractRelOptPlanner.classes is used by subClasses() to determine things to 
> put into VolcanoPlanner.classOperands, which is then used in 
> VolcanoPlanner.fireRules(). Since AbstractRelOptPlanner.classes is now a 
> HashSet, its iteration order is not deterministic across runs, making 
> debugging hard. It should be LinkedHashSet just like many other fields in the 
> planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3947) AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match order is deterministic across runs

2020-04-21 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089083#comment-17089083
 ] 

Botong Huang commented on CALCITE-3947:
---

Yes it will. Since the iteration order changes across runs, the order of rule 
matched and thus fired are different. 

> AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match 
> order is deterministic across runs
> ---
>
> Key: CALCITE-3947
> URL: https://issues.apache.org/jira/browse/CALCITE-3947
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> AbstractRelOptPlanner.classes is used by subClasses() to determine things to 
> put into VolcanoPlanner.classOperands, which is then used in 
> VolcanoPlanner.fireRules(). Since AbstractRelOptPlanner.classes is now a 
> HashSet, its iteration order is not deterministic across runs, making 
> debugging hard. It should be LinkedHashSet just like many other fields in the 
> planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3948) Improve operand's RelSubset matching handling in VolcanoRuleCall

2020-04-21 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3948:
-

 Summary: Improve operand's RelSubset matching handling in 
VolcanoRuleCall
 Key: CALCITE-3948
 URL: https://issues.apache.org/jira/browse/CALCITE-3948
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


For operands matching for a RelSubset, more handling under various cases are 
needed to be consistent in VolcanoRuleCall



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3947) AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match order is deterministic across runs

2020-04-21 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3947:
-

 Summary: AbstractRelOptPlanner.classes should be LinkedHashSet so 
that rule match order is deterministic across runs
 Key: CALCITE-3947
 URL: https://issues.apache.org/jira/browse/CALCITE-3947
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


AbstractRelOptPlanner.classes is used by subClasses() to determine things to 
put into VolcanoPlanner.classOperands, which is then used in 
VolcanoPlanner.fireRules(). Since AbstractRelOptPlanner.classes is now a 
HashSet, its iteration order is not deterministic across runs, making debugging 
hard. It should be LinkedHashSet just like many other fields in the planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3927) RelSubset is not fired for rule when set gets merged

2020-04-21 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088982#comment-17088982
 ] 

Botong Huang commented on CALCITE-3927:
---

Thanks [~hyuan] and everyone for the review and comments!

> RelSubset is not fired for rule when set gets merged
> 
>
> Key: CALCITE-3927
> URL: https://issues.apache.org/jira/browse/CALCITE-3927
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.23.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In VolcanoPlanner, when set gets merged, planner fires rules again for 
> RelNodes in both sets, but not for RelSubset. We might miss something because 
> of this. 
> If all the logical transformation rules and physical implementation rules are 
> separated out in different stage and physical rules don't do logical work, we 
> might be OK. But the reality is that all the things are mixed together at the 
> moment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-3927) RelSubset is not fired for rule when set gets merged

2020-04-21 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088895#comment-17088895
 ] 

Botong Huang edited comment on CALCITE-3927 at 4/21/20, 5:35 PM:
-

The unit test gives an example of what can happen in real case, where 
PhysSingleSubsetRule is a physical rule that has an operand matching for 
RelSubset. When two RelSets having none identical physical traits get merged 
(e.g. PHYS_CALLING_CONVENTION3 that is in the new RelSet but not in the 
original RelSet), the RelSubset with the new physical trait that gets merged in 
will never trigger the rule match again, but it should. 

In reality, the two physical trait PHYS_CALLING_CONVENTION and 
PHYS_CALLING_CONVENTION3 with the satisfy relationship can be something like 
"any" and "hash[0]"


was (Author: botong):
The unit test gives an example of what can happen in real case, where 
PhysSingleSubsetRule is a physical rule that has an operand matching for 
RelSubset. When two RelSets having none identical physical traits get merged 
(e.g. PHYS_CALLING_CONVENTION3 that is in the new RelSet but not in the 
original RelSet), the RelSubset with the new physical trait that gets merged in 
will never trigger the rule match again, but it should. 

> RelSubset is not fired for rule when set gets merged
> 
>
> Key: CALCITE-3927
> URL: https://issues.apache.org/jira/browse/CALCITE-3927
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In VolcanoPlanner, when set gets merged, planner fires rules again for 
> RelNodes in both sets, but not for RelSubset. We might miss something because 
> of this. 
> If all the logical transformation rules and physical implementation rules are 
> separated out in different stage and physical rules don't do logical work, we 
> might be OK. But the reality is that all the things are mixed together at the 
> moment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3927) RelSubset is not fired for rule when set gets merged

2020-04-21 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088895#comment-17088895
 ] 

Botong Huang commented on CALCITE-3927:
---

The unit test gives an example of what can happen in real case, where 
PhysSingleSubsetRule is a physical rule that has an operand matching for 
RelSubset. When two RelSets having none identical physical traits get merged 
(e.g. PHYS_CALLING_CONVENTION3 that is in the new RelSet but not in the 
original RelSet), the RelSubset with the new physical trait that gets merged in 
will never trigger the rule match again, but it should. 

> RelSubset is not fired for rule when set gets merged
> 
>
> Key: CALCITE-3927
> URL: https://issues.apache.org/jira/browse/CALCITE-3927
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In VolcanoPlanner, when set gets merged, planner fires rules again for 
> RelNodes in both sets, but not for RelSubset. We might miss something because 
> of this. 
> If all the logical transformation rules and physical implementation rules are 
> separated out in different stage and physical rules don't do logical work, we 
> might be OK. But the reality is that all the things are mixed together at the 
> moment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-04-20 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087987#comment-17087987
 ] 

Botong Huang commented on CALCITE-3939:
---

Title changed, please let me know if it is clearer, thx!

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-04-20 Thread Botong Huang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3939:
--
Summary: Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
SubstitutionRule  (was: more auto pruning rules after SubstitutionRule is 
introduced)

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3939) more auto pruning rules after SubstitutionRule is introduced

2020-04-19 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3939:
-

 Summary: more auto pruning rules after SubstitutionRule is 
introduced
 Key: CALCITE-3939
 URL: https://issues.apache.org/jira/browse/CALCITE-3939
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a RelNode. 
They can also become SubstitutionRule with autoprune enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3927) RelSubset is not fired for rule when set gets merged

2020-04-16 Thread Botong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085120#comment-17085120
 ] 

Botong Huang commented on CALCITE-3927:
---

Uploaded a patch with test case, please take a look. 

> RelSubset is not fired for rule when set gets merged
> 
>
> Key: CALCITE-3927
> URL: https://issues.apache.org/jira/browse/CALCITE-3927
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In VolcanoPlanner, when set gets merged, planner fires rules again for 
> RelNodes in both sets, but not for RelSubset. We might miss something because 
> of this. 
> If all the logical transformation rules and physical implementation rules are 
> separated out in different stage and physical rules don't do logical work, we 
> might be OK. But the reality is that all the things are mixed together at the 
> moment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CALCITE-3227) IndexOutOfBound when checking candidate parent match's input ordinal in VolcanoRuleCall

2019-08-03 Thread Botong Huang (JIRA)

Botong Huang created CALCITE-3227:
-

 Summary: IndexOutOfBound when checking candidate parent match's 
input ordinal in VolcanoRuleCall
 Key: CALCITE-3227
 URL: https://issues.apache.org/jira/browse/CALCITE-3227
 Project: Calcite
  Issue Type: Bug
Reporter: Botong Huang


In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the candidate parent 
relNode indeed has the previously matched relNode as a child with the right 
ordinal. However, some candidate parent can have less number of inputs than the 
parent operand, and thus we hit IndexOutOfBound when trying to grab the correct 
child for checking. 

In the added unit test that repro the bug, we have a union with two inputs of 
class PhysLeafRel. The rule however, matches a union with three inputs, with 
the third child operand matching for PhysLeafRel.class. When a child relNode 
gets matched to the third child operand, we go up trying to see whether the 
union relNode can match the parent. Trying to access the union's third input 
hits the IndexOutOfBound error because the union only has two inputs. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Updated] (CALCITE-3118) VolcanoRuleCall should look at RelSubset rather than RelSet when checking child ordinal of a parent operand

2019-07-21 Thread Botong Huang (JIRA)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3118:
--
Summary: VolcanoRuleCall should look at RelSubset rather than RelSet when 
checking child ordinal of a parent operand  (was: Properly check child ordinal 
when matching parent operand in VolcanoRuleCall)

> VolcanoRuleCall should look at RelSubset rather than RelSet when checking 
> child ordinal of a parent operand
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched RelNode as a child *with the* *expected 
> child ordinal*.
> However, there is a bug in this child ordinal check:
> {noformat}
> 333if (ascending && operand.childPolicy != 
> RelOptRuleOperandChildPolicy.UNORDERED) {
> 334  // We know that the previous operand was *a* child of its parent,
> 335  // but now check that it is the *correct* child.
> 336  final RelSubset input =
> 337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
> 338  List inputRels = input.set.getRelsFromAllSubsets();
> 339  if (!inputRels.contains(previous)) {
> 340continue;
> 341  }
> 342}
> {noformat}
> We intend to make sure that "previous" is in Subset "input". However line 338 
> looked at RelNodes from the entire RelSet, rather than RelNodes only in 
> Subset "input". As a result, in some cases, incorrect parent is not skipped 
> as expected and is matched incorrectly.
> The unit test case included is a case that triggers this bug, where a second 
> child RelNode incorrectly get matched as the first child of the parent 
> RelNode.
> --
>  Here's a detailed explanation of the test case that triggers the bug
> We constructed a RelNode structure:
> {noformat}
>  PhysBiRel
>   / \
>   Subset1   Subset2
> |  |
> leftPhyrightPhy
> {noformat}
> Where PhysBiRel has two inputs: leftPhy and rightPhy, both are logically 
> equivalent, but with different traits (Convention in this test case), and 
> thus are in different subsets. 
>  (Two Rels in two subsets in the same RelSet is a necessary condition to 
> trigger this bug. )
> A rule AssertOperandsDifferentRule is constructed as follows:
> {noformat}
> operand(PhysBiRel.class,
> operand(PhysLeafRel.class, any()),
> operand(PhysLeafRel.class, any()))
> {noformat}
> Obviously the correct match would be:
> {noformat}
>  PhysBiRel
>   / \
> leftPhyrightPhy
> {noformat}
> However, with the bug, another match is also returned:
> {noformat}
>  PhysBiRel
>   / \
> rightPhyrightPhy
> {noformat}
> *This is wrong because rightPhy is not PhysBiRel's first input at all, and 
> should not match as parent operand's first child.*
> 
>  Here's how the incorrect match occurs. 
>  1. When rightPhy of class PhysLeafRel is registered, we attempt to match it 
> to both the left and right child operands of AssertOperandsDifferentRule 
> above. This is expected. 
>  2. When matched to the right child operand, it eventually leads to the 
> correct match result above. 
>  3. When matched to the left child operand, it should have skipped/failed 
> matching the parent operand to PhysBiRel because rightPhy is *NOT* 
> PhysBiRel's first input. But because of the bug, the match succeeded. After 
> parent is matched, then it attempt to match the right child operand, and 
> again matched the rightPhy. As a result, both child operand end up matching 
> the same RelNode rightPhy.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Updated] (CALCITE-3118) Properly check child ordinal when matching parent operand in VolcanoRuleCall

2019-07-21 Thread Botong Huang (JIRA)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3118:
--
Summary: Properly check child ordinal when matching parent operand in 
VolcanoRuleCall  (was: VolcanoRuleCall match parent child ordinal not properly 
checked)

> Properly check child ordinal when matching parent operand in VolcanoRuleCall
> 
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched RelNode as a child *with the* *expected 
> child ordinal*.
> However, there is a bug in this child ordinal check:
> {noformat}
> 333if (ascending && operand.childPolicy != 
> RelOptRuleOperandChildPolicy.UNORDERED) {
> 334  // We know that the previous operand was *a* child of its parent,
> 335  // but now check that it is the *correct* child.
> 336  final RelSubset input =
> 337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
> 338  List inputRels = input.set.getRelsFromAllSubsets();
> 339  if (!inputRels.contains(previous)) {
> 340continue;
> 341  }
> 342}
> {noformat}
> We intend to make sure that "previous" is in Subset "input". However line 338 
> looked at RelNodes from the entire RelSet, rather than RelNodes only in 
> Subset "input". As a result, in some cases, incorrect parent is not skipped 
> as expected and is matched incorrectly.
> The unit test case included is a case that triggers this bug, where a second 
> child RelNode incorrectly get matched as the first child of the parent 
> RelNode.
> --
>  Here's a detailed explanation of the test case that triggers the bug
> We constructed a RelNode structure:
> {noformat}
>  PhysBiRel
>   / \
>   Subset1   Subset2
> |  |
> leftPhyrightPhy
> {noformat}
> Where PhysBiRel has two inputs: leftPhy and rightPhy, both are logically 
> equivalent, but with different traits (Convention in this test case), and 
> thus are in different subsets. 
>  (Two Rels in two subsets in the same RelSet is a necessary condition to 
> trigger this bug. )
> A rule AssertOperandsDifferentRule is constructed as follows:
> {noformat}
> operand(PhysBiRel.class,
> operand(PhysLeafRel.class, any()),
> operand(PhysLeafRel.class, any()))
> {noformat}
> Obviously the correct match would be:
> {noformat}
>  PhysBiRel
>   / \
> leftPhyrightPhy
> {noformat}
> However, with the bug, another match is also returned:
> {noformat}
>  PhysBiRel
>   / \
> rightPhyrightPhy
> {noformat}
> *This is wrong because rightPhy is not PhysBiRel's first input at all, and 
> should not match as parent operand's first child.*
> 
>  Here's how the incorrect match occurs. 
>  1. When rightPhy of class PhysLeafRel is registered, we attempt to match it 
> to both the left and right child operands of AssertOperandsDifferentRule 
> above. This is expected. 
>  2. When matched to the right child operand, it eventually leads to the 
> correct match result above. 
>  3. When matched to the left child operand, it should have skipped/failed 
> matching the parent operand to PhysBiRel because rightPhy is *NOT* 
> PhysBiRel's first input. But because of the bug, the match succeeded. After 
> parent is matched, then it attempt to match the right child operand, and 
> again matched the rightPhy. As a result, both child operand end up matching 
> the same RelNode rightPhy.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Updated] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-08 Thread Botong Huang (JIRA)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3118:
--
Description: 
In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the matched parent 
indeed has the previously matched RelNode as a child *with the* *expected child 
ordinal*.

However, there is a bug in this child ordinal check:
{noformat}
333if (ascending && operand.childPolicy != 
RelOptRuleOperandChildPolicy.UNORDERED) {
334  // We know that the previous operand was *a* child of its parent,
335  // but now check that it is the *correct* child.
336  final RelSubset input =
337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
338  List inputRels = input.set.getRelsFromAllSubsets();
339  if (!inputRels.contains(previous)) {
340continue;
341  }
342}
{noformat}
We intend to make sure that "previous" is in Subset "input". However line 338 
looked at RelNodes from the entire RelSet, rather than RelNodes only in Subset 
"input". As a result, in some cases, incorrect parent is not skipped as 
expected and is matched incorrectly.

The unit test case included is a case that triggers this bug, where a second 
child RelNode incorrectly get matched as the first child of the parent RelNode.

--
 Here's a detailed explanation of the test case that triggers the bug

We constructed a RelNode structure:
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}
Where PhysBiRel has two inputs: leftPhy and rightPhy, both are logically 
equivalent, but with different traits (Convention in this test case), and thus 
are in different subsets. 
 (Two Rels in two subsets in the same RelSet is a necessary condition to 
trigger this bug. )

A rule AssertOperandsDifferentRule is constructed as follows:
{noformat}
operand(PhysBiRel.class,
operand(PhysLeafRel.class, any()),
operand(PhysLeafRel.class, any()))
{noformat}
Obviously the correct match would be:
{noformat}
 PhysBiRel
  / \
leftPhyrightPhy
{noformat}
However, with the bug, another match is also returned:
{noformat}
 PhysBiRel
  / \
rightPhyrightPhy
{noformat}
*This is wrong because rightPhy is not PhysBiRel's first input at all, and 
should not match as parent operand's first child.*


 Here's how the incorrect match occurs. 
 1. When rightPhy of class PhysLeafRel is registered, we attempt to match it to 
both the left and right child operands of AssertOperandsDifferentRule above. 
This is expected. 
 2. When matched to the right child operand, it eventually leads to the correct 
match result above. 
 3. When matched to the left child operand, it should have skipped/failed 
matching the parent operand to PhysBiRel because rightPhy is *NOT* PhysBiRel's 
first input. But because of the bug, the match succeeded. After parent is 
matched, then it attempt to match the right child operand, and again matched 
the rightPhy. As a result, both child operand end up matching the same RelNode 
rightPhy.

  was:
In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the matched parent 
indeed has the previously matched RelNode as a child *with the* *expected child 
ordinal*.

However, there is a bug in this child ordinal check:
{noformat}
333if (ascending && operand.childPolicy != 
RelOptRuleOperandChildPolicy.UNORDERED) {
334  // We know that the previous operand was *a* child of its parent,
335  // but now check that it is the *correct* child.
336  final RelSubset input =
337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
338  List inputRels = input.set.getRelsFromAllSubsets();
339  if (!inputRels.contains(previous)) {
340continue;
341  }
342}
{noformat}
We intend to make sure that "previous" is in Subset "input". However line 338 
looked at RelNodes from the entire RelSet, rather than RelNodes only in Subset 
"input". As a result, in some cases, incorrect parent is not skipped as 
expected and matched incorrectly.


 The unit test case included is a case that triggers this bug, where a second 
child RelNode incorrectly get matched as the first child of the parent RelNode.

--
 Here's a detailed explanation of the test case that triggers the bug

We constructed a RelNode structure:
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}
Where PhysBiRel has two inputs: leftPhy and rightPhy, both are logically 
equivalent, but with different traits (Convention in this test case), and thus 
are in different subsets.

[jira] [Updated] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-08 Thread Botong Huang (JIRA)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3118:
--
Description: 
In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the matched parent 
indeed has the previously matched RelNode as a child *with the* *expected child 
ordinal*.

However, there is a bug in this child ordinal check:
{noformat}
333if (ascending && operand.childPolicy != 
RelOptRuleOperandChildPolicy.UNORDERED) {
334  // We know that the previous operand was *a* child of its parent,
335  // but now check that it is the *correct* child.
336  final RelSubset input =
337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
338  List inputRels = input.set.getRelsFromAllSubsets();
339  if (!inputRels.contains(previous)) {
340continue;
341  }
342}
{noformat}
We intend to make sure that "previous" is in Subset "input". However line 338 
looked at RelNodes from the entire RelSet, rather than RelNodes only in Subset 
"input". As a result, in some cases, incorrect parent is not skipped as 
expected and matched incorrectly.


 The unit test case included is a case that triggers this bug, where a second 
child RelNode incorrectly get matched as the first child of the parent RelNode.

--
 Here's a detailed explanation of the test case that triggers the bug

We constructed a RelNode structure:
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}
Where PhysBiRel has two inputs: leftPhy and rightPhy, both are logically 
equivalent, but with different traits (Convention in this test case), and thus 
are in different subsets. 
 (Two Rels in two subsets in the same RelSet is a necessary condition to 
trigger this bug. )

A rule AssertOperandsDifferentRule is constructed as follows:
{noformat}
operand(PhysBiRel.class,
operand(PhysLeafRel.class, any()),
operand(PhysLeafRel.class, any()))
{noformat}
Obviously the correct match would be:
{noformat}
 PhysBiRel
  / \
leftPhyrightPhy
{noformat}
However, with the bug, another match is also returned:
{noformat}
 PhysBiRel
  / \
rightPhyrightPhy
{noformat}
*This is wrong because rightPhy is not PhysBiRel's first input at all, and 
should not match as parent operand's first child.*


 Here's how the incorrect match occurs. 
 1. When rightPhy is registered, we attempt to match it to both the left and 
right child operand of AssertOperandsDifferentRule above. 
 2. When matched to the right child operand, it eventually leads to the correct 
match result above. 
 3. When matched to the left child operand, it should have skipped/failed 
matching the parent operand to PhysBiRel because rightPhy is not!!! 
PhysBiRel's first input. 
 But because of the bug, the match succeeded. After parent is matched, then it 
attempt to match the right child operand, and again matched the rightPhy as 
expected. 
 As a result, both child operand end up matching the same RelNode rightPhy.

  was:In VolcanoRuleCall.matchRecurse(), when ascending (child operand is 
matched, looking for parent operand match), we want to check that the matched 
parent indeed has the previously matched child RelNode as a child with the 
expected ordinal. However, there is a bug in this check. As a result, some 
incorrect parent is not skipped as expected and matched incorrectly. See unit 
test included in PR for a case that triggers this bug, where a second child 
RelNode get matched as first child of the parent RelNode. 


> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched RelNode as a child *with the* *expected 
> child ordinal*.
> However, there is a bug in this child ordinal check:
> {noformat}
> 333if (ascending && operand.childPolicy != 
> RelOptRuleOperandChildPolicy.UNORDERED) {
> 334  // We know that the previous operand was *a* child of its parent,
> 335  // but now check that it is the *correct* child.
> 336  final RelSubset input =
> 337  (RelSubset) rel.getInput(previousOperand.ordinalInParent);
> 338  List inputRels = input.set

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-08 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859312#comment-16859312
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/8/19 9:29 PM:
---

{quote}
Firstly you construct a RelNode tree like what the pattern describes, then you 
register that the `leftPhy` and `rightPhy` are equivalent. It's not that 
surprising you got a tree like:

PhysBiRel
/   \
rightPhy  rightPhy
This tree also satisfied AssertOperandsDifferentRule patttern. 
{quote}

leftPhy and rightPhy are indeed logically equivalent, but they have different 
traits (Convention in this test case), and thus are in different subsets. 

The actual Rel structure is: 
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}

Two Rels in two subsets in the same RelSet is a necessary condition to trigger 
this bug. 

The reason that I think rightPhy should not match the left child operand is 
because of rightPhy's child ordinal in parent. When AssertOperandsDifferentRule 
says 
{noformat}
operand(PhysBiRel.class,
operand(PhysLeafRel.class, any()),
operand(PhysLeafRel.class, any()))
{noformat}
it implies RelOptRuleOperandChildPolicy.SOME, whose javadoc says "Signifies 
that the operand's children must precisely match its child operands, in order."

rightPhy is PhysBiRel's second input, and it should not match as the first 
child of the parent operand. It violates the "precisely match in order" part. 
In fact, rightPhy is not PhysBiRel's first input at all. I really don't think 
this is a valid match. 

Enforcing child ordinal is exactly what line 334-341 in VolcanoRuleCall is 
trying to do, as the code comment says: 
{noformat}
// We know that the previous operand was *a* child of its parent,
// but now check that it is the *correct* child.
{noformat}
Except that in this unit test's case, the code failed to detect and skip 
(continue at line 340) this wrong match because it looked at all Rels in the 
RelSet instead of looking at the Rels in the Subset only. This is why I insist 
this is a bug. Hopefully it is clearer now. 


was (Author: botong):

{quote}
Firstly you construct a RelNode tree like what the pattern describes, then you 
register that the `leftPhy` and `rightPhy` are equivalent. It's not that 
surprising you got a tree like:

PhysBiRel
/   \
rightPhy  rightPhy
This tree also satisfied AssertOperandsDifferentRule patttern. 
{quote}

leftPhy and rightPhy are indeed logically equivalent, but they have different 
traits (Convention in this test case), and thus are in different subsets. 

The actual Rel structure is: 
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}

Two Rels in two subsets in the same RelSet is a necessary condition to trigger 
this bug. 

The reason that I think rightPhy should not match the left child operand is 
because of rightPhy's child ordinal in parent. When AssertOperandsDifferentRule 
says 
{noformat}
operand(PhysBiRel.class,
operand(PhysLeafRel.class, any()),
operand(PhysLeafRel.class, any()))
{noformat}
it implies RelOptRuleOperandChildPolicy.SOME, whose javadoc says "Signifies 
that the operand's children must precisely match its child operands, in order."

rightPhy is PhysBiRel's second input, and it should not match as the first 
child of the parent operand. It violates the "precisely match in order" part. 

Enforcing child ordinal is exactly what line 334-341 in VolcanoRuleCall is 
trying to do, as the code comment says: 
{noformat}
// We know that the previous operand was *a* child of its parent,
// but now check that it is the *correct* child.
{noformat}
Except that in this unit test's case, the code failed to detect and skip 
(continue at line 340) this wrong match because it looked at all Rels in the 
RelSet instead of looking at the Rels in the Subset only. This is why I insist 
this is a bug. Hopefully it is clearer now. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case t

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-08 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859312#comment-16859312
 ] 

Botong Huang commented on CALCITE-3118:
---


{quote}
Firstly you construct a RelNode tree like what the pattern describes, then you 
register that the `leftPhy` and `rightPhy` are equivalent. It's not that 
surprising you got a tree like:

PhysBiRel
/   \
rightPhy  rightPhy
This tree also satisfied AssertOperandsDifferentRule patttern. 
{quote}

leftPhy and rightPhy are indeed logically equivalent, but they have different 
traits (Convention in this test case), and thus are in different subsets. 

The actual Rel structure is: 
{noformat}
 PhysBiRel
  / \
  Subset1   Subset2
|  |
leftPhyrightPhy
{noformat}

Two Rels in two subsets in the same RelSet is a necessary condition to trigger 
this bug. 

The reason that I think rightPhy should not match the left child operand is 
because of rightPhy's child ordinal in parent. When AssertOperandsDifferentRule 
says 
{noformat}
operand(PhysBiRel.class,
operand(PhysLeafRel.class, any()),
operand(PhysLeafRel.class, any()))
{noformat}
it implies RelOptRuleOperandChildPolicy.SOME, whose javadoc says "Signifies 
that the operand's children must precisely match its child operands, in order."

rightPhy is PhysBiRel's second input, and it should not match as the first 
child of the parent operand. It violates the "precisely match in order" part. 

Enforcing child ordinal is exactly what line 334-341 in VolcanoRuleCall is 
trying to do, as the code comment says: 
{noformat}
// We know that the previous operand was *a* child of its parent,
// but now check that it is the *correct* child.
{noformat}
Except that in this unit test's case, the code failed to detect and skip 
(continue at line 340) this wrong match because it looked at all Rels in the 
RelSet instead of looking at the Rels in the Subset only. This is why I insist 
this is a bug. Hopefully it is clearer now. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where a second child 
> RelNode get matched as first child of the parent RelNode. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858693#comment-16858693
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/8/19 4:13 AM:
---

{quote}It is not wrong provided the children are identical
{quote}
The leftPhy and rightPhy are logically equivalent, but has different traits and 
are in different subsets. 

I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
      A
    /    \
  B     C
 where *B is A's first child*. What the match returned was: 
        Rel1
      /        \
 Rel2.1    Rel2.2
 where Rel2.1 is *NOT* Rel1's first child. 


was (Author: botong):
{quote}It is not wrong provided the children are identical
{quote}
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
      A
    /    \
  B     C
 where *B is A's first child*. What the match returned was: 
        Rel1
      /        \
 Rel2.1    Rel2.2
 where Rel2.1 is *NOT* Rel1's first child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where a second child 
> RelNode get matched as first child of the parent RelNode. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated CALCITE-3118:
--
Description: In VolcanoRuleCall.matchRecurse(), when ascending (child 
operand is matched, looking for parent operand match), we want to check that 
the matched parent indeed has the previously matched child RelNode as a child 
with the expected ordinal. However, there is a bug in this check. As a result, 
some incorrect parent is not skipped as expected and matched incorrectly. See 
unit test included in PR for a case that triggers this bug, where a second 
child RelNode get matched as first child of the parent RelNode.   (was: In 
VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the matched parent 
indeed has the previously matched child RelNode as a child with the expected 
ordinal. However, there is a bug in this check. As a result, some incorrect 
parent is not skipped as expected and matched incorrectly. See unit test 
included in PR for a case that triggers this bug, where the same RelNode get 
matched for two operands at the same time. )

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where a second child 
> RelNode get matched as first child of the parent RelNode. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858699#comment-16858699
 ] 

Botong Huang commented on CALCITE-3118:
---

bq. Or it might happen you could rewrite the test-case in easy to understand 
way.
The unit test is indeed involved because of the tricky condition of the bug. 
However I don't know if there's an easier way to trigger it... At least I hope 
you agree with the one line fix...


> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858178#comment-16858178
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 2:24 PM:
---

Agree in general. But in this test case, the rule in this unit test 
(AssertOperandsDifferentRule) is to match a parent and its first and second 
child, it is wrong to return a match with a parent relNode and its second child 
twice.


was (Author: botong):
Agree in general. But in this test case, the rule in this unit test 
(AssertOperandsDifferentRule) is to match a parent and its first and second 
child, it is wrong to return a match with a parent relNode and both its second 
child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858693#comment-16858693
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 2:20 PM:
---

{quote}It is not wrong provided the children are identical
{quote}
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
      A
    /    \
  B     C
 where *B is A's first child*. What the match returned was: 
        Rel1
      /        \
 Rel2.1    Rel2.2
 where Rel2.1 is *NOT* Rel1's first child. 


was (Author: botong):
{quote}It is not wrong provided the children are identical
{quote}
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
      A
    /    \
  B     C
 where *B is A's first child*. What the match returned was: 
        Rel1
      /        \
 Rel2.1    Rel2.2
 where Rel2.1 is *NOT* A's first child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858693#comment-16858693
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 2:19 PM:
---

{quote}It is not wrong provided the children are identical
{quote}
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
      A
    /    \
  B     C
 where *B is A's first child*. What the match returned was: 
        Rel1
      /        \
 Rel2.1    Rel2.2
 where Rel2.1 is *NOT* A's first child. 


was (Author: botong):
 bq. It is not wrong provided the children are identical
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
     A
   /    \
 B     C
where *B is A's first child*. What the match returned was: 
       Rel1
     /        \
 Rel2.1    Rel2.2
where Rel2.1 is *NOT* A's first child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858693#comment-16858693
 ] 

Botong Huang commented on CALCITE-3118:
---

 bq. It is not wrong provided the children are identical
I think you missed my point. The structure the _AssertOperandsDifferentRule_ is 
looking for is: 
     A
   /    \
 B     C
where *B is A's first child*. What the match returned was: 
       Rel1
     /        \
 Rel2.1    Rel2.2
where Rel2.1 is *NOT* A's first child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858653#comment-16858653
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 1:39 PM:
---

Thanks [~danny0405] for reviewing! Having two relNodes of the same class in two 
RelSubsets in the same RelSet (thus the equivalent line) is the necessary 
condition to trigger this bug. However, using convention in the unit test is 
not a must. It's merely to produce the two subSets. Any two traits that are 
different would suffice.

Also as in AssertOperandsDifferentRule, it is not trying to match any 
convention. We are simply trying to match a parent with its first and second 
children being PhysLeafRel.class, it is wrong to return a match with a parent 
relNode and its second children twice.


was (Author: botong):
Thanks [~danny0405] for reviewing! Having two relNodes of the same class in two 
RelSubsets in the same RelSet (thus the equivalent line) is the necessary 
condition to trigger this bug. However, using convention in the unit test is 
not a must. It's merely to produce the two subSets. Any two traits that are 
different would suffice.

Also as in AssertOperandsDifferentRule, it is not trying to match any 
convention. We are simply trying to match a parent with its first and second 
children being PhysLeafRel.class, it is wrong to return a match with a parent 
relNode and both its second children.

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858053#comment-16858053
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 1:38 PM:
---

Sure, supposedly this rule should only match once, with left child operand == 
leftPhy (PHYS, label = a) and right child operand == rightPhy (PHYS_2, label == 
b). 

However, without the fix, when rightPhy gets generated, it triggers two match 
attempts. The first one is expected as above. In the second one, rightPhy 
(PHYS_2, label == b) matches the left child operand, then we attempt to match 
the PhyBiRel as parent operand. 

Supposedly this parent should not match, because rightPhy is PhyBiRel's second 
child, but rightPhy's matched left child operand is parent operand's first 
child. Because of the bug, it didn't skip as expected and still proceed with 
the matching. After parent is matched, then it attempt to match the right child 
operand, and again matched the rightPhy (PHYS_2, label == b), as expected 
because it is PhyBiRel's second child. Overall as a result, both child operand 
end up matching the same RelNode rightPhy (PHYS_2, label == b). 

To recap, AssertOperandsDifferentRule is trying to match a parent with its 
first and second children being PhysLeafRel.class, it is wrong to return a 
match with a parent relNode and both its second children.

Hope this is clearer now. 


was (Author: botong):
Sure, supposedly this rule should only match once, with left child operand == 
leftPhy (PHYS, label = a) and right child operand == rightPhy (PHYS_2, label == 
b). 

However, without the fix, when rightPhy gets generated, it triggers two match 
attempts. The first one is expected as above. In the second one, rightPhy 
(PHYS_2, label == b) matches the left child operand, then we attempt to match 
the PhyBiRel as parent operand. 

Supposedly this parent should not match, because rightPhy is PhyBiRel's second 
child, but rightPhy's matched left child operand is parent operand's first 
child. Because of the bug, it didn't skip as expected and still proceed with 
the matching. After parent is matched, then it attempt to match the right child 
operand, and again matched the rightPhy (PHYS_2, label == b), as expected 
because it is PhyBiRel's second child. Overall as a result, both child operand 
end up matching the same RelNode rightPhy (PHYS_2, label == b). 

Hope this is clearer now. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858053#comment-16858053
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 1:39 PM:
---

Sure, supposedly this rule should only match once, with left child operand == 
leftPhy (PHYS, label = a) and right child operand == rightPhy (PHYS_2, label == 
b). 

However, without the fix, when rightPhy gets generated, it triggers two match 
attempts. The first one is expected as above. In the second one, rightPhy 
(PHYS_2, label == b) matches the left child operand, then we attempt to match 
the PhyBiRel as parent operand. 

Supposedly this parent should not match, because rightPhy is PhyBiRel's second 
child, but rightPhy's matched left child operand is parent operand's first 
child. Because of the bug, it didn't skip as expected and still proceed with 
the matching. After parent is matched, then it attempt to match the right child 
operand, and again matched the rightPhy (PHYS_2, label == b), as expected 
because it is PhyBiRel's second child. Overall as a result, both child operand 
end up matching the same RelNode rightPhy (PHYS_2, label == b). 

To recap, AssertOperandsDifferentRule is trying to match a parent with its 
first and second children being PhysLeafRel.class, it is wrong to return a 
match with a parent relNode and its second children twice.

Hope this is clearer now. 


was (Author: botong):
Sure, supposedly this rule should only match once, with left child operand == 
leftPhy (PHYS, label = a) and right child operand == rightPhy (PHYS_2, label == 
b). 

However, without the fix, when rightPhy gets generated, it triggers two match 
attempts. The first one is expected as above. In the second one, rightPhy 
(PHYS_2, label == b) matches the left child operand, then we attempt to match 
the PhyBiRel as parent operand. 

Supposedly this parent should not match, because rightPhy is PhyBiRel's second 
child, but rightPhy's matched left child operand is parent operand's first 
child. Because of the bug, it didn't skip as expected and still proceed with 
the matching. After parent is matched, then it attempt to match the right child 
operand, and again matched the rightPhy (PHYS_2, label == b), as expected 
because it is PhyBiRel's second child. Overall as a result, both child operand 
end up matching the same RelNode rightPhy (PHYS_2, label == b). 

To recap, AssertOperandsDifferentRule is trying to match a parent with its 
first and second children being PhysLeafRel.class, it is wrong to return a 
match with a parent relNode and both its second children.

Hope this is clearer now. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858653#comment-16858653
 ] 

Botong Huang commented on CALCITE-3118:
---

Thanks [~danny0405] for reviewing! Having two relNodes of the same class in two 
RelSubsets in the same RelSet (thus the equivalent line) is the necessary 
condition to trigger this bug. However, using convention in the unit test is 
not a must. It's merely to produce the two subSets. Any two traits that are 
different would suffice.

Also as in AssertOperandsDifferentRule, it is not trying to match any 
convention. We are simply trying to match a parent with its first and second 
children being PhysLeafRel.class, it is wrong to return a match with a parent 
relNode and both its second children.

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-07 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858178#comment-16858178
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/7/19 1:27 PM:
---

Agree in general. But in this test case, the rule in this unit test 
(AssertOperandsDifferentRule) is to match a parent and its first and second 
child, it is wrong to return a match with a parent relNode and both its second 
child. 


was (Author: botong):
Agree in general. But in this test case, the rule in this unit test 
(AssertOperandsDifferentRule) is to match a parent and its first and second 
child, it is wrong to return a match with a relNode and both its second child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858178#comment-16858178
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/6/19 11:29 PM:


Agree in general. But in this test case, the rule in this unit test 
(AssertOperandsDifferentRule) is to match a parent and its first and second 
child, it is wrong to return a match with a relNode and both its second child. 


was (Author: botong):
Agree in general. But in this test case, the rule (AssertOperandsDifferentRule) 
is to match a parent and its first and second child, it is wrong to return a 
match with a relNode and both its second child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858178#comment-16858178
 ] 

Botong Huang commented on CALCITE-3118:
---

Agree in general. But in this test case, the rule (AssertOperandsDifferentRule) 
is to match a parent and its first and second child, it is wrong to return a 
match with a relNode and both its second child. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858053#comment-16858053
 ] 

Botong Huang commented on CALCITE-3118:
---

Sure, supposedly this rule should only match once, with left child operand == 
leftPhy (PHYS, label = a) and right child operand == rightPhy (PHYS_2, label == 
b). 

However, without the fix, when rightPhy gets generated, it triggers two match 
attempts. The first one is expected as above. In the second one, rightPhy 
(PHYS_2, label == b) matches the left child operand, then we attempt to match 
the PhyBiRel as parent operand. 

Supposedly this parent should not match, because rightPhy is PhyBiRel's second 
child, but rightPhy's matched left child operand is parent operand's first 
child. Because of the bug, it didn't skip as expected and still proceed with 
the matching. After parent is matched, then it attempt to match the right child 
operand, and again matched the rightPhy (PHYS_2, label == b), as expected 
because it is PhyBiRel's second child. Overall as a result, both child operand 
end up matching the same RelNode rightPhy (PHYS_2, label == b). 

Hope this is clearer now. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858038#comment-16858038
 ] 

Botong Huang commented on CALCITE-3118:
---

New version pushed, can you take another look and see if it is better now? 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858020#comment-16858020
 ] 

Botong Huang commented on CALCITE-3118:
---

Sure, let me try make the unit test more readable. On a side note, is unit test 
required for this kind of fix? 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857997#comment-16857997
 ] 

Botong Huang edited comment on CALCITE-3118 at 6/6/19 7:08 PM:
---

Thanks [~vladimirsitnikov] for reviewing. The real check in the unit test is 
the assert inside TwoChildrenSameSetMatchRule. Without the one line fix, this 
assert will fail because it ends up matching the same PhysLeafRel for both 
child operands. I can add some comments in the unit test in needed.


was (Author: botong):
Thanks [~vladimirsitnikov] for reviewing. The real check in the unit test is 
the assert inside TwoChildrenSameSetMatchRule. Without the one line fix, this 
assert will fail because it ends up matching the same PhysLeafRel as both child 
operand. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)



[ 
https://issues.apache.org/jira/browse/CALCITE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857997#comment-16857997
 ] 

Botong Huang commented on CALCITE-3118:
---

Thanks [~vladimirsitnikov] for reviewing. The real check in the unit test is 
the assert inside TwoChildrenSameSetMatchRule. Without the one line fix, this 
assert will fail because it ends up matching the same PhysLeafRel as both child 
operand. 

> VolcanoRuleCall match parent child ordinal not properly checked
> ---
>
> Key: CALCITE-3118
> URL: https://issues.apache.org/jira/browse/CALCITE-3118
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
>  Labels: pull-request-available
>
> In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
> looking for parent operand match), we want to check that the matched parent 
> indeed has the previously matched child RelNode as a child with the expected 
> ordinal. However, there is a bug in this check. As a result, some incorrect 
> parent is not skipped as expected and matched incorrectly. See unit test 
> included in PR for a case that triggers this bug, where the same RelNode get 
> matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (CALCITE-3118) VolcanoRuleCall match parent child ordinal not properly checked

2019-06-06 Thread Botong Huang (JIRA)

Botong Huang created CALCITE-3118:
-

 Summary: VolcanoRuleCall match parent child ordinal not properly 
checked
 Key: CALCITE-3118
 URL: https://issues.apache.org/jira/browse/CALCITE-3118
 Project: Calcite
  Issue Type: Bug
Reporter: Botong Huang


In VolcanoRuleCall.matchRecurse(), when ascending (child operand is matched, 
looking for parent operand match), we want to check that the matched parent 
indeed has the previously matched child RelNode as a child with the expected 
ordinal. However, there is a bug in this check. As a result, some incorrect 
parent is not skipped as expected and matched incorrectly. See unit test 
included in PR for a case that triggers this bug, where the same RelNode get 
matched for two operands at the same time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

86 matches

Mail list logo