[jira] [Created] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-20 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4017:
--

 Summary: Implement trait propagation for Enumerable Setop 
 Key: CALCITE-4017
 URL: https://issues.apache.org/jira/browse/CALCITE-4017
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


Mainly for the Union operator; I am not sure about Minus and Intersect. I haven't 
checked how the executors for Enumerable Minus and Intersect are implemented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4018) Implement trait propagation for EnumerableValues

2020-05-20 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4018:
--

 Summary:  Implement trait propagation for EnumerableValues
 Key: CALCITE-4018
 URL: https://issues.apache.org/jira/browse/CALCITE-4018
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


Only passThrough is needed.
Currently, when Values is created, it enumerates all the possible 
collations whether or not the parent operator requires them. This will be a 
disaster if the Values has thousands of columns and the parent operator is just 
a hash aggregate or hash join, which doesn't care about its input's collation.
The collation should be created on demand by calling passThrough.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-20 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112802#comment-17112802
 ] 

Haisheng Yuan commented on CALCITE-4017:


Yes. SortUnionTransposeRule will be useless after this JIRA is done. Basically, 
the sort operator should be prohibited from participating in any rule 
transformation.

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for Union operator, not sure about Minus and Intersect. I haven't 
> check how is Enumerable Minus, Intersect's executor implemented.





[jira] [Commented] (CALCITE-4018) Implement trait propagation for EnumerableValues

2020-05-20 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112803#comment-17112803
 ] 

Haisheng Yuan commented on CALCITE-4018:


Yes, here it is:
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableValues.java#L62
The main logic is in RelMdCollation.values(mq, rowType, tuples)

>  Implement trait propagation for EnumerableValues
> -
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.





[jira] [Commented] (CALCITE-4018) Implement trait propagation for EnumerableValues

2020-05-20 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112804#comment-17112804
 ] 

Haisheng Yuan commented on CALCITE-4018:


Not all; it produces at most N collations (where N is the number of columns).

>  Implement trait propagation for EnumerableValues
> -
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-21 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113578#comment-17113578
 ] 

Haisheng Yuan commented on CALCITE-4017:


Thanks for confirming. I just noticed that EnumerableUnion doesn't have an 
option to preserve ordering, so we don't need to implement trait 
propagation for Union until EnumerableUnion provides this mechanism. Let's 
leave it open for now.

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for Union operator, not sure about Minus and Intersect. I haven't 
> check how is Enumerable Minus, Intersect's executor implemented.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-21 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113579#comment-17113579
 ] 

Haisheng Yuan commented on CALCITE-4017:


Btw, if you check testSortUnionTranspose2, it is generating a worse plan 
alternative. The top sort should be eliminated.

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for Union operator, not sure about Minus and Intersect. I haven't 
> check how is Enumerable Minus, Intersect's executor implemented.





[jira] [Updated] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-21 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4018:
---
Summary: EnumerableValues should provide requested traits  (was:  Implement 
trait propagation for EnumerableValues)

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.
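
The on-demand idea in the description can be illustrated with a small sketch. This is not Calcite code: the class ValuesCollationSketch and its method satisfies are hypothetical names, rows are simplified to int arrays, and only ascending order is handled. The assumption is that passThrough only has to answer one question, namely whether the literal rows are already ordered by the collation the parent asks for, instead of enumerating collations up front.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Calcite.
final class ValuesCollationSketch {
  /**
   * Returns true if the rows are already in non-decreasing order when
   * compared on the given column indexes, in that order (all ascending).
   */
  static boolean satisfies(List<int[]> rows, int[] requiredColumns) {
    for (int i = 1; i < rows.size(); i++) {
      if (compare(rows.get(i - 1), rows.get(i), requiredColumns) > 0) {
        return false;
      }
    }
    return true;
  }

  // Lexicographic comparison of two rows on the required columns.
  private static int compare(int[] left, int[] right, int[] columns) {
    for (int column : columns) {
      int c = Integer.compare(left[column], right[column]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    // The rows of t(a, b) from the example query.
    List<int[]> rows = Arrays.asList(
        new int[] {1, 1}, new int[] {2, 1}, new int[] {1, 2}, new int[] {2, 2});
    // ORDER BY b, a corresponds to column indexes [1, 0].
    System.out.println(satisfies(rows, new int[] {1, 0}));
    // ORDER BY a, b corresponds to column indexes [0, 1].
    System.out.println(satisfies(rows, new int[] {0, 1}));
  }
}
```

For the example rows the check succeeds for (b, a), which is why a plan without Sort is possible, and fails for (a, b). The cost is one pass over the rows for the single collation that was requested.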





[jira] [Updated] (CALCITE-4018) Implement trait propagation for EnumerableValues

2020-05-21 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4018:
---
Description: 
Only passThrough is needed.
Currently, when Values is created, it enumerates all the possible 
collations whether or not the parent operator requires them. This will be a 
disaster if the Values has thousands of columns and the parent operator is just 
a hash aggregate or hash join, which doesn't care about its input's collation.
The collation should be created on demand by calling passThrough.

e.g.

{code:java}
SELECT * from (values
(1, 1),
(2, 1),
(1, 2),
(2, 2)
) as t(a, b)
order by b, a
{code}

Currently Calcite will generate plan:

{code:java}
EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
  EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
{code}

But after this JIRA, I am expecting a plan without Sort.



  was:
Only passThrough is needed.
Currently, when Values is created, it will enumerate all the possible 
collations no matter parent operator requires it or not, it will be a disaster 
if the Values has thousands of columns, and the parent operator may be just a 
hash aggregate or hashjoin, which doesn't care about its collation.
The collation should be created on demand by calling passThrough.


>  Implement trait propagation for EnumerableValues
> -
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.





[jira] [Commented] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-21 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113608#comment-17113608
 ] 

Haisheng Yuan commented on CALCITE-4018:


Thanks, I have updated. Even though Values may not be worth optimizing in 
practice, it will be a good example to demonstrate how useful it is to do trait 
propagation on leaf nodes. The problem in CALCITE-2624 can be easily solved by 
overriding EnumerableTableScan.passThrough() to return an ElasticScan or 
IndexScan if there is an index that can satisfy the required collation.

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.





[jira] [Comment Edited] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-21 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113608#comment-17113608
 ] 

Haisheng Yuan edited comment on CALCITE-4018 at 5/21/20, 11:05 PM:
---

Thanks, I have updated. Even though Values may not be worth optimizing in 
practice, it will be a good example to demonstrate how useful it is to do trait 
propagation on leaf nodes. The problems in CALCITE-2624 and CALCITE-3854 can be 
easily solved by overriding EnumerableTableScan.passThrough() to return an 
ElasticScan or IndexScan if there is an index that can satisfy the required 
collation.


was (Author: hyuan):
Thanks, I have updated. Even though Values may not be worth optimizing in 
practice, it will be a good example to demonstrate how useful it is to do trait 
propagation on leaf nodes. The problem in CALCITE-2624 can be easily solved by 
overriding EnumerableTableScan.passThrough(), return an ElasticScan or 
IndexScan if there is an index that can satisfy the required collation.

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.





[jira] [Commented] (CALCITE-4011) Implement trait propagation for EnumerableProject and EnumerableFilter

2020-05-21 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113761#comment-17113761
 ] 

Haisheng Yuan commented on CALCITE-4011:


Hi [~amaliujia], thanks! I think we need trait derivation, for both project and 
filter. MergeJoin can be an example.

> Implement trait propagation for EnumerableProject and EnumerableFilter
> --
>
> Key: CALCITE-4011
> URL: https://issues.apache.org/jira/browse/CALCITE-4011
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> So that parent trait can be passed down, and it can derive new traitsets from 
> child. Most importantly, as a demonstration. So that SortProjectTransposeRule 
> and ProjectSortTransposeRule can be disabled when topDownOpt is enabled.





[jira] [Resolved] (CALCITE-784) LogicalAggregate's create method discards any collation traits from input

2020-05-22 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-784.
---
Resolution: Invalid

Closing it, since it should be done on physical operators, not logical 
operators.

> LogicalAggregate's create method discards any collation traits from input
> -
>
> Key: CALCITE-784
> URL: https://issues.apache.org/jira/browse/CALCITE-784
> Project: Calcite
>  Issue Type: Bug
>  Components: core, stream
>Reporter: Milinda Pathirage
>Priority: Major
> Attachments: CALCITE-784-0.patch
>
>
> LogicalAggregate's create method gets the trait set of Convention.NONE from 
> input's cluster, but doesn't use any trait information from the input. But to 
> have proper collation trait set we need to consider input's collation trait 
> set when inferring LogicalAggregate's traits.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-22 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114302#comment-17114302
 ] 

Haisheng Yuan commented on CALCITE-3221:


Looks like CALCITE-4017 has a dependency on this JIRA.
I would not suggest adding a new physical operator {{EnumerableMergeUnion}}. We 
can extend {{EnumerableUnion}} by adding a field {{collation}} to indicate 
whether to preserve ordering and what the order is. The default is EMPTY, which 
means no ordering needs to be preserved, because if the parent operator doesn't 
require any collation, it is always better not to preserve order. If a collation 
is required by the parent operator, we can pass the required collation down to 
UNION's children on demand, and return its new collation, by overriding 
{{EnumerableUnion#passThroughTraits()}}.
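
A minimal sketch of the execution side of this proposal, assuming UNION ALL semantics and in-memory lists. OrderedUnionSketch and unionAll are hypothetical names, not Calcite's EnumerableUnion or linq4j API, and a null comparator stands in for the EMPTY collation mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch for illustration only; not Calcite's EnumerableUnion.
final class OrderedUnionSketch {
  /**
   * UNION ALL of two inputs. A null comparator plays the role of an EMPTY
   * collation: no order is required, so the inputs are simply concatenated.
   * Otherwise both inputs are assumed to be sorted by the comparator already,
   * and they are merged so that the output keeps that order.
   */
  static <T> List<T> unionAll(List<T> left, List<T> right,
      Comparator<? super T> collation) {
    List<T> out = new ArrayList<>(left.size() + right.size());
    if (collation == null) {
      out.addAll(left);
      out.addAll(right);
      return out;
    }
    int i = 0;
    int j = 0;
    while (i < left.size() && j < right.size()) {
      // Take from the left input on ties so the merge is stable.
      if (collation.compare(left.get(i), right.get(j)) <= 0) {
        out.add(left.get(i++));
      } else {
        out.add(right.get(j++));
      }
    }
    // At most one of the inputs still has rows; append them unchanged.
    while (i < left.size()) {
      out.add(left.get(i++));
    }
    while (j < right.size()) {
      out.add(right.get(j++));
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> left = Arrays.asList(1, 3, 5);
    List<Integer> right = Arrays.asList(2, 3, 6);
    System.out.println(unionAll(left, right, null));
    System.out.println(unionAll(left, right, Comparator.<Integer>naturalOrder()));
  }
}
```

Keeping both behaviours in one operator matches the suggestion of a single {{collation}} field: the cheaper concatenation is used unless a parent actually asks for an order. A distinct UNION would additionally have to skip equal neighbouring rows during the merge, which this sketch does not do.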

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-22 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114314#comment-17114314
 ] 

Haisheng Yuan commented on CALCITE-4017:


[~amaliujia] I think we can still move forward. We just want the plan, not the 
execution. 

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for Union operator, not sure about Minus and Intersect. I haven't 
> check how is Enumerable Minus, Intersect's executor implemented.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114904#comment-17114904
 ] 

Haisheng Yuan commented on CALCITE-4017:


Thanks a lot, [~amaliujia].

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for Union operator, not sure about Minus and Intersect. I haven't 
> check how is Enumerable Minus, Intersect's executor implemented.





[jira] [Resolved] (CALCITE-3989) Release Calcite 1.23.0

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3989.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/b708fdc46d4c5fd4c5a6c7a398823318a7b4dce3.

> Release Calcite 1.23.0
> --
>
> Key: CALCITE-3989
> URL: https://issues.apache.org/jira/browse/CALCITE-3989
> Project: Calcite
>  Issue Type: Task
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>






[jira] [Updated] (CALCITE-4011) Implement trait propagation for EnumerableProject and EnumerableFilter

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4011:
---
Fix Version/s: 1.24.0

> Implement trait propagation for EnumerableProject and EnumerableFilter
> --
>
> Key: CALCITE-4011
> URL: https://issues.apache.org/jira/browse/CALCITE-4011
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> So that parent trait can be passed down, and it can derive new traitsets from 
> child. Most importantly, as a demonstration. So that SortProjectTransposeRule 
> and ProjectSortTransposeRule can be disabled when topDownOpt is enabled.





[jira] [Resolved] (CALCITE-2354) How to add existing user-defined functions in Schema

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-2354.

Resolution: Invalid

> How to add existing user-defined functions in Schema
> 
>
> Key: CALCITE-2354
> URL: https://issues.apache.org/jira/browse/CALCITE-2354
> Project: Calcite
>  Issue Type: Bug
>  Components: jdbc-adapter
>Reporter: Subbarao
>Priority: Critical
>
> I already have some functions in an Oracle database, i.e. user-defined 
> functions like FN_CODE("columnname"). How can I add these already 
> existing functions to an Apache Calcite schema?





[jira] [Updated] (CALCITE-2997) Avoid pushing down join condition in SqlToRelConverter

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-2997:
---
Fix Version/s: 1.24.0

> Avoid pushing down join condition in SqlToRelConverter
> --
>
> Key: CALCITE-2997
> URL: https://issues.apache.org/jira/browse/CALCITE-2997
> Project: Calcite
>  Issue Type: Bug
>Reporter: Jin Xing
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.24.0
>
>
> In current code, *SqlToRelConverter:createJoin* is calling 
> *RelOptUtil.pushDownJoinConditions* for optimization. And we can find below 
> conversion from *SqlNode* to *RelNode*:
> {code:java}
> SqlNode:
> select * from A join B on A.x = B.x * 2
> RelNode (Logical-Plan):
> Join (condition:col0=col1)
> |-Project(x as col0)
> | |-Scan(A)
> |-Project(x * 2 as col1)
>   |-Scan(B){code}
> As we can see, the logical plan (*RelNode*) posted above is not a pure 
> reflection of the original SQL string (*SqlNode*). The optimization is mixed 
> into the phase in which the AST is converted to the Logical-Plan. Actually, the 
> optimization rule JoinPushExpressionsRule does exactly the same kind of thing. 
> Shall we just keep the optimization inside the Optimized-Logical-Plan? I mean, 
> shall we avoid calling *RelOptUtil.pushDownJoinConditions* in 
> *SqlToRelConverter:createJoin*?
> I raised this issue because we are doing something based on the 
> Logical-Plan, and it is really confusing that the Logical-Plan doesn't 
> correspond to the SqlNode. 
>  
>  





[jira] [Updated] (CALCITE-4016) Implement trait propagation for EnumerableCalc

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4016:
---
Fix Version/s: 1.24.0

> Implement trait propagation for EnumerableCalc
> --
>
> Key: CALCITE-4016
> URL: https://issues.apache.org/jira/browse/CALCITE-4016
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Chunwei Lei
>Priority: Major
> Fix For: 1.24.0
>
>
> It should be similar to EnumerableProject. Maybe the same logic.





[jira] [Updated] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4018:
---
Fix Version/s: 1.24.0

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Only passThrough is needed.
> Currently, when Values is created, it will enumerate all the possible 
> collations no matter parent operator requires it or not, it will be a 
> disaster if the Values has thousands of columns, and the parent operator may 
> be just a hash aggregate or hashjoin, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.





[jira] [Updated] (CALCITE-4012) Implement trait propagation for EnumerableHashJoin and EnumerableNestedLoopJoin

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4012:
---
Fix Version/s: 1.24.0

> Implement trait propagation for EnumerableHashJoin and 
> EnumerableNestedLoopJoin
> ---
>
> Key: CALCITE-4012
> URL: https://issues.apache.org/jira/browse/CALCITE-4012
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.24.0
>
>
> The traitset can be derived from the outer relation of 
> hashjoin/nestedloopjoin.
> They can also pass down parent trait request to their outer child.





[jira] [Updated] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3950:
---
Fix Version/s: 1.24.0

> Doc of SqlGroupingFunction contradicts its behavior
> ---
>
> Key: CALCITE-3950
> URL: https://issues.apache.org/jira/browse/CALCITE-3950
> Project: Calcite
>  Issue Type: Bug
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently doc of SqlGroupingFunctions says:
> {code:java}
> /**
>  * The {@code GROUPING} function.
>  *
>  * Accepts 1 or more arguments.
>  * Example: {@code GROUPING(deptno, gender)} returns
>  * 3 if both deptno and gender are being grouped,
>  * 2 if only deptno is being grouped,
>  * 1 if only gender is being grouped,
>  * 0 if neither deptno nor gender are being grouped.{code}
> But its behavior in agg.iq is as below:
> {code:java}
> # GROUPING in SELECT clause of CUBE query
> select deptno, job, count(*) as c, grouping(deptno) as d,
>   grouping(job) j, grouping(deptno, job) as x
> from "scott".emp
> group by cube(deptno, job);
> +--------+-----------+----+---+---+---+
> | DEPTNO | JOB       | C  | D | J | X |
> +--------+-----------+----+---+---+---+
> |     10 | CLERK     |  1 | 0 | 0 | 0 |
> |     10 | MANAGER   |  1 | 0 | 0 | 0 |
> |     10 | PRESIDENT |  1 | 0 | 0 | 0 |
> |     10 |           |  3 | 0 | 1 | 1 |
> |     20 | ANALYST   |  2 | 0 | 0 | 0 |
> |     20 | CLERK     |  2 | 0 | 0 | 0 |
> |     20 | MANAGER   |  1 | 0 | 0 | 0 |
> |     20 |           |  5 | 0 | 1 | 1 |
> |     30 | CLERK     |  1 | 0 | 0 | 0 |
> |     30 | MANAGER   |  1 | 0 | 0 | 0 |
> |     30 | SALESMAN  |  4 | 0 | 0 | 0 |
> |     30 |           |  6 | 0 | 1 | 1 |
> |        | ANALYST   |  2 | 1 | 0 | 2 |
> |        | CLERK     |  4 | 1 | 0 | 2 |
> |        | MANAGER   |  3 | 1 | 0 | 2 |
> |        | PRESIDENT |  1 | 1 | 0 | 2 |
> |        | SALESMAN  |  4 | 1 | 0 | 2 |
> |        |           | 14 | 1 | 1 | 3 |
> +--------+-----------+----+---+---+---+
> (18 rows)
> {code}
>  
> The doc needs to be rectified thus to be consistent with query result and the 
> behavior of Hive[1] and PostgreSQL[2]
>  [1] 
> [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction]
>  [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] 
>  
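
The behavior shown in the agg.iq result can be summarized in a small sketch. GroupingSketch and grouping are hypothetical names, not Calcite's SqlGroupingFunction; the sketch only encodes what the X column of the query result shows: each argument contributes one bit, the first argument is the most significant, and a bit is 1 when that column is rolled up (not part of the current grouping set).

```java
// Hypothetical helper for illustration only; not Calcite's SqlGroupingFunction.
final class GroupingSketch {
  /**
   * Computes the GROUPING value for a row. rolledUp[i] is true when the i-th
   * argument is NOT part of the current grouping set, i.e. it was aggregated
   * away and shows up as NULL in the result.
   */
  static long grouping(boolean... rolledUp) {
    long result = 0;
    for (boolean bit : rolledUp) {
      // Shift earlier arguments left; the first argument ends up most significant.
      result = (result << 1) | (bit ? 1L : 0L);
    }
    return result;
  }

  public static void main(String[] args) {
    // GROUPING(deptno, job) for the four kinds of rows produced by CUBE(deptno, job).
    System.out.println(grouping(false, false)); // deptno and job both grouped
    System.out.println(grouping(false, true));  // job rolled up
    System.out.println(grouping(true, false));  // deptno rolled up
    System.out.println(grouping(true, true));   // grand total row
  }
}
```

With these inputs the four calls return 0, 1, 2 and 3, matching the X column for rows such as (10, CLERK), (10, NULL), (NULL, CLERK) and the grand total.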





[jira] [Updated] (CALCITE-4009) Revert traitset mapping that was added to ProjectJoinTransposeRule

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-4009:
---
Fix Version/s: 1.24.0

> Revert traitset mapping that was added to ProjectJoinTransposeRule
> --
>
> Key: CALCITE-4009
> URL: https://issues.apache.org/jira/browse/CALCITE-4009
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Revert the traitset mapping that was added to ProjectJoinTransposeRule by 
> CALCITE-3353. It is now obsolete; we should fail fast if that happens. 
> Otherwise, all the downstream projects that use this rule will waste 
> time dealing with traitsets they don't need.





[jira] [Updated] (CALCITE-3478) Restructure tests for materialized views

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3478:
---
Fix Version/s: 1.24.0

> Restructure tests for materialized views
> 
>
> Key: CALCITE-3478
> URL: https://issues.apache.org/jira/browse/CALCITE-3478
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> h2. *Motivation*
> Currently there are two strategies for materialized view matching:
> *strategy-1*. Substitution based (SubstitutionVisitor.java) [1]
>  *strategy-2*. Plan structural information based 
> (AbstractMaterializedViewRule.java) [2]
>  The two strategies are controlled by a single connection config, 
> "materializationsEnabled". Calcite applies strategy-1 first and then 
> strategy-2.
> The two strategies are tested in a single integration test, 
> MaterializationTest.java.
>  As a result we cannot run tests separately for a single strategy, which 
> leads to:
>  # When some new matching patterns are supported by strategy-1, we might need 
> to update the old result plan, which was previously matched and generated by 
> strategy-2, e.g. [3], and the corresponding test pattern for strategy-2 will 
> be lost.
>  # Some test failures are even hidden, e.g. 
> MaterializationTest#testJoinMaterialization2 should be supported by 
> strategy-2 but is not. However, strategy-1 lets the test pass.
>  # It is hard to test the internals of SubstitutionVisitor.java, e.g. [4] has 
> to struggle to create a unit test.
> Of course we can add more system config or connection config just for testing 
> and work around some of the dilemmas mentioned above, but that would make 
> the code messy. Materialized view matching strategies are important enough 
> to deserve thorough unit tests and to be kept clean.
> Additionally, this JIRA aims to clean up the code of 
> MaterializationTest.java. As more and more fixes get applied, this Java file 
> tends to become messy:
>  # Helper methods and test methods are mixed without a clear order.
>  # There are lots of methods called checkMaterialize. We need to sort them out 
> if more params need to be added, e.g. [5]
>  # Some tests are not concise enough, e.g. testJoinMaterialization9 
> h2. *Approach*
> 1. Create unit test MaterializedViewSubstitutionVisitorTest to test strategy-1
>  2. Create unit test MaterializedViewRelOptRulesTest to test strategy-2
>  3. Move tests from MaterializationTest to unit tests correspondingly, and 
> keep MaterializationTest for integration tests.
>  
> [1] 
> [https://calcite.apache.org/docs/materialized_views.html#substitution-via-rules-transformation]
>  [2] 
> [https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information]
>  [3] 
> [https://github.com/apache/calcite/pull/1451/files#diff-d7e9e44fcb5fb1b98198415a3f78f167R1831]
>  [4] [https://github.com/apache/calcite/pull/1555]
>  [5] [https://github.com/apache/calcite/pull/1504]





[jira] [Updated] (CALCITE-3782) Bitwise operator Bit_And, Bit_OR and Bit_XOR support binary and varbinary type

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3782:
---
Fix Version/s: 1.24.0

> Bitwise operator Bit_And, Bit_OR and Bit_XOR support binary and varbinary type
> --
>
> Key: CALCITE-3782
> URL: https://issues.apache.org/jira/browse/CALCITE-3782
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.22.0
>Reporter: hailong wang
>Assignee: hailong wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> According to the discussion in CALCITE-3732, we should make bitwise 
> operators work on all integer types, BINARY and VARBINARY. So the Bit_And, 
> Bit_OR and Bit_XOR aggregate operators should also support BINARY and VARBINARY.
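
As a rough illustration of what byte-wise AND, OR and XOR over binary values could look like, here is a minimal Java sketch. The class name BinaryBitOps and the rule that both operands must have the same length are assumptions made for this example; the issue does not say how Calcite handles operands of different lengths, and this is not Calcite's implementation.

```java
/** Hypothetical helper for illustration only; not part of Calcite. */
final class BinaryBitOps {
  private BinaryBitOps() {}

  /** Byte-wise AND of two equal-length binary values. */
  static byte[] bitAnd(byte[] a, byte[] b) {
    checkSameLength(a, b);
    byte[] result = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      result[i] = (byte) (a[i] & b[i]);
    }
    return result;
  }

  /** Byte-wise OR of two equal-length binary values. */
  static byte[] bitOr(byte[] a, byte[] b) {
    checkSameLength(a, b);
    byte[] result = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      result[i] = (byte) (a[i] | b[i]);
    }
    return result;
  }

  /** Byte-wise XOR of two equal-length binary values. */
  static byte[] bitXor(byte[] a, byte[] b) {
    checkSameLength(a, b);
    byte[] result = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      result[i] = (byte) (a[i] ^ b[i]);
    }
    return result;
  }

  // Assumption for this sketch: operands of different lengths are rejected.
  private static void checkSameLength(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "operands differ in length: " + a.length + " vs " + b.length);
    }
  }

  public static void main(String[] args) {
    byte[] x = {0x0F, (byte) 0xF0};
    byte[] y = {0x3C, 0x3C};
    System.out.println(java.util.Arrays.toString(bitAnd(x, y)));
  }
}
```

An aggregate version would fold one of these functions over all input rows, starting from the first non-null value.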





[jira] [Updated] (CALCITE-3841) Change downloads page to use downloads.apache.org

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3841:
---
Fix Version/s: 1.24.0

> Change downloads page to use downloads.apache.org
> -
>
> Key: CALCITE-3841
> URL: https://issues.apache.org/jira/browse/CALCITE-3841
> Project: Calcite
>  Issue Type: Bug
>  Components: site
>Reporter: Julian Hyde
>Priority: Major
> Fix For: 1.24.0
>
>
> Infra has 
> [decided|https://lists.apache.org/thread.html/rcd2864e75e417597d342b8eb83080eb2d7a0cafea84fd4521a4d9cfd%40%3Cusers.infra.apache.org%3E]
>  (login required for that email link) to deprecate 
> [www.apache.org/dist|https://www.apache.org/dist] and move downloads to 
> [https://downloads.apache.org|https://downloads.apache.org].
> On [Calcite's downloads page|https://calcite.apache.org/downloads/], we need 
> to change the 'digest' link from (for example) 
> {{https://www.apache.org/dist/calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz.sha256}}
>  to 
> {{https://downloads.apache.org/calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz.sha256}},
>  and similarly the 'gpg' link.
> I believe that the 'tar' link can remain as 
> {{https://www.apache.org/dyn/closer.lua?filename=calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz&action=download}}
>  for the latest release and 
> {{https://archive.apache.org/dist/calcite/apache-calcite-1.20.0/apache-calcite-1.20.0-src.tar.gz}}
>  for older releases.
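
The link change described above amounts to a prefix swap on the 'digest' and 'gpg' URLs. A minimal Java sketch follows, assuming a hypothetical helper named DownloadLinks; the site itself may generate these links differently, and the issue does not specify a mechanism.

```java
/** Hypothetical helper for illustration; not part of the Calcite site build. */
final class DownloadLinks {
  private static final String OLD_PREFIX = "https://www.apache.org/dist/";
  private static final String NEW_PREFIX = "https://downloads.apache.org/";

  private DownloadLinks() {}

  /** Rewrites a www.apache.org/dist link; leaves any other URL unchanged. */
  static String migrate(String url) {
    if (url.startsWith(OLD_PREFIX)) {
      return NEW_PREFIX + url.substring(OLD_PREFIX.length());
    }
    return url;
  }

  public static void main(String[] args) {
    System.out.println(migrate(
        "https://www.apache.org/dist/calcite/apache-calcite-1.21.0/"
            + "apache-calcite-1.21.0-src.tar.gz.sha256"));
  }
}
```

Because only the www.apache.org/dist prefix is matched, the closer.lua 'tar' link and the archive.apache.org links for older releases pass through untouched, as the description suggests they should.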





[jira] [Updated] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3981:
---
Fix Version/s: 1.24.0

> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply return the subset itself. The problem is that a subset can become stale 
> when relSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl, and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a relSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> relNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leaks: (1) An active subset gets matched, and 
> the RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to a 
> relSet merge. (3) The rule match from (1) is popped from the queue and fired. 
> (4) In onMatch the rule gets the stale subset, builds new rels on top of it 
> and registers the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the registering subset itself, register should always 
> use canonize() to find and return the equivalent active subset instead.
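
The idea of returning the canonical subset instead of the argument can be sketched with a simple forwarding pointer. The classes SketchSubset and SketchPlanner below are hypothetical stand-ins written for this example; they do not reproduce VolcanoPlanner, RelSubset, or the actual canonize() code.

```java
/** Hypothetical stand-in for a planner subset; not Calcite's RelSubset. */
final class SketchSubset {
  final String name;
  /** Non-null once this subset has been merged into another one. */
  SketchSubset mergedInto;

  SketchSubset(String name) {
    this.name = name;
  }

  boolean isStale() {
    return mergedInto != null;
  }
}

/** Hypothetical stand-in for the planner's registration logic. */
final class SketchPlanner {
  private SketchPlanner() {}

  /** Follows merge pointers until it reaches the active subset. */
  static SketchSubset canonize(SketchSubset subset) {
    SketchSubset current = subset;
    while (current.mergedInto != null) {
      current = current.mergedInto;
    }
    return current;
  }

  /** Marks the active form of {@code from} as stale, pointing it at {@code to}. */
  static void merge(SketchSubset from, SketchSubset to) {
    SketchSubset source = canonize(from);
    SketchSubset target = canonize(to);
    if (source != target) {
      source.mergedInto = target;
    }
  }

  /** Returns the equivalent active subset rather than the argument itself. */
  static SketchSubset register(SketchSubset subset) {
    return canonize(subset);
  }

  public static void main(String[] args) {
    SketchSubset a = new SketchSubset("a");
    SketchSubset b = new SketchSubset("b");
    merge(a, b);
    // a is stale now, so registering it yields b.
    System.out.println(register(a).name);
  }
}
```

With this shape, a rule match that was queued while a subset was active still ends up building on the active subset, provided it goes through register() after the merge.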





[jira] [Resolved] (CALCITE-2157) ClickHouse dialect implementation

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-2157.

Fix Version/s: 1.23.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/39e58566c1ac02824d99ae9260d3315539efd57e.
 Thanks for the PR, [~chris-baynes]!

> ClickHouse dialect implementation
> -
>
> Key: CALCITE-2157
> URL: https://issues.apache.org/jira/browse/CALCITE-2157
> Project: Calcite
>  Issue Type: New Feature
>  Components: jdbc-adapter
>Reporter: Chris Baynes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.23.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> ClickHouse is a really fast columnar DBMS for OLAP: 
> [https://clickhouse.yandex/.|https://clickhouse.yandex/]
> It has a jdbc adapter and uses mostly standard sql, though there are 
> differences (e.g. join syntax, datatypes, function name case-sensitivity).





[jira] [Closed] (CALCITE-2157) ClickHouse dialect implementation

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan closed CALCITE-2157.
--

Resolved in release 1.23.0 (2020-05-24).

> ClickHouse dialect implementation
> -
>
> Key: CALCITE-2157
> URL: https://issues.apache.org/jira/browse/CALCITE-2157
> Project: Calcite
>  Issue Type: New Feature
>  Components: jdbc-adapter
>Reporter: Chris Baynes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.23.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> ClickHouse is a really fast columnar DBMS for OLAP: 
> [https://clickhouse.yandex/.|https://clickhouse.yandex/]
> It has a jdbc adapter and uses mostly standard sql, though there are 
> differences (e.g. join syntax, datatypes, function name case-sensitivity).





[jira] [Updated] (CALCITE-3366) RelDecorrelator supports Union

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3366:
---
Fix Version/s: 1.24.0

> RelDecorrelator supports Union
> --
>
> Key: CALCITE-3366
> URL: https://issues.apache.org/jira/browse/CALCITE-3366
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Jin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> This issue proposes to support decorrelation for the SQL below:
> {code:java}
> SELECT deptno FROM dept where exists
> (SELECT 1 FROM emp where sal < 100 and emp.deptno=dept.deptno
> union all
> SELECT 1 FROM emp where sal > 200 and emp.deptno=dept.deptno){code}
> This issue was found while I was resolving CALCITE-3363 in 
> https://github.com/apache/calcite/pull/1466 
> I failed to construct a semi-join operator from the SQL string.





[jira] [Updated] (CALCITE-3916) Support cascades style top-down driven rule apply

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3916:
---
Fix Version/s: 1.24.0

> Support cascades style top-down driven rule apply
> -
>
> Key: CALCITE-3916
> URL: https://issues.apache.org/jira/browse/CALCITE-3916
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Jinpeng Wu
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Apply rules in leaf RelSet -> root RelSet order. For every RelNode in a 
> RelSet, rules are matched and applied sequentially. RuleQueue and 
> DeferringRuleCall are no longer needed. This will make space pruning and 
> rule mutual exclusivity checks possible.
> Rules that use AbstractConverter as an operand are an exception: to keep 
> backward compatibility, this kind of rule still needs to be applied top-down.
> This should be done after CALCITE-3896.





[jira] [Updated] (CALCITE-3517) DiffRepository spends too much time writing XML, makes some tests 5x slower

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3517:
---
Fix Version/s: 1.24.0

> DiffRepository spends too much time writing XML, makes some tests 5x slower
> ---
>
> Key: CALCITE-3517
> URL: https://issues.apache.org/jira/browse/CALCITE-3517
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Tests that use {{DiffRepository}} are spending far too much effort writing 
> XML, even if the XML matches the reference file. For example, if I comment 
> out [a call to set(tag, 
> next)|https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/test/java/org/apache/calcite/test/DiffRepository.java#L267],
>  {{RelOptRulesTest}} improves from 32s to 6s; {{SqlToRelConverterTest}} 
> improves from 24s to 4.7s; {{SqlPrettyWriterTest}} remains at 0.8s.
> The {{DiffRepository.expand}} method is the cause of the inefficiency. It 
> causes the entire XML document to be re-generated and written to disk. This 
> is not just slow but quadratic - if a test has N cases, each test writes the 
> XML document, an effort proportional to N.
> {{DiffRepository}} should remain conservative. If one of the tests fails, and 
> a later test crashes, the output from the failed test should have been 
> written out. It is acceptable if the test remains slow if there are test 
> failures.
> {{DiffRepository}} is only used in tests; this bug does not affect production 
> code.
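
One way to stay conservative while avoiding the quadratic cost is to rewrite the log only when a result differs from the reference. The Java sketch below uses a hypothetical class, LazyDiffLog, whose flush() merely counts writes in place of regenerating and saving the XML document; it is not the real DiffRepository and does not claim to be the fix that was adopted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch for illustration; not the real DiffRepository. */
final class LazyDiffLog {
  private final Map<String, String> reference;
  private final Map<String, String> actual = new LinkedHashMap<>();
  private int writeCount;

  LazyDiffLog(Map<String, String> reference) {
    this.reference = reference;
  }

  /** Records a result and rewrites the log only if it differs from the reference. */
  void set(String tag, String value) {
    actual.put(tag, value);
    if (!Objects.equals(reference.get(tag), value)) {
      flush();
    }
  }

  /** Stands in for regenerating the whole XML document and writing it to disk. */
  private void flush() {
    writeCount++;
  }

  int writeCount() {
    return writeCount;
  }

  public static void main(String[] args) {
    Map<String, String> reference = new LinkedHashMap<>();
    reference.put("plan1", "expected");
    LazyDiffLog log = new LazyDiffLog(reference);
    log.set("plan1", "expected"); // matches the reference, no write
    log.set("plan1", "changed");  // differs, triggers a write
    System.out.println(log.writeCount());
  }
}
```

A fully passing run then performs no writes at all, while every mismatch is still written out immediately, so output from a failed test survives a later crash.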





[jira] [Updated] (CALCITE-3910) Enhance ProjectJoinTransposeRule to support SemiJoin and AntiJoin

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3910:
---
Fix Version/s: 1.24.0

> Enhance ProjectJoinTransposeRule to support SemiJoin and AntiJoin
> -
>
> Key: CALCITE-3910
> URL: https://issues.apache.org/jira/browse/CALCITE-3910
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Chunwei Lei
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, ProjectJoinTransposeRule does not support pushing a project past 
> SemiJoin and AntiJoin.
> {code:java}
> public void onMatch(RelOptRuleCall call) {
>   Project origProj = call.rel(0);
>   final Join join = call.rel(1);
>   if (!join.getJoinType().projectsRight()) {
>     return; // TODO: support SemiJoin / AntiJoin
>   }
>   ...
>   ...
> }{code}
>  





[jira] [Closed] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan closed CALCITE-3660.
--
Fix Version/s: 1.23.0
   Resolution: Fixed

Resolved in release 1.23.0 (2020-05-24).

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
> Fix For: 1.23.0
>
>
> Sample:
> https://github.com/vlsi/calcite/runs/369966426#step:5:1116
> {noformat}
> org.apache.calcite.test.PigRelBuilderStyleTest > testImplWithJoin() FAILED
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to 
> open iterator for alias t
> at org.apache.pig.PigServer.openIterator(PigServer.java:1019)
> at org.apache.pig.pigunit.PigTest.getAliasFromCache(PigTest.java:224)
> at org.apache.pig.pigunit.PigTest.getActualResults(PigTest.java:319)
> at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:385)
> at 
> org.apache.pig.pigunit.PigTest.assertOutputAnyOrder(PigTest.java:371)
> at 
> org.apache.calcite.test.PigRelBuilderStyleTest.assertScriptAndResults(PigRelBuilderStyleTest.java:263)
> at 
> org.apache.calcite.test.PigRelBuilderStyleTest.testImplWithJoin(PigRelBuilderStyleTest.java:181)
> Caused by:
> java.io.IOException: Job terminated with anomalous status FAILED
> at org.apache.pig.PigServer.openIterator(PigServer.java:1011)
> ... 6 more
> {noformat}
> There's an exception as well:
> {noformat}
> org.apache.calcite.test.PigRelBuilderStyleTest > 
> testImplWithGroupByCountDistinct() STANDARD_OUT
> 2020-01-01 15:18:31,596 [LocalJobRunner Map Task Executor #0] WARN  - 
> SchemaTupleBackend has already been initialized
> STORE t INTO 'myoutput';
> --> none
> 2020-01-01 15:18:31,734 [ForkJoinPool-1-worker-3] WARN  - 
> SchemaTupleBackend has already been initialized
> 2020-01-01 15:18:31,758 [Thread-47] WARN  - job_local79466183_0002
> java.lang.Exception: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: t: Local Rearrange[tuple]{chararray}(false) - scope-24 
> Operator Key: scope-24): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: t: Filter[bag] - scope-8 Operator Key: scope-8): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while 
> executing ForEach at [t[-1,-1]]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> Exception while executing (Name: t: Local Rearrange[tuple]{chararray}(false) 
> - scope-24 Operator Key: scope-24): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: t: Filter[bag] - scope-8 Operator Key: scope-8): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while 
> executing ForEach at [t[-1,-1]]
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNextTuple(POUnion.java:167)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   a

[jira] [Reopened] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan reopened CALCITE-3660:


> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
> Fix For: 1.23.0
>
>

[jira] [Resolved] (CALCITE-3759) Class memory leak due to code generation

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3759.

Resolution: Invalid

> Class memory leak due to code generation
> 
>
> Key: CALCITE-3759
> URL: https://issues.apache.org/jira/browse/CALCITE-3759
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Mike Villa
>Priority: Major
> Attachments: image-2020-01-28-15-55-43-215.png
>
>
> Hi, I'm using Calcite and writing unit tests to check its performance, but 
> with VisualVM or JConsole I have observed a class leak. Maybe it's my fault.
> I would be grateful if someone could help me find the error!
> I have created a GitHub project to check this error.
>  https://github.com/mvillafuertem/calcite-error.git
>  
> !image-2020-01-28-15-55-43-215.png!
>  





[jira] [Resolved] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3660.

Resolution: Won't Fix

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
> Fix For: 1.23.0
>
>

[jira] [Commented] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115328#comment-17115328
 ] 

Haisheng Yuan commented on CALCITE-3660:


Won't fix

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
> Fix For: 1.23.0
>
>
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.uti

[jira] [Updated] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3660:
---
Fix Version/s: (was: 1.23.0)

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>

[jira] [Issue Comment Deleted] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3660:
---
Comment: was deleted

(was: Resolved in release 1.23.0 (2020-05-24).)

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>

[jira] [Commented] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115337#comment-17115337
 ] 

Haisheng Yuan commented on CALCITE-3660:


The issue has been there for 3 years. 

https://issues.apache.org/jira/browse/CALCITE-1561?focusedCommentId=15901638&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15901638

and similarly
https://issues.apache.org/jira/browse/CALCITE-1751

I don't expect anyone to take a look. But feel free to reopen it if you 
think it will be fixed in another 3 years. :)

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>

[jira] [Commented] (CALCITE-3674) EnumerableMergeJoinRule fails with NPE on nullable join keys

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115342#comment-17115342
 ] 

Haisheng Yuan commented on CALCITE-3674:


[~vladimirsitnikov], does the following commit solve this issue?
https://github.com/apache/calcite/commit/bfde14be1284efc6d4560868fef3724238c35dc3

> EnumerableMergeJoinRule fails with NPE on nullable join keys
> 
>
> Key: CALCITE-3674
> URL: https://issues.apache.org/jira/browse/CALCITE-3674
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Sample exception:
> {noformat}
> Caused by: java.lang.NullPointerException
> at java.lang.Short.compareTo(Short.java:445)
> at java.lang.Short.compareTo(Short.java:43)
> at 
> org.apache.calcite.linq4j.EnumerableDefaults$MergeJoinEnumerator.advance(EnumerableDefaults.java:3866)
> at 
> org.apache.calcite.linq4j.EnumerableDefaults$MergeJoinEnumerator.moveNext(EnumerableDefaults.java:3918)
> at 
> org.apache.calcite.linq4j.EnumerableDefaults.aggregate(EnumerableDefaults.java:118)
> at 
> org.apache.calcite.linq4j.DefaultEnumerable.aggregate(DefaultEnumerable.java:104)
> at Baz.bind(Unknown Source)
> at 
> org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:355)
> at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:315){noformat}
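
The top frames of the trace above show `Short.compareTo` being called on a key that can be null. The following is a minimal sketch of that failure mode and of a null-tolerant ordering, assuming boxed `Short` keys; the class and method names are hypothetical and this is not Calcite's `MergeJoinEnumerator` code or its actual fix.

```java
import java.util.Comparator;

/** Illustrative only; names are hypothetical and not part of Calcite. */
final class NullableKeyCompare {
  /** Natural ordering on boxed keys: fails as soon as either key is null. */
  static int naiveCompare(Short left, Short right) {
    return left.compareTo(right); // NullPointerException if left or right is null
  }

  /**
   * Null-tolerant ordering that places nulls first. Whether null keys should
   * ever match in a join is a separate semantic question this does not answer.
   */
  static final Comparator<Short> NULLS_FIRST =
      Comparator.nullsFirst(Comparator.<Short>naturalOrder());

  public static void main(String[] args) {
    Short one = 1;
    System.out.println(NULLS_FIRST.compare(null, one) < 0);
    try {
      naiveCompare(one, null);
    } catch (NullPointerException e) {
      System.out.println("NPE from Short.compareTo");
    }
  }
}
```

An ordering like this only avoids the crash while advancing both inputs; a merge join would still have to skip or reject null keys explicitly if SQL equality semantics are required.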





[jira] [Commented] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115354#comment-17115354
 ] 

Haisheng Yuan commented on CALCITE-3660:


Thanks for the explanation, which makes a lot of sense.

> PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 
> 1066: Unable to open iterator for alias t
> -
>
> Key: CALCITE-3660
> URL: https://issues.apache.org/jira/browse/CALCITE-3660
> Project: Calcite
>  Issue Type: Bug
>  Components: pig-adapter
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>

[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115376#comment-17115376
 ] 

Haisheng Yuan commented on CALCITE-3221:


Ah, apparently I was wrong. I was thinking about it with a Greenplum or 
MaxCompute mindset, which doesn't apply to Enumerable operators.

In both products, UNION is rewritten to an Aggregate (distinct) on top of 
UNION ALL.
There is only a PhysicalUnionAll operator, and no PhysicalUnion (where all is 
false) operator. If the parent operator requests a collation, PhysicalUnionAll 
can generate an alternative with the required collation, while requiring its 
children to satisfy the same collation. It doesn't need an additional field to 
indicate order preservation: if the PhysicalUnionAll's collation is not empty, 
it has to do a sorted merge.

Separating the distinct operation has some benefits. The distinct can be 
hash-based or sort-based; UNION ALL doesn't need to worry about it and just 
merges the data (with or without order). Because aggregate and sort are among 
the most complex operators (window is the most complex one), we don't want to 
duplicate that complex logic in the UNION operator. In addition, the distinct 
can use the existing multi-stage StreamAgg / HashAgg operators; we certainly 
don't want to create a 2-stage UNION operator.

And this is SQL Server's execution plan:
 !screenshot-1.png! 

But we don't need to worry about any of that for EnumerableConvention, which 
is non-MPP and in-memory, and the existing operator is already hash-based, so 
it totally makes sense to add another physical operator that does sort-based 
union.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Updated] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3221:
---
Attachment: screenshot-1.png

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115390#comment-17115390
 ] 

Haisheng Yuan commented on CALCITE-3221:


If we can adopt the same strategy, some downstream projects may benefit from 
it, e.g. HerdDB:

https://github.com/diennea/herddb/blob/ae2dc79b193d98e3b53bf43af2ba5018328b85b0/herddb-core/src/main/java/herddb/sql/CalcitePlanner.java#L1150

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115416#comment-17115416
 ] 

Haisheng Yuan commented on CALCITE-4018:


OK, no problem.

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Only passThrough is needed.
> Currently, when Values is created, it enumerates all possible collations 
> whether or not the parent operator requires them. This will be a disaster if 
> the Values has thousands of columns and the parent operator is just a hash 
> aggregate or hash join, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.
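
The on-demand idea can be illustrated with a small check: given a required sort order, test whether the literal rows already satisfy it, rather than enumerating every possible collation up front. This is a minimal sketch assuming integer columns and ascending order only; `ValuesCollationCheck` and `isSortedBy` are hypothetical names and not Calcite's passThrough implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only; names are hypothetical and not part of Calcite. */
final class ValuesCollationCheck {
  /** True if the rows are already in ascending order on the given 0-based key columns. */
  static boolean isSortedBy(List<int[]> rows, int[] keys) {
    for (int i = 1; i < rows.size(); i++) {
      if (compare(rows.get(i - 1), rows.get(i), keys) > 0) {
        return false;
      }
    }
    return true;
  }

  /** Lexicographic comparison of two rows on the key columns. */
  private static int compare(int[] left, int[] right, int[] keys) {
    for (int k : keys) {
      int c = Integer.compare(left[k], right[k]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    // The tuples from the query above, with columns (a, b).
    List<int[]> tuples = Arrays.asList(
        new int[] {1, 1}, new int[] {2, 1}, new int[] {1, 2}, new int[] {2, 2});
    System.out.println(isSortedBy(tuples, new int[] {1, 0})); // ORDER BY b, a -> true
    System.out.println(isSortedBy(tuples, new int[] {0, 1})); // ORDER BY a, b -> false
  }
}
```

For the example query the rows are already ordered by (b, a), which is why a plan without Sort is possible; the check costs one pass over the rows per requested collation instead of one per candidate collation.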





[jira] [Assigned] (CALCITE-4018) EnumerableValues should provide requested traits

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan reassigned CALCITE-4018:
--

Assignee: Haisheng Yuan

> EnumerableValues should provide requested traits
> 
>
> Key: CALCITE-4018
> URL: https://issues.apache.org/jira/browse/CALCITE-4018
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Only passThrough is needed.
> Currently, when Values is created, it enumerates all possible collations 
> whether or not the parent operator requires them. This will be a disaster if 
> the Values has thousands of columns and the parent operator is just a hash 
> aggregate or hash join, which doesn't care about its collation.
> The collation should be created on demand by calling passThrough.
> e.g.
> {code:java}
> SELECT * from (values
> (1, 1),
> (2, 1),
> (1, 2),
> (2, 2)
> ) as t(a, b)
> order by b, a
> {code}
> Currently Calcite will generate plan:
> {code:java}
> EnumerableSort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
>   EnumerableValues(tuples=[[{ 1, 1 }, { 2, 1 }, { 1, 2 }, { 2, 2 }]])
> {code}
> But after this JIRA, I am expecting a plan without Sort.
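
To illustrate the on-demand idea in the description above, here is a minimal sketch, not Calcite's implementation: when a parent asks for a collation, the Values node only has to check whether its literal rows already satisfy that one ordering. The class {{ValuesCollationSketch}} and the method {{satisfies}} are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for an on-demand collation check; not Calcite code. */
public class ValuesCollationSketch {

  /** Returns true if the rows are already sorted ascending on the given column indexes. */
  public static boolean satisfies(List<int[]> rows, int[] sortColumns) {
    for (int i = 1; i < rows.size(); i++) {
      if (compare(rows.get(i - 1), rows.get(i), sortColumns) > 0) {
        return false;
      }
    }
    return true;
  }

  /** Compares two rows column by column, in the order given by sortColumns. */
  private static int compare(int[] a, int[] b, int[] sortColumns) {
    for (int col : sortColumns) {
      int c = Integer.compare(a[col], b[col]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    // The tuples from the example query, as (a, b) pairs.
    List<int[]> rows = Arrays.asList(
        new int[] {1, 1}, new int[] {2, 1}, new int[] {1, 2}, new int[] {2, 2});
    // ORDER BY b, a corresponds to column indexes 1, 0.
    System.out.println(satisfies(rows, new int[] {1, 0}));
    // ORDER BY a, b uses column indexes 0, 1.
    System.out.println(satisfies(rows, new int[] {0, 1}));
  }
}
```

The check costs one pass over the rows for the single requested ordering, instead of enumerating every ordering the rows happen to satisfy up front.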





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115423#comment-17115423
 ] 

Haisheng Yuan commented on CALCITE-3221:


No, I don't think combining concatenation and distinct into a single UNION 
operator is good practice at the optimizer level in the long term.

If we create 2 physical operators (EnumerableUnion, EnumerableMergeUnion), then
- EnumerableUnion needs to take care of concatenation (all=true) and hash-based 
distinct (all=false) logic. This is what we currently have.
- EnumerableMergeUnion needs to take care of concatenation (all=true, 
merge=false), sorted-merge concatenation (all=true, merge=true), and 
sorted-merge distinct (all=false, merge=true).

If we go this way, I don't think downstream projects can just reuse the plan 
generated by Calcite.

If we only keep physical UNION ALL, then we only need 
concatenation (merge=false) and concatenation (merge=true). 
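
As a rough sketch of the split discussed above (hypothetical names, plain lists instead of Enumerables, not code from Calcite): a sorted-merge concatenation keeps the input order and all duplicates, and distinct can then be a separate step over the already sorted stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; class and method names are made up for this example. */
public class MergeUnionSketch {

  /** UNION ALL with merge=true: merges two ascending inputs, keeping order and duplicates. */
  public static List<Integer> mergeAll(List<Integer> left, List<Integer> right) {
    List<Integer> out = new ArrayList<>(left.size() + right.size());
    int i = 0;
    int j = 0;
    while (i < left.size() && j < right.size()) {
      if (left.get(i) <= right.get(j)) {
        out.add(left.get(i++));
      } else {
        out.add(right.get(j++));
      }
    }
    while (i < left.size()) {
      out.add(left.get(i++));
    }
    while (j < right.size()) {
      out.add(right.get(j++));
    }
    return out;
  }

  /** Distinct over a sorted input: drops adjacent duplicates, no hash set needed. */
  public static List<Integer> distinctSorted(List<Integer> sorted) {
    List<Integer> out = new ArrayList<>();
    for (Integer v : sorted) {
      if (out.isEmpty() || !out.get(out.size() - 1).equals(v)) {
        out.add(v);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> merged = mergeAll(Arrays.asList(1, 3, 5), Arrays.asList(1, 2, 5));
    System.out.println(merged);
    System.out.println(distinctSorted(merged));
  }
}
```

Neither step has to buffer its whole input before producing the first row, which is the property the issue description asks for.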

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Assigned] (CALCITE-3841) Change downloads page to use downloads.apache.org

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan reassigned CALCITE-3841:
--

Assignee: Haisheng Yuan

> Change downloads page to use downloads.apache.org
> -
>
> Key: CALCITE-3841
> URL: https://issues.apache.org/jira/browse/CALCITE-3841
> Project: Calcite
>  Issue Type: Bug
>  Components: site
>Reporter: Julian Hyde
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Infra has 
> [decided|https://lists.apache.org/thread.html/rcd2864e75e417597d342b8eb83080eb2d7a0cafea84fd4521a4d9cfd%40%3Cusers.infra.apache.org%3E]
>  (login required for that email link) to deprecate 
> [www.apache.org/dist|https://www.apache.org/dist] and move downloads to 
> [https://downloads.apache.org|https://downloads.apache.org].
> On [Calcite's downloads page|https://calcite.apache.org/downloads/], we need 
> to change the 'digest' link from (for example) 
> {{https://www.apache.org/dist/calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz.sha256}}
>  to 
> {{https://downloads.apache.org/calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz.sha256}},
>  and similarly the 'gpg' link.
> I believe that the 'tar' link can remain as 
> {{https://www.apache.org/dyn/closer.lua?filename=calcite/apache-calcite-1.21.0/apache-calcite-1.21.0-src.tar.gz&action=download}}
>  for the latest release and 
> {{https://archive.apache.org/dist/calcite/apache-calcite-1.20.0/apache-calcite-1.20.0-src.tar.gz}}
>  for older releases.





[jira] [Assigned] (CALCITE-4009) Revert traitset mapping that was added to ProjectJoinTransposeRule

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan reassigned CALCITE-4009:
--

Assignee: Haisheng Yuan

> Revert traitset mapping that was added to ProjectJoinTransposeRule
> --
>
> Key: CALCITE-4009
> URL: https://issues.apache.org/jira/browse/CALCITE-4009
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>
> Revert traitset mapping that was added to ProjectJoinTransposeRule by 
> CALCITE-3353. It is now obsolete, and we should fail fast if that happens. 
> Otherwise, all the downstream projects that use this rule will waste time 
> dealing with traitsets they don't need.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2020-05-24 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115437#comment-17115437
 ] 

Haisheng Yuan commented on CALCITE-3221:


I think HerdDB is not using it; this rule is not in the default rule set 
{{Programs.RULE_SET}}. 
I am OK with either way. 
Who knows, maybe in the next few years we (or some other contributors) will 
want to extend Enumerable to support distributed plans. At that time, 
{{EnumerableUnion(all=false)}} will be dropped.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Resolved] (CALCITE-3985) Simplify grouped window function in parser

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3985.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/2ba551905c4ae99be2c68e0c75301d9287d7b61a,
 thanks for the PR, [~amaliujia]!

> Simplify grouped window function in parser
> --
>
> Key: CALCITE-3985
> URL: https://issues.apache.org/jira/browse/CALCITE-3985
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently in parser, there is [1]:
> {code:java}
> SqlCall GroupByWindowingCall():
> {
>     final Span s;
>     final List<SqlNode> args;
>     final SqlOperator op;
> }
> {
>     (
>         <TUMBLE>
>         {
>             s = span();
>             op = SqlStdOperatorTable.TUMBLE_OLD;
>         }
>     |
>         <HOP>
>         {
>             s = span();
>             op = SqlStdOperatorTable.HOP_OLD;
>         }
>     |
>         <SESSION>
>         {
>             s = span();
>             op = SqlStdOperatorTable.SESSION_OLD;
>         }
>     )
>     args = UnquantifiedFunctionParameterList(ExprContext.ACCEPT_SUB_QUERY) {
>         return op.createCall(s.end(this), args);
>     }
> }
> {code}
> The {{s = span()}} assignments are duplicated; there should be a way to keep 
> only one.
> [1]: 
> https://github.com/apache/calcite/blob/master/core/src/main/codegen/templates/Parser.jj#L6049





[jira] [Resolved] (CALCITE-3988) Intersect in RelMdRowCount doesn't take into account 'intersect all'

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3988.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/7952cd550a7fac127a6cd7db44fd70c9d1e16d50,
 thanks for the PR, [~xzh_dz]!

> Intersect in RelMdRowCount doesn't take into account 'intersect all' 
> -
>
> Key: CALCITE-3988
> URL: https://issues.apache.org/jira/browse/CALCITE-3988
> Project: Calcite
>  Issue Type: Wish
>Reporter: xzh_dz
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Same as 
> [https://issues.apache.org/jira/browse/CALCITE-3287|https://issues.apache.org/jira/browse/CALCITE-3287].
> Intersect in RelMdRowCount.java doesn't take 'intersect all' into account.
> {code:java}
> public Double getRowCount(Intersect rel, RelMetadataQuery mq) {
>   Double rowCount = null;
>   for (RelNode input : rel.getInputs()) {
>     Double partialRowCount = mq.getRowCount(input);
>     if (rowCount == null
>         || partialRowCount != null && partialRowCount < rowCount) {
>       rowCount = partialRowCount;
>     }
>   }
>   return rowCount;
> }
> {code}
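
For readers less familiar with the distinction, the sketch below shows only the SQL semantics of INTERSECT versus INTERSECT ALL on small in-memory lists. It says nothing about how the estimate in RelMdRowCount should change, which the description above does not specify. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

/** Illustrates set vs. multiset intersection; not Calcite code. */
public class IntersectSketch {

  /** INTERSECT (distinct): each common value appears once. */
  public static List<String> intersect(List<String> a, List<String> b) {
    LinkedHashSet<String> result = new LinkedHashSet<>(a);
    result.retainAll(b);
    return new ArrayList<>(result);
  }

  /** INTERSECT ALL: each common value appears min(count in a, count in b) times. */
  public static List<String> intersectAll(List<String> a, List<String> b) {
    Map<String, Integer> remaining = new HashMap<>();
    for (String s : b) {
      remaining.merge(s, 1, Integer::sum);
    }
    List<String> out = new ArrayList<>();
    for (String s : a) {
      Integer n = remaining.get(s);
      if (n != null && n > 0) {
        out.add(s);
        remaining.put(s, n - 1);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> a = Arrays.asList("x", "x", "y");
    List<String> b = Arrays.asList("x", "x", "x", "z");
    System.out.println(intersect(a, b));
    System.out.println(intersectAll(a, b));
  }
}
```

Because INTERSECT ALL keeps duplicates, the two variants can return different numbers of rows for the same inputs.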





[jira] [Resolved] (CALCITE-3910) Enhance ProjectJoinTransposeRule to support SemiJoin and AntiJoin

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3910.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/39cb3e3619a528bb39a677598f16a26f10afbfaf,
 thanks for the PR, [~fan_li_ya]!

> Enhance ProjectJoinTransposeRule to support SemiJoin and AntiJoin
> -
>
> Key: CALCITE-3910
> URL: https://issues.apache.org/jira/browse/CALCITE-3910
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Chunwei Lei
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, ProjectJoinTransposeRule does not support pushing a Project past 
> SemiJoin and AntiJoin.
> {code:java}
> public void onMatch(RelOptRuleCall call) {
>   Project origProj = call.rel(0);
>   final Join join = call.rel(1);
>   if (!join.getJoinType().projectsRight()) {
> return; // TODO: support SemiJoin / AntiJoin
>   }
>   ...
>   ...
> }{code}
>  





[jira] [Resolved] (CALCITE-3999) Simplify DialectPool implementation using Guava cache

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3999.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/258f791eff12b4bac0fc11b4025aa5d99dcbbed1,
 thanks for the PR, [~jcamachorodriguez]!

> Simplify DialectPool implementation using Guava cache
> -
>
> Key: CALCITE-3999
> URL: https://issues.apache.org/jira/browse/CALCITE-3999
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> JdbcUtils contains a pool to cache SqlDialect objects. Currently, it relies 
> on multiple maps and a synchronized {{get}} method. Although I am not very 
> familiar with that code, it seems the implementation could be made simpler 
> and more efficient by using a Guava cache. In addition, since we would not 
> have a single synchronized get method, multiple threads could concurrently 
> create dialects for distinct data sources.
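
A minimal sketch of the concurrency point in the description above. It deliberately swaps in the JDK's {{ConcurrentHashMap.computeIfAbsent}} for the Guava cache named in the issue so that the example is self-contained; it is not the code that was committed. {{DialectCacheSketch}} and its loader function are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Illustrative per-key cache without one global synchronized get; not Calcite code. */
public class DialectCacheSketch<K, V> {
  private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  public DialectCacheSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  /** Loads the value at most once per key; threads asking for different keys rarely contend. */
  public V get(K key) {
    return cache.computeIfAbsent(key, loader);
  }

  public static void main(String[] args) {
    // The loader is a stand-in for the work of creating a dialect for a data source.
    DialectCacheSketch<String, String> pool =
        new DialectCacheSketch<>(dataSource -> "dialect-for-" + dataSource);
    System.out.println(pool.get("db1"));
    System.out.println(pool.get("db1"));
  }
}
```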





[jira] [Resolved] (CALCITE-3478) Restructure tests for materialized views

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3478.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/4fbb930491283f2e68b91f69597958c2d33b2b18,
 thanks for the PR, [~jinxing6...@126.com]!

> Restructure tests for materialized views
> 
>
> Key: CALCITE-3478
> URL: https://issues.apache.org/jira/browse/CALCITE-3478
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> h2. *Motivation*
> Currently there are two strategies for materialized view matching:
> *strategy-1*. Substitution based (SubstitutionVisitor.java) [1]
>  *strategy-2*. Plan structural information based 
> (AbstractMaterializedViewRule.java) [2]
>  The two strategies are controlled by a single connection config, 
> "materializationsEnabled". Calcite applies strategy-1 first and then 
> strategy-2.
> The two strategies are tested in a single integration test called 
> MaterializationTest.java.
>  As a result we cannot run tests separately for a single strategy, which 
> leads to:
>  # When some new matching patterns are supported by strategy-1, we might need 
> to update the old result plan, which was previously matched and generated by 
> strategy-2, e.g. [3], and the corresponding test pattern for strategy-2 will 
> be lost.
>  # Some test failures are even hidden, e.g. 
> MaterializationTest#testJoinMaterialization2 should be supported by 
> strategy-2 but is not. However, strategy-1 lets the test pass.
>  # It is hard to test internals of SubstitutionVisitor.java, e.g. [4] has to 
> struggle to create a unit test.
> Of course we can add more system config or connection config just for testing 
> and work around some of the dilemmas I mentioned above. But it will make 
> the code messy. Materialized view matching strategies are important enough to 
> deserve thorough unit tests and to be kept clean.
> Additionally, this JIRA aims to clean up the code of 
> MaterializationTest.java. As more and more fixes get applied, this Java file 
> tends to be messy:
>  # Helper methods and test methods are mixed without good order.
>  # Lots of methods are called checkMaterialize. We need to sort them out if 
> there is a need to add more params, e.g. [5]
>  # Some tests are not concise enough, e.g. testJoinMaterialization9 
> h2. *Approach*
> 1. Create unit test MaterializedViewSubstitutionVisitorTest to test strategy-1
>  2. Create unit test MaterializedViewRelOptRulesTest to test strategy-2
>  3. Move tests from MaterializationTest to unit tests correspondingly, and 
> keep MaterializationTest for integration tests.
>  
> [1] 
> [https://calcite.apache.org/docs/materialized_views.html#substitution-via-rules-transformation]
>  [2] 
> [https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information]
>  [3] 
> [https://github.com/apache/calcite/pull/1451/files#diff-d7e9e44fcb5fb1b98198415a3f78f167R1831]
>  [4] [https://github.com/apache/calcite/pull/1555]
>  [5] [https://github.com/apache/calcite/pull/1504]





[jira] [Resolved] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3950.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/b9a2075ae125c318b6909b95d7ff0192fcec9149.

> Doc of SqlGroupingFunction contradicts its behavior
> ---
>
> Key: CALCITE-3950
> URL: https://issues.apache.org/jira/browse/CALCITE-3950
> Project: Calcite
>  Issue Type: Bug
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently doc of SqlGroupingFunctions says:
> {code:java}
> /**
>  * The {@code GROUPING} function.
>  *
>  * Accepts 1 or more arguments.
>  * Example: {@code GROUPING(deptno, gender)} returns
>  * 3 if both deptno and gender are being grouped,
>  * 2 if only deptno is being grouped,
>  * 1 if only gender is being groped,
>  * 0 if neither deptno nor gender are being grouped.{code}
> But its behavior in agg.iq is as below:
> {code:java}
> # GROUPING in SELECT clause of CUBE query
> select deptno, job, count(*) as c, grouping(deptno) as d,
>   grouping(job) j, grouping(deptno, job) as x
> from "scott".emp
> group by cube(deptno, job);
> +--------+-----------+----+---+---+---+
> | DEPTNO | JOB       | C  | D | J | X |
> +--------+-----------+----+---+---+---+
> |     10 | CLERK     |  1 | 0 | 0 | 0 |
> |     10 | MANAGER   |  1 | 0 | 0 | 0 |
> |     10 | PRESIDENT |  1 | 0 | 0 | 0 |
> |     10 |           |  3 | 0 | 1 | 1 |
> |     20 | ANALYST   |  2 | 0 | 0 | 0 |
> |     20 | CLERK     |  2 | 0 | 0 | 0 |
> |     20 | MANAGER   |  1 | 0 | 0 | 0 |
> |     20 |           |  5 | 0 | 1 | 1 |
> |     30 | CLERK     |  1 | 0 | 0 | 0 |
> |     30 | MANAGER   |  1 | 0 | 0 | 0 |
> |     30 | SALESMAN  |  4 | 0 | 0 | 0 |
> |     30 |           |  6 | 0 | 1 | 1 |
> |        | ANALYST   |  2 | 1 | 0 | 2 |
> |        | CLERK     |  4 | 1 | 0 | 2 |
> |        | MANAGER   |  3 | 1 | 0 | 2 |
> |        | PRESIDENT |  1 | 1 | 0 | 2 |
> |        | SALESMAN  |  4 | 1 | 0 | 2 |
> |        |           | 14 | 1 | 1 | 3 |
> +--------+-----------+----+---+---+---+
> (18 rows)
> {code}
>  
> The doc needs to be corrected to be consistent with the query result and the 
> behavior of Hive[1] and PostgreSQL[2].
>  [1] 
> [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction]
>  [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] 
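
The query result above is consistent with reading GROUPING as a bit mask in which the leftmost argument is the most significant bit, and a bit is 1 when that column is NOT part of the current grouping set (it shows as blank in the row). A small sketch of that arithmetic, with a hypothetical helper name:

```java
/** Illustrates the bit arithmetic behind the D, J and X columns; not Calcite code. */
public class GroupingSketch {

  /**
   * Computes the GROUPING value for a row.
   *
   * @param rolledUp one flag per GROUPING argument, in argument order; true when
   *     the column is not in the current grouping set
   */
  public static long grouping(boolean... rolledUp) {
    long result = 0;
    for (boolean r : rolledUp) {
      // Shift earlier arguments left so the first argument ends up most significant.
      result = (result << 1) | (r ? 1 : 0);
    }
    return result;
  }

  public static void main(String[] args) {
    // Arguments are (deptno, job), matching grouping(deptno, job) in the query.
    System.out.println(grouping(false, false)); // row "10 | CLERK": X = 0
    System.out.println(grouping(false, true));  // row "10 | (blank)": X = 1
    System.out.println(grouping(true, false));  // row "(blank) | ANALYST": X = 2
    System.out.println(grouping(true, true));   // grand total row: X = 3
  }
}
```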
>  





[jira] [Resolved] (CALCITE-3761) How to write a rule with optional intermediate operands?

2020-05-24 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3761.

Resolution: Invalid

Hi [~anjalishrishrimal], you can send your question to the dev mailing list. 
People will help you there. 

> How to write a rule with optional intermediate operands?
> 
>
> Key: CALCITE-3761
> URL: https://issues.apache.org/jira/browse/CALCITE-3761
> Project: Calcite
>  Issue Type: Wish
>  Components: core
>Reporter: anjali shrishrimal
>Priority: Trivial
>
> I want to write a rule that matches a plan based only on the root/top RelNode 
> and the leaf RelNode; all intermediate RelNodes are optional.
> What operands should be passed to such a rule?
>  
> Suppose the logical plan is as given below.
> {code:java}
> LogicalRelNode4
>   LogicalRelNode3 (optional)
>     LogicalRelNode2 (optional)
>       LogicalRelNode1
> {code}
> LogicalRelNode2 and LogicalRelNode3 are optional. The rule should match the 
> structure irrespective of the presence of these optional nodes.
> The rule should match all of the following structures.
> {code:java}
> 1. LogicalRelNode4
>      LogicalRelNode3
>        LogicalRelNode2
>          LogicalRelNode1
> 2. LogicalRelNode4
>      LogicalRelNode2
>        LogicalRelNode1
> 3. LogicalRelNode4
>      LogicalRelNode3
>        LogicalRelNode1
> 4. LogicalRelNode4
>      LogicalRelNode1
> {code}
>  





[jira] [Created] (CALCITE-4023) Remove or deprecate ProjectSortTransposeRule

2020-05-26 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4023:
--

 Summary: Remove or deprecate ProjectSortTransposeRule
 Key: CALCITE-4023
 URL: https://issues.apache.org/jira/browse/CALCITE-4023
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


It never worked. The check condition {{if (sort.getClass() != Sort.class)}} is 
always true.
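
A small illustration of why a check of this shape rejects every node that is an instance of a subclass. The classes below are hypothetical stand-ins, and the sketch assumes (this is not stated in the issue) that the planner only ever hands the rule subclass instances such as a logical or physical sort.

```java
/** Illustrates exact-class comparison vs. instanceof; the classes are stand-ins. */
public class GetClassSketch {

  /** Stand-in for a base operator class such as Sort. */
  public static class BaseSort {}

  /** Stand-in for a concrete subclass such as a logical sort. */
  public static class LogicalSortLike extends BaseSort {}

  /** Mirrors the shape of the check quoted above: true means the rule bails out. */
  public static boolean isRejected(BaseSort sort) {
    return sort.getClass() != BaseSort.class;
  }

  /** A subtype-aware alternative, shown only for contrast. */
  public static boolean isSort(Object node) {
    return node instanceof BaseSort;
  }

  public static void main(String[] args) {
    BaseSort node = new LogicalSortLike();
    // getClass() returns the runtime class, which is the subclass, so the check fires.
    System.out.println(isRejected(node));
    System.out.println(isSort(node));
  }
}
```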





[jira] [Created] (CALCITE-4024) In top-down optimizer, forbid Sort (non-limit) to participate any rule matches

2020-05-26 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4024:
--

 Summary: In top-down optimizer, forbid Sort (non-limit) to 
participate any rule matches
 Key: CALCITE-4024
 URL: https://issues.apache.org/jira/browse/CALCITE-4024
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


In the top-down optimizer, forbid enforcer operators, e.g. Sort (non-limit), 
from participating in any rule matches.





[jira] [Commented] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-26 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116911#comment-17116911
 ] 

Haisheng Yuan commented on CALCITE-3972:


The fact that Sort can participate in rule matching is the culprit.
Sort changes the relation's physical properties, but doesn't change its logical 
properties.
Using 
[testSortJoinTranspose2|https://github.com/apache/calcite/commit/0715f5b55f363a58e3dd8c20caac0024e19be413#diff-de15ea9da479ca31d38de70365967392R4070]
 as an example,

{code:java}
Before:
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
  LogicalSort(sort0=[$10], dir0=[ASC])
    LogicalJoin(condition=[=($7, $9)], joinType=[right])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalProject(DEPTNO=[$0], NAME=[$1])
        LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

After:
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
  LogicalSort(sort0=[$10], dir0=[ASC])
    LogicalJoin(condition=[=($7, $9)], joinType=[right])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalSort(sort0=[$1], dir0=[ASC])
        LogicalProject(DEPTNO=[$0], NAME=[$1])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

If we combine the sort with any operator in the original plan, the logical 
properties are all the same. After the rule executes, LogicalJoin has a new 
right input. Even if the sort in the right input could be in the same RelSet as 
LogicalProject (unfortunately it isn't), the join's right input has changed 
(the logical join is requesting a collation on its right input), so the join's 
digest changes. It is viewed as a whole new join, and all the logical 
transformations that can apply to it are applied again.

Although the rule above only applies to outer joins, the same problem happens 
with SortProjectTransposeRule.

Now, coming back to the problem in JDBCTest.testJoinManyWays(): one of 
JoinPushThroughJoinRule's operands is RelNode.class, which means any new node 
in the join's input RelSet will trigger the rule. But in this rule, we don't 
care what exact RelNode it is; we just want the whole group as a placeholder. 
Any new logical sort, physical sort, or abstract converter will trigger a 
match of JoinPushThroughJoinRule. This is entirely unnecessary.

If we change

{code:java}
operand(RelNode.class, any())),
{code}
to

{code:java}
operandJ(RelNode.class, null, n -> !n.isEnforcer(), any())),
{code}

It will achieve the same effect as generating EnumerableSort directly, while 
still generating LogicalSort in RelCollationTraitDef, without affecting rules 
like SortProjectTranspose, SortJoinTranspose, and SortJoinCopy.
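
A toy model of the effect of the predicate, using stand-in types rather than Calcite's RelNode and operand classes: with an unfiltered operand, every new node registered in the input group is a potential match; with the enforcer filter, only non-enforcer nodes are. The node names and their enforcer flags below are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of operand filtering; the types here are hypothetical, not Calcite's. */
public class OperandFilterSketch {

  /** Stand-in for a plan node; only the enforcer flag matters here. */
  public static final class Node {
    private final String name;
    private final boolean enforcer;

    public Node(String name, boolean enforcer) {
      this.name = name;
      this.enforcer = enforcer;
    }

    public String getName() {
      return name;
    }

    public boolean isEnforcer() {
      return enforcer;
    }
  }

  /** Counts how many nodes in an input group would trigger a rule match. */
  public static long countMatches(List<Node> group, Predicate<Node> operandPredicate) {
    return group.stream().filter(operandPredicate).count();
  }

  public static void main(String[] args) {
    // Enforcer flags are assumptions made for this illustration.
    List<Node> group = Arrays.asList(
        new Node("LogicalProject", false),
        new Node("LogicalSort", true),
        new Node("EnumerableSort", true),
        new Node("AbstractConverter", true));
    System.out.println(countMatches(group, n -> true));
    System.out.println(countMatches(group, n -> !n.isEnforcer()));
  }
}
```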

The total number of JoinPushThroughJoinRule applications is cut from 9000 to 
900, a 90% reduction. This in turn greatly reduces ProjectMergeRule 
applications, because every join reorder generates at least one new 
LogicalProject in Calcite. 

Now the rule count is:

{code:java}
Rules                                           Attempts   Time (us)
ProjectMergeRule:force_mode                       14,064   2,680,177
EnumerableProjectRule(in:NONE,out:ENUMERABLE)        974     271,608
JoinPushThroughJoinRule:left                         449     209,768
JoinPushThroughJoinRule:right                        449       2,949
AggregatePullUpConstantsRule                         291      17,947
AggregateProjectMergeRule                            277      83,288
ProjectFilterTransposeRule                           207      30,300
EnumerableJoinRule(in:NONE,out:ENUMERABLE)           108      70,179
EnumerableMergeJoinRule(in:NONE,out:ENUMERABLE)      108      46,111
JoinPushExpressionsRule                              108      10,807
{code}





> Allow RelBuilder to create RelNode with convention and use it for trait 
> convert
> ---
>
> Key: CALCITE-3972
> URL: https://issues.apache.org/jira/browse/CALCITE-3972
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Assignee: Xiening Dai
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> 1. Provide Convention.transformRelBuilder() to transform an existing 
> RelBuilder into one with specific convention.
> 2. RelBuilder provides withRelFactories() method to allow caller swap the 
> underlying

[jira] [Commented] (CALCITE-4023) Remove or deprecate ProjectSortTransposeRule

2020-05-26 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116914#comment-17116914
 ] 

Haisheng Yuan commented on CALCITE-4023:


The rule is useless. 

> Remove or deprecate ProjectSortTransposeRule
> 
>
> Key: CALCITE-4023
> URL: https://issues.apache.org/jira/browse/CALCITE-4023
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> It never worked. The check condition {{if (sort.getClass() != Sort.class)}} 
> is always true.





[jira] [Commented] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-26 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116960#comment-17116960
 ] 

Haisheng Yuan commented on CALCITE-3972:


Agree with [~rkondakov]. My take is more radical: {{LogicalSort}} should not 
exist at all. Enforcers should only exist in the physical world, not the 
logical world. 

> Allow RelBuilder to create RelNode with convention and use it for trait 
> convert
> ---
>
> Key: CALCITE-3972
> URL: https://issues.apache.org/jira/browse/CALCITE-3972
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Assignee: Xiening Dai
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> 1. Provide Convention.transformRelBuilder() to transform an existing 
> RelBuilder into one with specific convention.
> 2. RelBuilder provides withRelFactories() method to allow caller swap the 
> underlying RelFactories and create a new builder. 
> 3. Use the new interface in RelCollationTraitDef for converting into 
> RelCollation traits
> We can avoid ~1/3 of total rule firings in a N way join case with this change.





[jira] [Commented] (CALCITE-3989) Release Calcite 1.23.0

2020-05-26 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116963#comment-17116963
 ] 

Haisheng Yuan commented on CALCITE-3989:


[~mamo], I have created a ticket for this:
https://issues.apache.org/jira/browse/INFRA-20326

> Release Calcite 1.23.0
> --
>
> Key: CALCITE-3989
> URL: https://issues.apache.org/jira/browse/CALCITE-3989
> Project: Calcite
>  Issue Type: Task
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>






[jira] [Commented] (CALCITE-3989) Release Calcite 1.23.0

2020-05-26 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116964#comment-17116964
 ] 

Haisheng Yuan commented on CALCITE-3989:


See mailing list discussion:
https://lists.apache.org/thread.html/r1b0cf72ff0235feb2b8c734d7132bff13a8bea83184006f01904da9a%40%3Cdev.calcite.apache.org%3E

> Release Calcite 1.23.0
> --
>
> Key: CALCITE-3989
> URL: https://issues.apache.org/jira/browse/CALCITE-3989
> Project: Calcite
>  Issue Type: Task
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>






[jira] [Resolved] (CALCITE-4009) Revert traitset mapping that was added to ProjectJoinTransposeRule

2020-05-26 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4009.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/2b1254bdec18f9e869f3287fb8ab471903e97829.

> Revert traitset mapping that was added to ProjectJoinTransposeRule
> --
>
> Key: CALCITE-4009
> URL: https://issues.apache.org/jira/browse/CALCITE-4009
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Revert traitset mapping that was added to ProjectJoinTransposeRule by 
> CALCITE-3353. It is now obsolete, and we should fail fast if that happens. 
> Otherwise, all the downstream projects that use this rule will waste time 
> dealing with traitsets they don't need.





[jira] [Resolved] (CALCITE-4004) Override Object.toString() in RelOptRuleOperand

2020-05-27 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4004.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/b166b9a0c4a4420adf2d50933e8b32726566a7ff.

> Override Object.toString() in RelOptRuleOperand
> ---
>
> Key: CALCITE-4004
> URL: https://issues.apache.org/jira/browse/CALCITE-4004
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Override Object.toString() in RelOptRuleOperand to facilitate debugging, 
> otherwise, it is so tedious...





[jira] [Commented] (CALCITE-3997) Problem with MERGE JOIN: java.lang.AssertionError: cannot merge join: left input is not sorted on left keys

2020-05-27 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118013#comment-17118013
 ] 

Haisheng Yuan commented on CALCITE-3997:


Sorry for the mistake; now I know the real file name to search for the result. 
I will take more care next time.

> Problem with MERGE JOIN: java.lang.AssertionError: cannot merge join: left 
> input is not sorted on left keys
> ---
>
> Key: CALCITE-3997
> URL: https://issues.apache.org/jira/browse/CALCITE-3997
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.23.0
>Reporter: Enrico Olivelli
>Priority: Blocker
> Fix For: 1.23.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I have a couple of problems with HerdDB.
> 1) JOIN order unsorted columns in presence of a WHERE over other columns
> This is my case:
> CREATE TABLE tblspace1.table1 (k1 string primary key,n1 int,s1 string)
> CREATE TABLE tblspace1.table3 (k1 string primary key,n3 int,s3 string)
> SELECT t1.k1 as first, t2.k1 as second
> FROM tblspace1.table1 t1 
>  INNER JOIN tblspace1.table3 t2 ON t1.k1=t2.k1
>  WHERE t1.n1 + 1 = t2.n3
> In this case, no column of table1 or table3 is physically sorted (no column 
> has a collation).
> I get this planner error:
> java.lang.AssertionError: cannot merge join: left input is not sorted on left keys
>   at org.apache.calcite.rel.metadata.RelMdCollation.mergeJoin(RelMdCollation.java:457)
>   at org.apache.calcite.rel.metadata.RelMdCollation.collations(RelMdCollation.java:153)
>   at GeneratedMetadataHandler_Collation.collations_$(Unknown Source)
>   at GeneratedMetadataHandler_Collation.collations(Unknown Source)
>   at org.apache.calcite.rel.metadata.RelMetadataQuery.collations(RelMetadataQuery.java:539)
>   at org.apache.calcite.rel.metadata.RelMdCollation.project(RelMdCollation.java:273)
>   at org.apache.calcite.rel.logical.LogicalProject.lambda$create$0(LogicalProject.java:122)
>   at org.apache.calcite.plan.RelTraitSet.replaceIfs(RelTraitSet.java:242)
>   at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:121)
>   at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:111)
>   at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:172)
>   at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:1464)
>   at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1258)
>   at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1230)
>   at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1219)
>   at org.apache.calcite.plan.RelOptUtil.pushDownJoinConditions(RelOptUtil.java:3620)
>   at org.apache.calcite.rel.rules.JoinPushExpressionsRule.onMatch(JoinPushExpressionsRule.java:59)
>   at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:221)
>   at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:519)
>   at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:535)
>   at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:292)
> If I remove the "WHERE" clause, no error is reported.
> We have many other test cases about JOINs, and this is the only one that 
> fails.
> This is the failing test case in HerdDB:
> https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/core/SimpleJoinTest.java#L522
> We are using the default set of rules, Programs.ofRules(Programs.RULE_SET).





[jira] [Created] (CALCITE-4027) Add -Doverwrite option to SqlToRelTestBase

2020-05-27 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4027:
--

 Summary: Add -Doverwrite option to SqlToRelTestBase
 Key: CALCITE-4027
 URL: https://issues.apache.org/jira/browse/CALCITE-4027
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


By setting overwrite=true, the expected XML file is automatically overwritten 
by the actual output file.
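
A minimal sketch of the idea, assuming the flag is read as a JVM system property named "overwrite" (as in -Doverwrite=true). OverwriteSketch and its methods are hypothetical names used for illustration only; they are not part of SqlToRelTestBase.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not Calcite's actual test infrastructure.
final class OverwriteSketch {
  // True only when the JVM was started with -Doverwrite=true.
  static boolean overwriteRequested() {
    return Boolean.getBoolean("overwrite");
  }

  // Compares the actual output with the expected file. On a mismatch,
  // either rewrites the expected file (overwrite mode) or reports failure.
  static boolean check(Path expectedFile, String actual, boolean overwrite)
      throws IOException {
    String expected =
        new String(Files.readAllBytes(expectedFile), StandardCharsets.UTF_8);
    if (expected.equals(actual)) {
      return true;
    }
    if (overwrite) {
      Files.write(expectedFile, actual.getBytes(StandardCharsets.UTF_8));
      return true;
    }
    return false;
  }
}
```

A test would call check(file, actualPlan, OverwriteSketch.overwriteRequested()), so the default behavior stays a strict comparison and the rewrite only happens on explicit request.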





[jira] [Resolved] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-27 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3972.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/ada6cc4309dd79e09abcbd67570b786d80340018.
 Thanks [~xndai] for the PR.
Thanks all for the discussion and code review!

> Allow RelBuilder to create RelNode with convention and use it for trait 
> convert
> ---
>
> Key: CALCITE-3972
> URL: https://issues.apache.org/jira/browse/CALCITE-3972
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Assignee: Xiening Dai
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 1. Provide Convention.transformRelBuilder() to transform an existing 
> RelBuilder into one with a specific convention.
> 2. RelBuilder provides a withRelFactories() method to allow the caller to swap 
> the underlying RelFactories and create a new builder. 
> 3. Use the new interface in RelCollationTraitDef for converting into 
> RelCollation traits.
> We can avoid ~1/3 of total rule firings in an N-way join case with this change.





[jira] [Commented] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-05-28 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118780#comment-17118780
 ] 

Haisheng Yuan commented on CALCITE-3939:


Thanks for reporting the issue, [~anha]. 
[~botong] Can you take a look?
I am wondering why CustomScan doesn't trigger RuleX after set merge.
Does RuleX look into the siblings of Project in the RelSet through RelSubset?

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled





[jira] [Commented] (CALCITE-3939) Change UnionEliminatorRule and ProjectRemoveRule to auto pruning SubstitutionRule

2020-05-28 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119117#comment-17119117
 ] 

Haisheng Yuan commented on CALCITE-3939:


[~anha] Thanks for the detailed description. Can you create a new issue so 
that we can continue the discussion there? Thanks.

> Change UnionEliminatorRule and ProjectRemoveRule to auto pruning 
> SubstitutionRule
> -
>
> Key: CALCITE-3939
> URL: https://issues.apache.org/jira/browse/CALCITE-3939
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> UnionEliminatorRule and ProjectRemoveRule are both pruning rules for a 
> RelNode. They can also become SubstitutionRule with autoprune enabled





[jira] [Resolved] (CALCITE-4011) Implement trait propagation for EnumerableProject and EnumerableFilter

2020-05-29 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4011.

Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/0af3fd17a293d37125c7cca58257e5f6cbc1a76c,
 thanks for the PR, [~amaliujia]!

> Implement trait propagation for EnumerableProject and EnumerableFilter
> --
>
> Key: CALCITE-4011
> URL: https://issues.apache.org/jira/browse/CALCITE-4011
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> So that the parent trait can be passed down, and new traitsets can be derived 
> from the child. Most importantly, this serves as a demonstration, so that 
> SortProjectTransposeRule and ProjectSortTransposeRule can be disabled when 
> topDownOpt is enabled.





[jira] [Commented] (CALCITE-4017) Implement trait propagation for Enumerable Setop

2020-05-29 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119877#comment-17119877
 ] 

Haisheng Yuan commented on CALCITE-4017:


Needs some consensus on CALCITE-3221 before working on this issue.

> Implement trait propagation for Enumerable Setop 
> -
>
> Key: CALCITE-4017
> URL: https://issues.apache.org/jira/browse/CALCITE-4017
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Mainly for the Union operator; not sure about Minus and Intersect. I haven't 
> checked how the executors of Enumerable Minus and Intersect are implemented.





[jira] [Commented] (CALCITE-4012) Implement trait propagation for EnumerableHashJoin and EnumerableNestedLoopJoin

2020-05-29 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119884#comment-17119884
 ] 

Haisheng Yuan commented on CALCITE-4012:


Cool.

> Implement trait propagation for EnumerableHashJoin and 
> EnumerableNestedLoopJoin
> ---
>
> Key: CALCITE-4012
> URL: https://issues.apache.org/jira/browse/CALCITE-4012
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.24.0
>
>
> The traitset can be derived from the outer relation of 
> hashjoin/nestedloopjoin.
> They can also pass down parent trait request to their outer child.





[jira] [Created] (CALCITE-4032) Mark CalcMergeRule as TransformationRule

2020-05-29 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-4032:
--

 Summary: Mark CalcMergeRule as TransformationRule
 Key: CALCITE-4032
 URL: https://issues.apache.org/jira/browse/CALCITE-4032
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


Previously it was removed in CALCITE-3997.





[jira] [Resolved] (CALCITE-3993) Add isDefault(), keys(), keyBits() to RelTrait interface

2020-05-29 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3993.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/af976e9d4caa9588db8e413a58468760f3c1c0d6].

> Add isDefault(), keys(), keyBits() to RelTrait interface
> 
>
> Key: CALCITE-3993
> URL: https://issues.apache.org/jira/browse/CALCITE-3993
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Checking whether the distribution is default (ANY) or the collation is 
> default (EMPTY) is extremely common.
> Also add the following to RelCollation:
> {code:java}
> ImmutableIntList getKeys();
> {code}
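
The shape of the proposed convenience methods can be sketched as follows. TraitSketch and CollationSketch are hypothetical stand-ins that model a collation as a plain list of field indexes; they are not Calcite's RelTrait or RelCollation, and plain List<Integer> is used in place of ImmutableIntList.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a trait interface with an isDefault() check.
interface TraitSketch {
  boolean isDefault();
}

// Hypothetical stand-in for a collation: an ordered list of field indexes.
final class CollationSketch implements TraitSketch {
  static final CollationSketch EMPTY =
      new CollationSketch(Collections.<Integer>emptyList());

  private final List<Integer> keys;

  CollationSketch(List<Integer> keys) {
    this.keys = keys;
  }

  static CollationSketch of(Integer... keys) {
    return new CollationSketch(Arrays.asList(keys));
  }

  // Field indexes the collation sorts on, in order.
  List<Integer> getKeys() {
    return keys;
  }

  // The default collation is the empty one (no required ordering).
  @Override public boolean isDefault() {
    return keys.isEmpty();
  }
}
```

Callers can then write trait.isDefault() instead of comparing against a constant at every call site.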





[jira] [Commented] (CALCITE-3963) Maintain logical properties at RelSet (equivalent group) instead of RelNode

2020-05-30 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120385#comment-17120385
 ] 

Haisheng Yuan commented on CALCITE-3963:


Four values for the confidence level are indeed too few; we can extend it to 
an integer, but I believe many people may just leave it at the default value. 
If we frequently change the rule execution order and the alternatives have the 
same `stats confidence level`, the result may be non-deterministic, yet I 
think the same issue exists in current Calcite too.

Regarding the folding operations, I am not sure how we would fold the 
cardinality, histogram, most common values, most common frequencies, and 
constraint properties. After constant reduction, the selectivity for OR 
expressions might become smaller, yet the selectivity for AND expressions may 
become larger. Should we choose the min or max operation for selectivity and 
cardinality?

When using folding operations, do we still need to compute the logical 
properties for all the MergeJoins, HashJoins, and NestedLoopJoins with 
different distribution policies that are all generated from a single 
LogicalJoin?

 

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---
>
> Key: CALCITE-3963
> URL: https://issues.apache.org/jira/browse/CALCITE-3963
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Assignee: Xiening Dai
>Priority: Major
>
> Currently the logical properties (such as row count, distinct row count, etc.) 
> are maintained at the RelNode level. This creates a number of metadata 
> consistency problems, e.g. CALCITE-1048, CALCITE-2166. 
> In theory, all RelNodes in a RelSet should share the same logical properties 
> per the definition of relational equivalence. So it makes more sense to keep 
> logical properties at the RelSet level rather than the RelNode level. Such 
> properties shouldn't change when a new subset is created or a subset's best 
> is changed.
> Specifically, I think the following built-in metadata should fall into the 
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)





[jira] [Resolved] (CALCITE-4023) Remove or deprecate ProjectSortTransposeRule

2020-05-31 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4023.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/2fb963c139abc7f655e237c78157f2e4983c4709].

> Remove or deprecate ProjectSortTransposeRule
> 
>
> Key: CALCITE-4023
> URL: https://issues.apache.org/jira/browse/CALCITE-4023
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It never worked. The check condition {{if (sort.getClass() != Sort.class)}} 
> is always true.
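
One way such a check ends up always true is when the compared class is abstract, so every instance necessarily belongs to a subclass. The sketch below shows that effect with hypothetical classes (SortSketch, LogicalSortSketch); it assumes the base class is abstract and does not reproduce the actual rule code.

```java
// Hypothetical abstract base class standing in for an abstract Sort.
abstract class SortSketch {}

// Hypothetical concrete subclass standing in for a logical sort node.
final class LogicalSortSketch extends SortSketch {}

final class ClassCheckDemo {
  // Exact-class comparison: no object can have an abstract class as its
  // runtime class, so this returns true for every possible instance.
  static boolean exactClassMismatch(SortSketch sort) {
    return sort.getClass() != SortSketch.class;
  }

  // A subtype check is what distinguishes sorts from other nodes.
  static boolean isSort(Object rel) {
    return rel instanceof SortSketch;
  }
}
```

A guard written as an exact-class comparison against an abstract type therefore bails out on every input, which matches the observation that the rule never did anything.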





[jira] [Commented] (CALCITE-4032) Mark CalcMergeRule as TransformationRule

2020-05-31 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120597#comment-17120597
 ] 

Haisheng Yuan commented on CALCITE-4032:


[~rubenql] Do you mind making your own version of CalcMergeRule?

> Mark CalcMergeRule as TransformationRule
> 
>
> Key: CALCITE-4032
> URL: https://issues.apache.org/jira/browse/CALCITE-4032
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Previously it was removed in CALCITE-3997.





[jira] [Commented] (CALCITE-4036) Allow applying SemiJoinRule to join without aggregate below

2020-06-01 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121036#comment-17121036
 ] 

Haisheng Yuan commented on CALCITE-4036:


Hmm... AggregateRemoveRule was marked as a SubstitutionRule; will removing 
the marker solve the issue?

> Allow applying SemiJoinRule to join without aggregate below
> ---
>
> Key: CALCITE-4036
> URL: https://issues.apache.org/jira/browse/CALCITE-4036
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>
> The current implementation of {{SemiJoinRule}} can be applied to rel nodes 
> where the right input of the join is an aggregate, but it can theoretically 
> be applied when there is no aggregate, provided the right join input 
> returns a column that has only unique values. Column uniqueness may be checked 
> using {{BuiltInMetadata.ColumnUniqueness}} statistics.
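
The reasoning behind the proposal can be checked on plain lists: when the right join key is unique, an inner join that keeps only left columns returns the same rows as a semi join. The sketch below is a self-contained illustration using hypothetical helper names; it does not use Calcite's metadata API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helpers operating on single-column "tables" of join keys.
final class SemiJoinSketch {
  // True when no key value occurs more than once.
  static boolean isUnique(List<Integer> keys) {
    return new HashSet<>(keys).size() == keys.size();
  }

  // Inner join on equality, projecting only the left column.
  static List<Integer> innerJoinLeftOnly(List<Integer> left, List<Integer> right) {
    List<Integer> out = new ArrayList<>();
    for (Integer l : left) {
      for (Integer r : right) {
        if (l.equals(r)) {
          out.add(l); // one output row per matching pair
        }
      }
    }
    return out;
  }

  // Semi join: each left row appears at most once, if any match exists.
  static List<Integer> semiJoin(List<Integer> left, List<Integer> right) {
    Set<Integer> rightKeys = new HashSet<>(right);
    List<Integer> out = new ArrayList<>();
    for (Integer l : left) {
      if (rightKeys.contains(l)) {
        out.add(l);
      }
    }
    return out;
  }
}
```

With a unique right side, each left row matches at most one right row, so the two results coincide; with duplicates on the right, the inner join multiplies rows and the rewrite would be wrong.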





[jira] [Commented] (CALCITE-4036) Allow applying SemiJoinRule to join without aggregate below

2020-06-01 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121236#comment-17121236
 ] 

Haisheng Yuan commented on CALCITE-4036:


Even if AggregateRemoveRule is applied before SemiJoinRule, SemiJoinRule 
will generate a new aggregate, which should trigger AggregateRemoveRule again. 
There is something missing here.

> Allow applying SemiJoinRule to join without aggregate below
> ---
>
> Key: CALCITE-4036
> URL: https://issues.apache.org/jira/browse/CALCITE-4036
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>
> The current implementation of {{SemiJoinRule}} can be applied to rel nodes 
> where the right input of the join is an aggregate, but it can theoretically 
> be applied when there is no aggregate, provided the right join input 
> returns a column that has only unique values. Column uniqueness may be checked 
> using {{BuiltInMetadata.ColumnUniqueness}} statistics.





[jira] [Resolved] (CALCITE-4029) ProjectRemoveRule auto pruning may prevent rules from running if mixed conventions are used in a logical plan

2020-06-02 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4029.

Resolution: Not A Problem

Thanks all for the discussion. Since there is a consensus, I am closing this 
JIRA. Feel free to reopen it if you change your mind.

> ProjectRemoveRule auto pruning may prevent rules from running if mixed 
> conventions are used in a logical plan 
> --
>
> Key: CALCITE-4029
> URL: https://issues.apache.org/jira/browse/CALCITE-4029
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.23.0
>Reporter: Anton Haidai
>Priority: Minor
>
> Preconditions to reproduce the issue:
>  # Logical plan has mixed conventions (for example, a bottom node is a 
> TableScan in a final convention while other nodes are regular logical nodes 
> with NONE convention).
>  # There is a rule that expects a logical node with an input (like a rule 
> matching "operand(LogicalSort.class, operand(RelNode.class, any()))")
>  # A project over the scan is trivial (like SELECT * FROM ...)
> The issue is related to https://issues.apache.org/jira/browse/CALCITE-3939, 
> please see comments for a detailed debugging of a real-life reproducing case.
> h4. Example:
> Logical plan with a leaf node in a custom convention:
> {code:java}
> LogicalSort[NONE]
>  LogicalProject[NONE]
>   CustomScan[CUSTOM_CONVENTION]{code}
> A rule configured (RuleX) matches "operand(LogicalSort.class, 
> operand(RelNode.class, any()))".
> *Without ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial and merges it into a 
> single RelSet with CustomScan. 
> RuleX can run on top of this change because LogicalProject has a logical 
> node (LogicalProject in RelSubset[NONE]) as an input.
>  
> *With ProjectRemoveRule auto pruning*
> ProjectRemoveRule recognizes LogicalProject as trivial but removes it with 
> its RelSet, so the CustomScan is the only node in its RelSet, 
> RelSubset[CUSTOM_CONVENTION].
> RuleX can't run on top of this change because LogicalProject has an empty 
> input RelSubset[NONE] of the RelSet with the CustomScan.
> h2. Possible workarounds
>  # Disable ProjectRemoveRule auto pruning.
>  # Use only logical nodes in a logical plan; for the example above, use 
> LogicalScan -> CustomScanRule -> CustomScan instead of using CustomScan 
> directly.





[jira] [Resolved] (CALCITE-4032) Mark CalcMergeRule as TransformationRule

2020-06-02 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4032.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/29f798fb6919f24d95776658b7e659af179a8b15].

 

Thanks a lot for your understanding, [~rubenql]!

> Mark CalcMergeRule as TransformationRule
> 
>
> Key: CALCITE-4032
> URL: https://issues.apache.org/jira/browse/CALCITE-4032
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Previously it was removed in CALCITE-3997.





[jira] [Commented] (CALCITE-4032) Mark CalcMergeRule as TransformationRule

2020-06-03 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125013#comment-17125013
 ] 

Haisheng Yuan commented on CALCITE-4032:


Cool, thanks for the update, [~rubenql].

> Mark CalcMergeRule as TransformationRule
> 
>
> Key: CALCITE-4032
> URL: https://issues.apache.org/jira/browse/CALCITE-4032
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Previously it was removed in CALCITE-3997.





[jira] [Commented] (CALCITE-4036) Allow applying SemiJoinRule to join without aggregate below

2020-06-03 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125227#comment-17125227
 ] 

Haisheng Yuan commented on CALCITE-4036:


More importantly, AggregateRemoveRule won't be triggered if there is a new 
RelNode in the child RelSet, which makes it order-sensitive. I guess adding a 
child rule operand (RelNode.class) may also solve the issue.

> Allow applying SemiJoinRule to join without aggregate below
> ---
>
> Key: CALCITE-4036
> URL: https://issues.apache.org/jira/browse/CALCITE-4036
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>
> The current implementation of {{SemiJoinRule}} can be applied to rel nodes 
> where the right input of the join is an aggregate, but it can theoretically 
> be applied when there is no aggregate, provided the right join input 
> returns a column that has only unique values. Column uniqueness may be checked 
> using {{BuiltInMetadata.ColumnUniqueness}} statistics.





[jira] [Resolved] (CALCITE-3981) Volcano.register should not return stale/merged subset

2020-06-03 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3981.

Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/df5f4470e4257e8e7057664d4af3af3f37b6559b],
 thanks for the PR, [~botong]!

> Volcano.register should not return stale/merged subset
> --
>
> Key: CALCITE-3981
> URL: https://issues.apache.org/jira/browse/CALCITE-3981
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When a subset is registered, registerImpl() and registerSubset() currently 
> simply return the subset itself. The problem is that a subset can become stale 
> when RelSets get merged (for example in ensureRegistered() and 
> registerSubset() "merge(set, subset.set)"). As a result, a stale/merged 
> subset might be returned from registerImpl(), and the newly registering subtree 
> might get registered recursively on top of the stale subset (see 
> AbstractRelNode.onRegister()). This is a leak because once a RelSet/subset is 
> merged into others and becomes stale, it should not be used to connect new 
> RelNodes. 
> With CALCITE-3755, subsets can now be directly matched by rules. This opens 
> another source of stale subset leaks: (1) An active subset gets matched, and 
> the RuleMatch gets queued in RuleQueue. (2) The subset becomes stale due to a 
> RelSet merge. (3) The rule match in (1) is popped from the queue and fired. (4) 
> In onMatch, the rule gets the stale subset, builds new rels on top of it and 
> registers the new rels. In this case, the entire new rel subtree will be 
> registered on top of the stale subset as is.
> Rather than returning the registering subset itself, register should always 
> use canonize() to find and return the equivalent active subset instead.
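
The proposed behavior can be sketched with a forwarding pointer that is set when a subset is merged away. SubsetSketch and its mergedInto field are hypothetical; Calcite's real RelSubset and RelSet track merges differently, so this only illustrates the "always return the active equivalent" idea.

```java
// Hypothetical model of a subset that may be merged into another one.
final class SubsetSketch {
  final String name;
  SubsetSketch mergedInto; // null while this subset is still active

  SubsetSketch(String name) {
    this.name = name;
  }

  // Follows merge links until an active (non-merged) subset is reached.
  static SubsetSketch canonize(SubsetSketch subset) {
    SubsetSketch current = subset;
    while (current.mergedInto != null) {
      current = current.mergedInto;
    }
    return current;
  }

  // Registration never hands back a stale subset: it returns the
  // canonical one, so new nodes are connected to live state only.
  static SubsetSketch register(SubsetSketch subset) {
    return canonize(subset);
  }
}
```

A rule match that was queued before a merge would then still end up building on the surviving subset, because every lookup goes through canonize().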





[jira] [Resolved] (CALCITE-3991) The required boolean should always be provided in RelSet.getOrCreateSubset()

2020-06-03 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3991.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/2f68352c6dc9d05c6cb71516dac1105c27722154],
 thanks for the PR, [~botong]!

> The required boolean should always be provided in RelSet.getOrCreateSubset()
> 
>
> Key: CALCITE-3991
> URL: https://issues.apache.org/jira/browse/CALCITE-3991
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.24.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The required boolean should always be provided in RelSet.getOrCreateSubset(). 
> Delete the old default and clean up other related code.





[jira] [Commented] (CALCITE-4030) Assert error during top-down optimization with Project/Filter Traits passdown and derivation

2020-06-03 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125418#comment-17125418
 ] 

Haisheng Yuan commented on CALCITE-4030:


The assertion error has been fixed in CALCITE-3981.

> Assert error during top-down optimization with Project/Filter Traits passdown 
> and derivation 
> -
>
> Key: CALCITE-4030
> URL: https://issues.apache.org/jira/browse/CALCITE-4030
> Project: Calcite
>  Issue Type: Task
>Reporter: Rui Wang
>Priority: Major
>
> For example, with https://github.com/apache/calcite/pull/1985, and with 
> top-down optimization enabled by setting "calcite.planner.topdown.opt=true" 
> in saffron.properties, running the test case 
> SortRemoveRuleTest.removeSortOverEnumerableHashJoin produces:
> {code:java}
> java.lang.AssertionError
>   at org.apache.calcite.plan.volcano.OptimizeTask$RelNodeOptTask.execute(OptimizeTask.java:232)
>   at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:553)
>   at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:327)
>   at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:362)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.transform(SortRemoveRuleTest.java:77)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.removeSortOverEnumerableHashJoin(SortRemoveRuleTest.java:102)
> {code}
> The short-term workaround is to comment out the assertion at OptimizeTask.java:232.





[jira] [Commented] (CALCITE-4030) Assert error during top-down optimization with Project/Filter Traits passdown and derivation

2020-06-03 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125423#comment-17125423
 ] 

Haisheng Yuan commented on CALCITE-4030:


MaterializedViewRelOptRulesTest.testJoinMaterialization10() and
MaterializedViewRelOptRulesTest.testJoinMaterialization12() have other errors:
an IndexOutOfBoundsException in
org.apache.calcite.plan.volcano.OptimizeTask$RelSubsetOptTask.propagateTraits(OptimizeTask.java:299)

> Assert error during top-down optimization with Project/Filter Traits passdown 
> and derivation 
> -
>
> Key: CALCITE-4030
> URL: https://issues.apache.org/jira/browse/CALCITE-4030
> Project: Calcite
>  Issue Type: Task
>Reporter: Rui Wang
>Priority: Major
>
> For example, with https://github.com/apache/calcite/pull/1985, and with 
> top-down optimization enabled by setting "calcite.planner.topdown.opt=true" 
> in saffron.properties, running the test case 
> SortRemoveRuleTest.removeSortOverEnumerableHashJoin produces:
> {code:java}
> java.lang.AssertionError
>   at org.apache.calcite.plan.volcano.OptimizeTask$RelNodeOptTask.execute(OptimizeTask.java:232)
>   at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:553)
>   at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:327)
>   at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:362)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.transform(SortRemoveRuleTest.java:77)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.removeSortOverEnumerableHashJoin(SortRemoveRuleTest.java:102)
> {code}
> The short-term workaround is to comment out the assertion at OptimizeTask.java:232.





[jira] [Commented] (CALCITE-4042) JoinCommuteRule must not match SEMI / ANTI join

2020-06-04 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125868#comment-17125868
 ] 

Haisheng Yuan commented on CALCITE-4042:


Is it the same issue as CALCITE-3911?

> JoinCommuteRule must not match SEMI / ANTI join
> ---
>
> Key: CALCITE-4042
> URL: https://issues.apache.org/jira/browse/CALCITE-4042
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.23.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.24.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> JoinCommuteRule must not match SEMI / ANTI join, because these types cannot 
> be swapped.
> This is a minor issue, because the "default" JoinCommuteRule.INSTANCE matches 
> only INNER joins. However, there is another version of the rule 
> (JoinCommuteRule.SWAP_OUTER, currently only used in a few unit tests), to 
> match also outer joins, which could potentially match a SEMI / ANTI join.
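
The guard described above can be sketched as a small predicate over join types. JoinTypeSketch and CommuteSketch are hypothetical stand-ins, not Calcite's JoinRelType or JoinCommuteRule, and the swapOuter parameter merely mirrors the flag mentioned in the description.

```java
// Hypothetical join-type enum; stands in for the real join type enum.
enum JoinTypeSketch { INNER, LEFT, RIGHT, FULL, SEMI, ANTI }

final class CommuteSketch {
  // Decides whether the inputs of a join of the given type may be swapped.
  static boolean canSwap(JoinTypeSketch type, boolean swapOuter) {
    switch (type) {
      case INNER:
        return true;
      case LEFT:
      case RIGHT:
      case FULL:
        return swapOuter; // outer joins only when explicitly allowed
      default:
        return false; // SEMI and ANTI are never swappable
    }
  }

  // Join type after swapping inputs: LEFT and RIGHT trade places,
  // INNER and FULL are symmetric and stay unchanged.
  static JoinTypeSketch swapped(JoinTypeSketch type) {
    switch (type) {
      case LEFT:
        return JoinTypeSketch.RIGHT;
      case RIGHT:
        return JoinTypeSketch.LEFT;
      case INNER:
      case FULL:
        return type;
      default:
        throw new IllegalArgumentException("not swappable: " + type);
    }
  }
}
```

Rejecting SEMI and ANTI up front keeps the outer-join variant of the rule from ever producing a plan for join types that have no mirrored form.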





[jira] [Resolved] (CALCITE-3911) JoinCommuteRule may generate wrong plan for SEMI/ANTI join

2020-06-04 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3911.

Resolution: Duplicate

> JoinCommuteRule may generate wrong plan for SEMI/ANTI join
> --
>
> Key: CALCITE-3911
> URL: https://issues.apache.org/jira/browse/CALCITE-3911
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> JoinCommuteRule generates wrong plan for SEMI/ANTI join when swapOuter is 
> true. Semi / Anti joins are not swappable.





[jira] [Commented] (CALCITE-4042) JoinCommuteRule must not match SEMI / ANTI join

2020-06-04 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125879#comment-17125879
 ] 

Haisheng Yuan commented on CALCITE-4042:


Never mind, I will close CALCITE-3911.



[jira] [Commented] (CALCITE-3911) JoinCommuteRule may generate wrong plan for SEMI/ANTI join

2020-06-04 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125881#comment-17125881
 ] 

Haisheng Yuan commented on CALCITE-3911:


See CALCITE-4042.



[jira] [Commented] (CALCITE-4042) JoinCommuteRule must not match SEMI / ANTI join

2020-06-04 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125889#comment-17125889
 ] 

Haisheng Yuan commented on CALCITE-4042:


No problem, will do.



[jira] [Commented] (CALCITE-4041) Implement trait propagation for EnumerableCorrelate

2020-06-04 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125895#comment-17125895
 ] 

Haisheng Yuan commented on CALCITE-4041:


Thanks, looks great.

> Implement trait propagation for EnumerableCorrelate
> ---
>
> Key: CALCITE-4041
> URL: https://issues.apache.org/jira/browse/CALCITE-4041
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Priority: Major
>
> Implement trait propagation for EnumerableCorrelate.
> Note that Correlate supports only the join types LEFT / INNER / SEMI / ANTI.
> The current implementation of EnumerableCorrelate preserves the order of its 
> left-hand side.
> See CALCITE-4012 for reference.
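
To illustrate why the left-hand order can be propagated, here is a minimal stand-alone sketch of a correlate evaluated as a nested loop. It is not Calcite's EnumerableCorrelate code: the class CorrelateOrderSketch and its methods are hypothetical, rows are reduced to integer pairs, and only INNER semantics are modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative sketch only; not Calcite's EnumerableCorrelate implementation. */
public class CorrelateOrderSketch {

  /**
   * INNER-style correlate: walks the left rows in their given order and, for
   * each one, evaluates the right side with that row as the correlation value.
   */
  public static List<int[]> correlate(
      List<Integer> left, Function<Integer, List<Integer>> right) {
    List<int[]> out = new ArrayList<>();
    for (int l : left) {
      for (int r : right.apply(l)) {
        out.add(new int[] {l, r});
      }
    }
    return out;
  }

  /** Checks that the output is still ordered on the left column. */
  public static boolean isSortedOnLeft(List<int[]> rows) {
    for (int i = 1; i < rows.size(); i++) {
      if (rows.get(i - 1)[0] > rows.get(i)[0]) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Integer> left = List.of(1, 2, 3);
    List<int[]> rows = correlate(left, l -> List.of(l * 10, l * 10 + 1));
    System.out.println("rows=" + rows.size() + " sortedOnLeft=" + isSortedOnLeft(rows));
  }
}
```

Because the outer loop never reorders the left rows, a collation required on left-hand columns can be requested from the left input instead of adding a sort on top of the correlate.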





[jira] [Resolved] (CALCITE-4030) Assert error during top-down optimization with Project/Filter Traits passdown and derivation

2020-06-04 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-4030.

Fix Version/s: 1.24.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/feae6fbc328e3a7c87693951d1623f8b47ccea59].

> Assert error during top-down optimization with Project/Filter Traits passdown 
> and derivation 
> -
>
> Key: CALCITE-4030
> URL: https://issues.apache.org/jira/browse/CALCITE-4030
> Project: Calcite
> Issue Type: Task
> Reporter: Rui Wang
> Priority: Major
> Fix For: 1.24.0
>
>
> For example, with https://github.com/apache/calcite/pull/1985 applied and 
> top-down optimization enabled by setting "calcite.planner.topdown.opt=true" 
> in saffron.properties, running the test case 
> SortRemoveRuleTest.removeSortOverEnumerableHashJoin fails with:
> {code:java}
> java.lang.AssertionError
>   at org.apache.calcite.plan.volcano.OptimizeTask$RelNodeOptTask.execute(OptimizeTask.java:232)
>   at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:553)
>   at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:327)
>   at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:362)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.transform(SortRemoveRuleTest.java:77)
>   at org.apache.calcite.rel.rules.SortRemoveRuleTest.removeSortOverEnumerableHashJoin(SortRemoveRuleTest.java:102)
> {code}
> A short-term workaround is to comment out the assertion at OptimizeTask.java:232.




