[jira] [Resolved] (CALCITE-4479) "vFloat in (1.0, 2.0)" throws UnsupportedOperationException

2021-01-27 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved CALCITE-4479.
-
Resolution: Fixed

Fixed in 
[039fe49|https://github.com/apache/calcite/commit/039fe493e195416ee40c93b720f304ac9fc3c8c8]
 !

> "vFloat in (1.0, 2.0)" throws UnsupportedOperationException
> ---
>
> Key: CALCITE-4479
> URL: https://issues.apache.org/jira/browse/CALCITE-4479
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.26.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Check this test in {{RexBuilderTest}}:
> {code:java}
> @Test void testMakeIn() {
>   final RelDataTypeFactory typeFactory =
>       new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
>   final RexBuilder rexBuilder = new RexBuilder(typeFactory);
>   final RelDataType floatType =
>       typeFactory.createSqlType(SqlTypeName.FLOAT);
>   RexNode left = rexBuilder.makeInputRef(floatType, 0);
>   final RexNode literal1 = rexBuilder.makeLiteral(1.0f, floatType);
>   final RexNode literal2 = rexBuilder.makeLiteral(2.0f, floatType);
>   RexNode inCall =
>       rexBuilder.makeIn(left, ImmutableList.of(literal1, literal2));
>   assertThat(inCall.getKind(), is(SqlKind.SEARCH));
> }
> {code}
> The stacktrace is:
> {noformat}
> class org.apache.calcite.sql.type.SqlTypeName: FLOAT
> java.lang.UnsupportedOperationException: class org.apache.calcite.sql.type.SqlTypeName: FLOAT
>   at org.apache.calcite.util.Util.needToImplement(Util.java:1085)
>   at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:726)
>   at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
>   at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
>   at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
>   at org.apache.calcite.rex.RexLiteral.toLiteral(RexLiteral.java:762)
>   at org.apache.calcite.rex.RexLiteral.lambda$printSarg$4(RexLiteral.java:733)
>   at org.apache.calcite.util.RangeSets$Printer.singleton(RangeSets.java:409)
>   at org.apache.calcite.util.RangeSets.forEach(RangeSets.java:249)
>   at org.apache.calcite.util.Sarg.lambda$printTo$0(Sarg.java:119)
>   at org.apache.calcite.linq4j.Ord.forEach(Ord.java:157)
>   at org.apache.calcite.util.Sarg.printTo(Sarg.java:115)
>   at org.apache.calcite.rex.RexLiteral.printSarg(RexLiteral.java:732)
>   at org.apache.calcite.rex.RexLiteral.lambda$appendAsJava$1(RexLiteral.java:673)
>   at org.apache.calcite.util.Util.asStringBuilder(Util.java:2525)
>   at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:672)
>   at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
>   at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
>   at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
>   at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:990)
>   at org.apache.calcite.rex.RexBuilder.makeSearchArgumentLiteral(RexBuilder.java:1085)
>   at org.apache.calcite.rex.RexBuilder.makeIn(RexBuilder.java:1335)
>   at org.apache.calcite.rex.RexBuilderTest.testMakeIn(RexBuilderTest.java:621)
> {noformat}
> The root cause is that {{RexLiteral#strictTypeName}} uses a different
> type-name strategy from the one {{RexBuilder.makeLiteral}} follows. The best
> fix would be to keep the two rules in sync, but here I only give a simple fix
> because this code path is only used for the Sarg digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273118#comment-17273118
 ] 

Julian Hyde commented on CALCITE-3221:
--

[~amaliujia], Yes, that sounds about right.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.
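
To make the goal of the issue concrete, here is a minimal sketch of a two-input merge that keeps the collation of its inputs. It is not the code from the PR: the class `MergeUnionSketch` and the method `mergeUnion` are hypothetical names, the result is collected into a list only to keep the example short (a real operator would return rows lazily), and duplicate removal assumes the comparator orders by the whole row, so that equal rows end up adjacent.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not Calcite code: merges two inputs sorted by the same comparator. */
final class MergeUnionSketch {
  private MergeUnionSketch() {}

  /**
   * Returns the rows of both inputs in comparator order.
   * When {@code all} is false, a row that compares equal to the previously
   * emitted row is skipped (UNION instead of UNION ALL).
   */
  static <T> List<T> mergeUnion(Iterable<T> left, Iterable<T> right,
      Comparator<? super T> comparator, boolean all) {
    Iterator<T> l = left.iterator();
    Iterator<T> r = right.iterator();
    boolean hasA = l.hasNext();
    boolean hasB = r.hasNext();
    T a = hasA ? l.next() : null;
    T b = hasB ? r.next() : null;
    List<T> out = new ArrayList<>();
    while (hasA || hasB) {
      T next;
      if (!hasB || (hasA && comparator.compare(a, b) <= 0)) {
        // Take from the left input and advance it.
        next = a;
        hasA = l.hasNext();
        a = hasA ? l.next() : null;
      } else {
        // Take from the right input and advance it.
        next = b;
        hasB = r.hasNext();
        b = hasB ? r.next() : null;
      }
      boolean duplicate = !out.isEmpty()
          && comparator.compare(out.get(out.size() - 1), next) == 0;
      if (all || !duplicate) {
        out.add(next);
      }
    }
    return out;
  }
}
```

Only the current head of each input is needed to decide the next output row, which is why such an operator does not have to buffer its inputs.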





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273114#comment-17273114
 ] 

Rui Wang commented on CALCITE-3221:
---

I think a valid rewrite of

select * from (select x from r union all select x from s) order by x offset 100 
limit 10

is the following, where each input keeps its first offset + limit = 110 rows 
and the offset is applied only at the top:

select * from (select x from r order by x limit 110 union all select x from s 
order by x limit 110) order by x offset 100 limit 10
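
The reasoning behind this rewrite can be checked with a small sketch. It is plain Java over integer lists, not Calcite code, and the class and method names (`TopNPushdownSketch`, `direct`, `pushedDown`) are made up for illustration. The idea is that a row in the final window of `offset + limit` rows has fewer than `offset + limit` rows before it in the union, hence also in its own input, so truncating each input to `offset + limit` rows loses nothing. Pushing the offset itself would not be safe in general, because the first 100 rows of the union may all come from one input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration, not Calcite code. */
final class TopNPushdownSketch {
  private TopNPushdownSketch() {}

  /** select * from (r union all s) order by x offset o limit n. */
  static List<Integer> direct(List<Integer> r, List<Integer> s, int offset, int limit) {
    List<Integer> all = new ArrayList<>(r);
    all.addAll(s);
    Collections.sort(all);
    return slice(all, offset, limit);
  }

  /** Same query, but each input is first cut to its smallest (offset + limit) rows. */
  static List<Integer> pushedDown(List<Integer> r, List<Integer> s, int offset, int limit) {
    List<Integer> all = new ArrayList<>(firstSorted(r, offset + limit));
    all.addAll(firstSorted(s, offset + limit));
    Collections.sort(all);
    return slice(all, offset, limit);
  }

  /** order by x limit n on a single input. */
  private static List<Integer> firstSorted(List<Integer> rows, int n) {
    List<Integer> sorted = new ArrayList<>(rows);
    Collections.sort(sorted);
    return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
  }

  /** offset o limit n on an already sorted list. */
  private static List<Integer> slice(List<Integer> sorted, int offset, int limit) {
    int from = Math.min(offset, sorted.size());
    int to = Math.min(offset + limit, sorted.size());
    return new ArrayList<>(sorted.subList(from, to));
  }
}
```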




> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Comment Edited] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273109#comment-17273109
 ] 

Julian Hyde edited comment on CALCITE-3221 at 1/27/21, 7:29 PM:


The description of the case and the commit message should mention Enumerable 
convention and not mention sort. It does not sort; it merges, assuming that the 
input is sorted. (Sort implies that this is a blocking operation, which this is 
not.)

I didn't have time to review the PR in detail, but can you please check the 
following:
* What is the complexity if there is a large number of inputs (say 1000)? (It's 
OK if the complexity isn't good, just say what it is.)
* Suppose you have both a limit and an offset, e.g. {{select * from (select x 
from r union all select x from s) order by x offset 100 limit 10}}. Do you push 
down the offset? I am not sure that that is valid.


was (Author: julianhyde):
The description of the case and the commit message should mention Enumerable 
convention and not mention sort. It does not sort; it merges, assuming that the 
input is sorted.

I didn't have time to review the PR in detail, but can you please check the 
following:
* What is the complexity if there is a large number of inputs (say 1000)? (It's 
OK if the complexity isn't good, just say what it is.)
* Suppose you have both a limit and an offset, e.g. {{select * from (select x 
from r union all select x from s) order by x offset 100 limit 10}}. Do you push 
down the offset? I am not sure that that is valid.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273109#comment-17273109
 ] 

Julian Hyde commented on CALCITE-3221:
--

The description of the case and the commit message should mention Enumerable 
convention and not mention sort. It does not sort; it merges, assuming that the 
input is sorted.

I didn't have time to review the PR in detail, but can you please check the 
following:
* What is the complexity if there is a large number of inputs (say 1000)? (It's 
OK if the complexity isn't good, just say what it is.)
* Suppose you have both a limit and an offset, e.g. {{select * from (select x 
from r union all select x from s) order by x offset 100 limit 10}}. Do you push 
down the offset? I am not sure that that is valid.
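
As a point of reference for the complexity question, one common way to merge many sorted inputs is a heap of the current head rows. The sketch below is an assumption about how this could be done, not a description of the PR: `KWayMergeSketch`, `Head` and `mergeAll` are hypothetical names. With k inputs and N rows in total it performs O(N log k) comparisons and keeps only k head rows in the heap; the output list is there only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration, not Calcite code. */
final class KWayMergeSketch {
  private KWayMergeSketch() {}

  /** One heap entry: the current head row of an input and the rest of that input. */
  private static final class Head<T> {
    final T row;
    final Iterator<T> rest;

    Head(T row, Iterator<T> rest) {
      this.row = row;
      this.rest = rest;
    }
  }

  /** Merges any number of inputs that are all sorted by {@code comparator}. */
  static <T> List<T> mergeAll(List<? extends Iterable<T>> inputs,
      Comparator<? super T> comparator) {
    Comparator<Head<T>> byRow = (h1, h2) -> comparator.compare(h1.row, h2.row);
    PriorityQueue<Head<T>> heap = new PriorityQueue<>(byRow);
    for (Iterable<T> input : inputs) {
      Iterator<T> it = input.iterator();
      if (it.hasNext()) {
        heap.add(new Head<>(it.next(), it));
      }
    }
    List<T> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      // The smallest head row across all inputs is the next output row.
      Head<T> head = heap.poll();
      out.add(head.row);
      if (head.rest.hasNext()) {
        heap.add(new Head<>(head.rest.next(), head.rest));
      }
    }
    return out;
  }
}
```

A simpler alternative is to scan all k heads linearly for every output row, which costs O(N k) comparisons; for 1000 inputs the difference between the two approaches would likely be noticeable.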

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-4242) Wrong plan for nested NOT EXISTS subqueries

2021-01-27 Thread Martin Raszyk (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272890#comment-17272890
 ] 

Martin Raszyk commented on CALCITE-4242:


Are there some updates on this issue?

> Wrong plan for nested NOT EXISTS subqueries
> ---
>
> Key: CALCITE-4242
> URL: https://issues.apache.org/jira/browse/CALCITE-4242
> Project: Calcite
>  Issue Type: Bug
>Reporter: Martin Raszyk
>Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
>     SELECT z FROM R
>     WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
>     LogicalJoin(condition=[=($0, $1)], joinType=[left])
>       LogicalTableScan(table=[[Bug, P]])
>       LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>         LogicalProject(Z=[$1], $f0=[true])
>           LogicalFilter(condition=[IS NULL($2)])
>             LogicalJoin(condition=[true], joinType=[left])
>               LogicalTableScan(table=[[Bug, Q]])
>               LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>                 LogicalProject(Z=[$0], $f0=[true])
>                   LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY t0.Z) AS t3 ON P.X = t3.Z
> WHERE t3.$f1 IS NULL
> {code}
> which yields the (non-empty) table P as the result.
> Hence, the parsed and converted query is not equivalent to the input query.
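
The discrepancy can be reproduced by hand on the sample data. The sketch below is plain Java, not Calcite code, and all names in it are hypothetical. `expected` evaluates the nested NOT EXISTS query directly; `asPlanned` is a hand-written emulation of the plan as read from the report above, so it reflects one reading of that plan rather than actual planner output. It assumes the tables contain no NULL values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical illustration, not Calcite code. */
final class NestedNotExistsSketch {
  private NestedNotExistsSketch() {}

  /** Direct evaluation of the nested NOT EXISTS query. */
  static List<Integer> expected(List<Integer> p, List<Integer> q, List<Integer> r) {
    List<Integer> result = new ArrayList<>();
    for (Integer x : p) {
      // Innermost NOT EXISTS: true when no row of R has z = x.
      boolean noMatchInR = r.stream().noneMatch(z -> Objects.equals(z, x));
      // Middle subquery returns the rows of Q for which the inner NOT EXISTS holds.
      boolean middleHasRows = !q.isEmpty() && noMatchInR;
      // Outer NOT EXISTS keeps x only when the middle subquery is empty.
      if (!middleHasRows) {
        result.add(x);
      }
    }
    return result;
  }

  /** Hand emulation of the reported plan. */
  static List<Integer> asPlanned(List<Integer> p, List<Integer> q, List<Integer> r) {
    // Q LEFT JOIN (aggregate over R) ON TRUE, then "flag IS NULL":
    // only null-padded rows survive, and those exist only when R is empty.
    // Every surviving row carries Z = NULL.
    List<Integer> survivingZ = r.isEmpty()
        ? Collections.nCopies(q.size(), (Integer) null)
        : Collections.<Integer>emptyList();
    List<Integer> result = new ArrayList<>();
    for (Integer x : p) {
      // P LEFT JOIN ... ON X = Z: a NULL Z never equals X, so nothing matches.
      boolean matched = survivingZ.stream().anyMatch(z -> z != null && z.equals(x));
      // The outer "flag IS NULL" filter keeps exactly the unmatched rows of P.
      if (!matched) {
        result.add(x);
      }
    }
    return result;
  }
}
```

With P = {1}, Q = {1} and R empty, `expected` returns an empty list while `asPlanned` returns the single row 1, which matches the behaviour described in the report.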





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272818#comment-17272818
 ] 

Ruben Q L commented on CALCITE-3221:


You're right [~vladimirsitnikov], I have updated the description.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Updated] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-3221:
---
Description: 
Currently, the union operation offered by Calcite (see 
[EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
 "breaks" the collation (if any) of its inputs.

The goal of this issue is to create a new union algorithm 
(EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
same collation, will return the union / union all result respecting this 
collation.

Most likely the implementation of the merge join can be useful.

  was:
Currently, the union operation offered by Calcite (see 
[EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
 "breaks" the collation (if any) of its inputs.

The goal of this issue is to create a new union algorithm 
(EnumerableMergeUnion) that, given the fact that its inputs are sorted by a 
certain collation, will return the union / union all result respecting this 
collation.

Most likely the implementation of the merge join can be useful.


> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by the 
> same collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Created] (CALCITE-4480) Make EnumerableDefaults#union a non-blocking operation

2021-01-27 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4480:
--

 Summary: Make EnumerableDefaults#union a non-blocking operation
 Key: CALCITE-4480
 URL: https://issues.apache.org/jira/browse/CALCITE-4480
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.26.0
Reporter: Vladimir Sitnikov


Currently, EnumerableDefaults#union buffers all the rows before it returns the 
first of them.

Pros:
1) Faster iteration in case enumerable is queried multiple times

Cons:
1) The implementation does not work with infinite streams
2) Keeps memory even after iteration is finished

---

An alternative might be something like

{code:java}
  public static <T> Enumerable<T> union(Enumerable<T> source0,
      Enumerable<T> source1) {
    Enumerable<T> unionAll = concat(source0, source1);
    return new AbstractEnumerable<T>() {
      @Override public Enumerator<T> enumerator() {
        Set<T> set = new HashSet<>();
        return EnumerableDefaults.where(unionAll, set::add).enumerator();
      }
    };
  }
{code}

Pros:
1) Supports infinite streams
2) In theory, it could reset hashSet after iteration finishes

Cons:
1) Slower iteration in case enumerable is queried multiple times (hashSet is 
rebuilt every time)
2) concat+AbstractEnumerable might cost CPU cycles
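
The same idea can be tried out without linq4j. The following is a self-contained sketch on top of `java.util.Iterator`; the class `LazyUnionSketch` is a hypothetical stand-in and does not use the Enumerable API. Every call to `iterator()` creates a fresh set of seen rows, so the set becomes unreachable once the iterator is dropped, and rows are produced as soon as they are read from the inputs.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical illustration, not linq4j code. */
final class LazyUnionSketch {
  private LazyUnionSketch() {}

  /** Returns the distinct rows of source0 followed by source1, computed lazily. */
  static <T> Iterable<T> union(Iterable<T> source0, Iterable<T> source1) {
    return () -> new Iterator<T>() {
      private final Iterator<T> first = source0.iterator();
      private final Iterator<T> second = source1.iterator();
      private final Set<T> seen = new HashSet<>();
      private T pending;
      private boolean ready;

      /** Reads ahead until an unseen row is found or both inputs are exhausted. */
      private void advance() {
        while (!ready) {
          Iterator<T> current = first.hasNext() ? first : second;
          if (!current.hasNext()) {
            return;
          }
          T candidate = current.next();
          if (seen.add(candidate)) {
            pending = candidate;
            ready = true;
          }
        }
      }

      @Override public boolean hasNext() {
        advance();
        return ready;
      }

      @Override public T next() {
        advance();
        if (!ready) {
          throw new NoSuchElementException();
        }
        ready = false;
        T result = pending;
        pending = null;
        return result;
      }
    };
  }
}
```

Note that an infinite input works only as long as it keeps producing new rows: if it repeats already seen rows forever, the read-ahead loop never terminates.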







[jira] [Updated] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-3221:
---
Description: 
Currently, the union operation offered by Calcite (see 
[EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
 "breaks" the collation (if any) of its inputs.

The goal of this issue is to create a new union algorithm 
(EnumerableMergeUnion) that, given the fact that its inputs are sorted by a 
certain collation, will return the union / union all result respecting this 
collation.

Most likely the implementation of the merge join can be useful.

  was:
Currently, the union operation offered by Calcite is based on a {{HashSet}} 
(see 
[EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
 and necessitates reading in memory all rows before returning a single result.  
 

Apart from increased memory consumption the operator is blocking and also 
destroys the order of its inputs.  

The goal of this issue is to add a new union algorithm (EnumerableMergeUnion ?) 
exploiting the fact that the inputs are sorted which consumes less memory and 
retains the order of its inputs.   

Most likely the implementation of the merge join can be useful.


> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm 
> (EnumerableMergeUnion) that, given the fact that its inputs are sorted by a 
> certain collation, will return the union / union all result respecting this 
> collation.
> Most likely the implementation of the merge join can be useful.





[jira] [Comment Edited] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272805#comment-17272805
 ] 

Vladimir Sitnikov edited comment on CALCITE-3221 at 1/27/21, 12:02 PM:
---

{quote} But this does not collide with the current ticket's purpose: a new 
MergeUnion operator that keeps the collation from its (sorted) inputs{quote}
Please update the issue description so the intention matches the 
implementation. Currently, half of the description mentions 
Enumerable#union, which is misleading.


was (Author: vladimirsitnikov):
{quote} But this does not collide with the current ticket's purpose: a new 
MergeUnion operator that keeps the collation from its (sorted) inputs{quote}
Please update the issue description so the intention matches the implementation.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272805#comment-17272805
 ] 

Vladimir Sitnikov commented on CALCITE-3221:


{quote} But this does not collide with the current ticket's purpose: a new 
MergeUnion operator that keeps the collation from its (sorted) inputs{quote}
Please update the issue description so the intention matches the implementation.

> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Commented] (CALCITE-3221) Add a sort-merge union algorithm

2021-01-27 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272786#comment-17272786
 ] 

Ruben Q L commented on CALCITE-3221:


I agree with [~julianhyde]: [~vladimirsitnikov]'s suggestion is a valid 
proposal for a (separate) optimization of {{EnumerableDefaults#union}}. But 
this does not collide with the current ticket's purpose: a new MergeUnion 
operator that keeps the collation from its (sorted) inputs.

BTW, I think the PR is in good shape. If anybody else wants to take a final 
look, please go ahead. Otherwise I plan to merge it in the coming days.


> Add a sort-merge union algorithm
> 
>
> Key: CALCITE-3221
> URL: https://issues.apache.org/jira/browse/CALCITE-3221
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Ruben Q L
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite is based on a {{HashSet}} 
> (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  and necessitates reading in memory all rows before returning a single 
> result.   
> Apart from increased memory consumption the operator is blocking and also 
> destroys the order of its inputs.  
> The goal of this issue is to add a new union algorithm (EnumerableMergeUnion 
> ?) exploiting the fact that the inputs are sorted which consumes less memory 
> and retains the order of its inputs.   
> Most likely the implementation of the merge join can be useful.





[jira] [Updated] (CALCITE-4479) "vFloat in (1.0, 2.0)" throws UnsupportedOperationException

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4479:

Labels: pull-request-available  (was: )

> "vFloat in (1.0, 2.0)" throws UnsupportedOperationException
> ---
>
> Key: CALCITE-4479
> URL: https://issues.apache.org/jira/browse/CALCITE-4479
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.26.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.27.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check this test in {{RexBuilderTest}}:
> {code:java}
> @Test void testMakeIn() {
>   final RelDataTypeFactory typeFactory =
>       new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
>   final RexBuilder rexBuilder = new RexBuilder(typeFactory);
>   final RelDataType floatType =
>       typeFactory.createSqlType(SqlTypeName.FLOAT);
>   RexNode left = rexBuilder.makeInputRef(floatType, 0);
>   final RexNode literal1 = rexBuilder.makeLiteral(1.0f, floatType);
>   final RexNode literal2 = rexBuilder.makeLiteral(2.0f, floatType);
>   RexNode inCall =
>       rexBuilder.makeIn(left, ImmutableList.of(literal1, literal2));
>   assertThat(inCall.getKind(), is(SqlKind.SEARCH));
> }
> {code}
> The stacktrace is:
> {noformat}
> class org.apache.calcite.sql.type.SqlTypeName: FLOAT
> java.lang.UnsupportedOperationException: class org.apache.calcite.sql.type.SqlTypeName: FLOAT
>   at org.apache.calcite.util.Util.needToImplement(Util.java:1085)
>   at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:726)
>   at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
>   at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
>   at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
>   at org.apache.calcite.rex.RexLiteral.toLiteral(RexLiteral.java:762)
>   at org.apache.calcite.rex.RexLiteral.lambda$printSarg$4(RexLiteral.java:733)
>   at org.apache.calcite.util.RangeSets$Printer.singleton(RangeSets.java:409)
>   at org.apache.calcite.util.RangeSets.forEach(RangeSets.java:249)
>   at org.apache.calcite.util.Sarg.lambda$printTo$0(Sarg.java:119)
>   at org.apache.calcite.linq4j.Ord.forEach(Ord.java:157)
>   at org.apache.calcite.util.Sarg.printTo(Sarg.java:115)
>   at org.apache.calcite.rex.RexLiteral.printSarg(RexLiteral.java:732)
>   at org.apache.calcite.rex.RexLiteral.lambda$appendAsJava$1(RexLiteral.java:673)
>   at org.apache.calcite.util.Util.asStringBuilder(Util.java:2525)
>   at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:672)
>   at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
>   at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
>   at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
>   at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:990)
>   at org.apache.calcite.rex.RexBuilder.makeSearchArgumentLiteral(RexBuilder.java:1085)
>   at org.apache.calcite.rex.RexBuilder.makeIn(RexBuilder.java:1335)
>   at org.apache.calcite.rex.RexBuilderTest.testMakeIn(RexBuilderTest.java:621)
> {noformat}
> The root cause is that {{RexLiteral#strictTypeName}} uses a different
> type-name strategy from the one {{RexBuilder.makeLiteral}} follows. The best
> fix would be to keep the two rules in sync, but here I only give a simple fix
> because this code path is only used for the Sarg digest.





[jira] [Created] (CALCITE-4479) "vFloat in (1.0, 2.0)" throws UnsupportedOperationException

2021-01-27 Thread Danny Chen (Jira)
Danny Chen created CALCITE-4479:
---

 Summary: "vFloat in (1.0, 2.0)" throws 
UnsupportedOperationException
 Key: CALCITE-4479
 URL: https://issues.apache.org/jira/browse/CALCITE-4479
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.26.0
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 1.27.0


Check this test in {{RexBuilderTest}}:

{code:java}
@Test void testMakeIn() {
  final RelDataTypeFactory typeFactory =
      new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
  final RexBuilder rexBuilder = new RexBuilder(typeFactory);
  final RelDataType floatType = typeFactory.createSqlType(SqlTypeName.FLOAT);
  RexNode left = rexBuilder.makeInputRef(floatType, 0);
  final RexNode literal1 = rexBuilder.makeLiteral(1.0f, floatType);
  final RexNode literal2 = rexBuilder.makeLiteral(2.0f, floatType);
  RexNode inCall =
      rexBuilder.makeIn(left, ImmutableList.of(literal1, literal2));
  assertThat(inCall.getKind(), is(SqlKind.SEARCH));
}
{code}

The stacktrace is:

{noformat}
class org.apache.calcite.sql.type.SqlTypeName: FLOAT
java.lang.UnsupportedOperationException: class org.apache.calcite.sql.type.SqlTypeName: FLOAT
  at org.apache.calcite.util.Util.needToImplement(Util.java:1085)
  at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:726)
  at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
  at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
  at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
  at org.apache.calcite.rex.RexLiteral.toLiteral(RexLiteral.java:762)
  at org.apache.calcite.rex.RexLiteral.lambda$printSarg$4(RexLiteral.java:733)
  at org.apache.calcite.util.RangeSets$Printer.singleton(RangeSets.java:409)
  at org.apache.calcite.util.RangeSets.forEach(RangeSets.java:249)
  at org.apache.calcite.util.Sarg.lambda$printTo$0(Sarg.java:119)
  at org.apache.calcite.linq4j.Ord.forEach(Ord.java:157)
  at org.apache.calcite.util.Sarg.printTo(Sarg.java:115)
  at org.apache.calcite.rex.RexLiteral.printSarg(RexLiteral.java:732)
  at org.apache.calcite.rex.RexLiteral.lambda$appendAsJava$1(RexLiteral.java:673)
  at org.apache.calcite.util.Util.asStringBuilder(Util.java:2525)
  at org.apache.calcite.rex.RexLiteral.appendAsJava(RexLiteral.java:672)
  at org.apache.calcite.rex.RexLiteral.toJavaString(RexLiteral.java:427)
  at org.apache.calcite.rex.RexLiteral.computeDigest(RexLiteral.java:289)
  at org.apache.calcite.rex.RexLiteral.<init>(RexLiteral.java:233)
  at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:990)
  at org.apache.calcite.rex.RexBuilder.makeSearchArgumentLiteral(RexBuilder.java:1085)
  at org.apache.calcite.rex.RexBuilder.makeIn(RexBuilder.java:1335)
  at org.apache.calcite.rex.RexBuilderTest.testMakeIn(RexBuilderTest.java:621)
{noformat}

The root cause is that {{RexLiteral#strictTypeName}} uses a different 
type-name strategy from the one {{RexBuilder.makeLiteral}} follows. The best 
fix would be to keep the two rules in sync, but here I only give a simple fix 
because this code path is only used for the Sarg digest.


