date:20200219

[jira] [Commented] (CALCITE-3802) Calcite Elasticsearch adapter should encode the URI before send the request

2020-02-19 Thread jerryleooo (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040714#comment-17040714
 ] 

jerryleooo commented on CALCITE-3802:
-

Noted, removed~

> Calcite Elasticsearch adapter should encode the URI before send the request
> ---
>
> Key: CALCITE-3802
> URL: https://issues.apache.org/jira/browse/CALCITE-3802
> Project: Calcite
>  Issue Type: Bug
>  Components: elasticsearch-adapter
>Affects Versions: 1.21.0
>Reporter: jerryleooo
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2020-02-17-16-09-20-794.png
>
>   Original Estimate: 1h
>  Time Spent: 10m
>  Remaining Estimate: 50m
>
> In 
> [https://github.com/apache/calcite/blob/master/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/ElasticsearchTransport.java#L121],
> when the indexName contains special characters, the request fails because 
> the Elasticsearch server returns an HTTP 400 error.
>  
> {code:scala}
> import java.sql.DriverManager
> import java.util
> import org.apache.calcite.adapter.elasticsearch.ElasticsearchSchemaFactory
> import org.apache.calcite.jdbc.CalciteConnection
>
> val connection = DriverManager.getConnection("jdbc:calcite:")
> val calciteConnection = connection.asInstanceOf[CalciteConnection]
> val rootSchema = calciteConnection.getRootSchema()
> // "coordinates" maps an Elasticsearch host to its port; the host below is a placeholder.
> val esProperties = new util.HashMap[String, AnyRef]()
> esProperties.put("coordinates", "{'elasticsearch url': 80}")
> rootSchema.add("es",
>   new ElasticsearchSchemaFactory().create(rootSchema, "es", esProperties))
> {code}
> !image-2020-02-17-16-09-20-794.png!
>  
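A minimal sketch of the kind of encoding the issue asks for, assuming the index name ends up as a single path segment of the request URI. The class `IndexPathEncoder`, both helper methods and the `/_search` suffix are hypothetical and are not taken from `ElasticsearchTransport`; the date-math index name is only an illustration of "special characters".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class IndexPathEncoder {
  /** Percent-encodes one URI path segment (hypothetical helper, not Calcite API). */
  static String encodeSegment(String segment) {
    try {
      // URLEncoder targets form encoding, so a space becomes "+";
      // inside a path segment it has to be "%20" instead.
      return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  /** Builds a request path for the index; the "/_search" suffix is illustrative. */
  static String searchPath(String indexName) {
    return "/" + encodeSegment(indexName) + "/_search";
  }

  public static void main(String[] args) {
    // An index name with characters that are not legal in a raw URI path.
    System.out.println(searchPath("<logs-{now/d}>"));
  }
}
```

A literal `+` in the name is already turned into `%2B` by `URLEncoder` before the replacement runs, so the `replace` call only affects encoded spaces.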



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (CALCITE-3802) Calcite Elasticsearch adapter should encode the URI before send the request

2020-02-19 Thread jerryleooo (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jerryleooo updated CALCITE-3802:

Fix Version/s: (was: 1.22.0)


[jira] [Commented] (CALCITE-3802) Calcite Elasticsearch adapter should encode the URI before send the request

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040678#comment-17040678
 ] 

Danny Chen commented on CALCITE-3802:
-

1.22.0 is going to be released soon; I would suggest removing the fix version 
tag unless you have a strong need to get the patch into 1.22.0.

> Calcite Elasticsearch adapter should encode the URI before send the request
> ---
>
> Key: CALCITE-3802
> URL: https://issues.apache.org/jira/browse/CALCITE-3802
> Project: Calcite
>  Issue Type: Bug
>  Components: elasticsearch-adapter
>Affects Versions: 1.21.0
>Reporter: jerryleooo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
> Attachments: image-2020-02-17-16-09-20-794.png
>
>   Original Estimate: 1h
>  Time Spent: 10m
>  Remaining Estimate: 50m
>
> In 
> [https://github.com/apache/calcite/blob/master/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/ElasticsearchTransport.java#L121],
> when the indexName contains special characters, the request fails because 
> the Elasticsearch server returns an HTTP 400 error.
>  
> {code:scala}
> import java.sql.DriverManager
> import java.util
> import org.apache.calcite.adapter.elasticsearch.ElasticsearchSchemaFactory
> import org.apache.calcite.jdbc.CalciteConnection
>
> val connection = DriverManager.getConnection("jdbc:calcite:")
> val calciteConnection = connection.asInstanceOf[CalciteConnection]
> val rootSchema = calciteConnection.getRootSchema()
> // "coordinates" maps an Elasticsearch host to its port; the host below is a placeholder.
> val esProperties = new util.HashMap[String, AnyRef]()
> esProperties.put("coordinates", "{'elasticsearch url': 80}")
> rootSchema.add("es",
>   new ElasticsearchSchemaFactory().create(rootSchema, "es", esProperties))
> {code}
> !image-2020-02-17-16-09-20-794.png!
>  




[jira] [Comment Edited] (CALCITE-3737) HOP Table-valued Function

2020-02-19 Thread Rui Wang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040664#comment-17040664
 ] 

Rui Wang edited comment on CALCITE-3737 at 2/20/20 6:29 AM:


[~danny0405]

It's unlikely to be in 1.22.0. Removed 1.22 from the fix version.


was (Author: amaliujia):
[~danny0405]

It's less likely been in 1.22.0. Remove 1.22 from the fix version.

> HOP Table-valued Function
> -
>
> Key: CALCITE-3737
> URL: https://issues.apache.org/jira/browse/CALCITE-3737
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Hopping windows place intervals of a fixed size evenly spaced across event 
> time. Most importantly, in the most common use a given event time timestamp 
> will generally fall into more than one window.
> The table-valued function Hop may produce zero, one, or multiple rows 
> corresponding to each row of input.  Hop takes four required parameters and 
> one optional parameter. All parameters are analogous to those for Tumble 
> except for hopsize, which specifies the duration between the starting points 
> (and endpoints) of the hopping windows, allowing for overlapping windows 
> (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
> {code:java}
> Hop (data , timecol , dur, hopsize)
> {code}
> The return value of Hop is a relation that includes all columns of data as 
> well as additional event time columns wstart and wend. Here is an example 
> (from https://s.apache.org/streaming-beam-sql ):
> {code:sql}
> SELECT *
>   FROM Hop (
>     data    => TABLE Bids ,
>     timecol => DESCRIPTOR ( bidtime ) ,
>     dur     => INTERVAL '10' MINUTES ,
>     hopsize => INTERVAL '5' MINUTES );
> ------------------------------------------
> | wstart | wend | bidtime | price | item |
> ------------------------------------------
> | 8:00   | 8:10 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:11    | $3    | B    |
> | 8:10   | 8:20 | 8:11    | $3    | B    |
> | 8:00   | 8:10 | 8:05    | $4    | C    |
> | 8:05   | 8:15 | 8:05    | $4    | C    |
> | 8:00   | 8:10 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:17    | $6    | F    |
> | 8:15   | 8:25 | 8:17    | $6    | F    |
> ------------------------------------------
> {code}
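The window assignment in the table can be sketched in a few lines of Java. This is an illustration only, not Calcite's implementation: the class `HopWindows` and the method `windowsFor` are hypothetical, window starts are assumed to be aligned to midnight, and windows are treated as half-open intervals [wstart, wend), which is what the rows for bidtime 8:05 suggest.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

class HopWindows {
  /** Returns {wstart, wend} for every hopping window that contains t, earliest first. */
  static List<LocalTime[]> windowsFor(LocalTime t, Duration dur, Duration hop) {
    long tMin = t.toSecondOfDay() / 60;
    long durMin = dur.toMinutes();
    long hopMin = hop.toMinutes();
    List<LocalTime[]> result = new ArrayList<>();
    // Latest window start at or before t (starts assumed aligned to midnight).
    long lastStart = Math.floorDiv(tMin, hopMin) * hopMin;
    // Walk back one hop at a time while [start, start + dur) still contains t.
    for (long start = lastStart; start >= 0 && start + durMin > tMin; start -= hopMin) {
      result.add(0, new LocalTime[] {
          LocalTime.ofSecondOfDay(start * 60),
          LocalTime.ofSecondOfDay((start + durMin) * 60)});
    }
    return result;
  }

  public static void main(String[] args) {
    // bidtime 8:07 with dur = 10 minutes and hopsize = 5 minutes.
    for (LocalTime[] w : windowsFor(LocalTime.of(8, 7),
        Duration.ofMinutes(10), Duration.ofMinutes(5))) {
      System.out.println(w[0] + " - " + w[1]);
    }
  }
}
```

With hopsize > dur the loop can finish without adding anything, which corresponds to the "zero rows" case mentioned in the description: a timestamp that falls into a gap between windows.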




[jira] [Commented] (CALCITE-3737) HOP Table-valued Function

2020-02-19 Thread Rui Wang (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040664#comment-17040664
 ] 

Rui Wang commented on CALCITE-3737:
---

[~danny0405]

It's less likely been in 1.22.0. Remove 1.22 from the fix version.


[jira] [Updated] (CALCITE-3737) HOP Table-valued Function

2020-02-19 Thread Rui Wang (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated CALCITE-3737:
--
Fix Version/s: (was: 1.22.0)


[jira] [Commented] (CALCITE-3802) Calcite Elasticsearch adapter should encode the URI before send the request

2020-02-19 Thread jerryleooo (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040661#comment-17040661
 ] 

jerryleooo commented on CALCITE-3802:
-

[~danny0405] Sorry, I am not very familiar with the process. Since 1.21 has 
been released, I assumed this could be fixed in 1.22. Maybe this needs further 
discussion? 


[jira] [Updated] (CALCITE-3762) AggregateProjectPullUpConstantsRule causes fields to be out of order

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3762:

Fix Version/s: (was: 1.22.0)

> AggregateProjectPullUpConstantsRule causes fields to be out of order
> 
>
> Key: CALCITE-3762
> URL: https://issues.apache.org/jira/browse/CALCITE-3762
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.0, 1.16.0
>Reporter: hezhang
>Priority: Major
> Attachments: calcite-3762.patch, image-2020-02-01-01-29-49-479.png, 
> image-2020-02-01-01-33-54-111.png
>
>
> The SQL:
> {code:sql}
> SELECT *
> FROM (
>   SELECT plat, category, rid, populary_num
>   FROM panda_com.crawler_anchor
>   WHERE par_date = '20180819'
>     AND plat = 'huya'
>     AND rid = 'meijiao'
> ) a
> JOIN (
>   SELECT DISTINCT
>     'huya' plat,
>     edwin.privatehost,
>     edwin.profileroom
>   FROM panda_com.ol_huya_isOnline edwin
>   WHERE par_date = '20180819'
> ) m9
> ON a.rid = m9.privatehost
>   AND a.plat = m9.plat
> {code}
> The result:
>  
> {code}
> huya yule meijiao 30 huya 10001242 meijiao
> {code}
>  
> But the desired result is:
>  
> {code}
> huya yule meijiao 30 huya meijiao 10001242
> {code}
>  
> *Cause:*
> HepPlanner uses AggregateProjectPullUpConstantsRule:
> !image-2020-02-01-01-29-49-479.png!
> After applying the fix patch:
> !image-2020-02-01-01-33-54-111.png!




[jira] [Updated] (CALCITE-2589) VolcanoPlanner#fireRules and VolcanoRuleCall#matchRecurse should ignore known-to-be-unimportant relations

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-2589:

Fix Version/s: (was: 1.22.0)

> VolcanoPlanner#fireRules and VolcanoRuleCall#matchRecurse should ignore 
> known-to-be-unimportant relations
> -
>
> Key: CALCITE-2589
> URL: https://issues.apache.org/jira/browse/CALCITE-2589
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>
> {{call.getPlanner().setImportance}} is used to avoid using 
> known-to-be-inefficient relations; however, the importance check is 
> performed very late.
> The check is performed in org.apache.calcite.plan.volcano.RuleQueue#skipMatch, 
> when ruleCalls have already been created.
> I suggest moving the check into VolcanoPlanner#fireRules and 
> VolcanoRuleCall#matchRecurse.
> It would reduce the number of "possible" rule executions.
> Note: calling setImportance BEFORE transformTo would also help to 
> filter out unimportant rule calls early.




[jira] [Updated] (CALCITE-3576) Remove Enumerable convention check in FilterIntoJoinRule

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3576:

Fix Version/s: (was: 1.22.0)

> Remove Enumerable convention check in FilterIntoJoinRule
> 
>
> Key: CALCITE-3576
> URL: https://issues.apache.org/jira/browse/CALCITE-3576
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>





[jira] [Commented] (CALCITE-3224) New RexNode-to-Expression CodeGen Implementation

2020-02-19 Thread Feng Zhu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040658#comment-17040658
 ] 

Feng Zhu commented on CALCITE-3224:
---

The codegen component is not intuitive, and it seems that committers have been 
busy recently.

Therefore, it may be difficult to resolve this in release 1.22. I think we can 
change the version to 1.23 for now.

> New RexNode-to-Expression CodeGen Implementation
> 
>
> Key: CALCITE-3224
> URL: https://issues.apache.org/jira/browse/CALCITE-3224
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Feng Zhu
>Assignee: Feng Zhu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.22.0
>
> Attachments: RexNode-CodeGen.pdf, codegen.png
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> h3. *Background*
> The current RexNode-to-Expression implementation relies on BlockBuilder's 
> incorrect “optimizations” to inline unsafe operations. As illustrated in 
> CALCITE-3173, when this cooperation breaks down in some special cases, it 
> causes exceptions such as NPEs (for example CALCITE-3142, CALCITE-3143 and 
> CALCITE-3150).
> Though we can fix these problems within the current implementation framework 
> with some effort, as in the PR for CALCITE-3142, the logic will become more and 
> more complex. To pursue a thorough and elegant solution, we implement a new 
> one. Moreover, it also ensures correctness for non-optimized code.
> h3. *Major Features*
>  * *Visitor Pattern*: Each RexNode is visited only once, bottom-up, 
> rather than being visited recursively many times with different 
> NullAs settings.
>  * *Conditional Semantics*: Correctness is guaranteed naturally, even 
> without BlockBuilder’s “optimizations”. Each line of code generated for a 
> RexNode is null-safe.
>  * *Interface Compatibility*: The implementation only updates 
> _RexToLixTranslator_ and _RexImpTable_. Interfaces such as CallImplementor 
> remain unchanged.
> h3. *Implementation*
> For each RexNode, the visitor generally generates two declaration 
> statements, one for the value and one for its nullness. The code snippet 
> looks like:
> {code:java}
> {valueVariable} = {valueExpression}
> {isNullVariable} = {isNullExpression}
> {code}
> The visitor’s result is the variable pair (*_isNullVariable_*, 
> *_valueVariable_*).
> *Other changes:*
> (1) Reimplement the different RexCall implementations (e.g., CastImplementor, 
> BinaryImplementor, etc.) as separate files, move them into the newly 
> created package _org.apache.calcite.adapter.enumerable.rex_, and organize 
> them in RexCallImpTable.
> (2) Move some utility functions into EnumUtils.
> h3. *Example Demonstration*
> Take a simple test case as an example, in which the "commission" column is 
> nullable.
> {code:java}
> @Test public void testNPE() {
>   CalciteAssert.hr()
>       .query("select \"commission\" + 10 as s\n"
>           + "from \"hr\".\"emps\"")
>       .returns("S=1010\nS=510\nS=null\nS=260\n");
> }
> {code}
> The codegen process and the non-optimized code are shown in the figure 
> below.
> !codegen.png!
>  # When visiting *RexInputRef (commission)*, the visitor generates three 
> lines of code; the result is a pair of ParameterExpressions (*_input_isNull_*, 
> *_input_value_*).
>  # Then the visitor visits *RexLiteral (10)* and generates two lines of code. 
> The result is (*_literal_isNull_*, *_literal_value_*).
>  # After that, when visiting *RexCall(Add)*, (_*input_isNull*_, 
> _*input_value*_) and (_*literal_isNull*_, _*literal_value*_) can be used to 
> implement the logic. The visitor also generates two lines of code and returns 
> the variable pair.
> In the end, the result Expression is constructed from 
> (_*binary_call_isNull*_, _*binary_call_value*_).
> [^RexNode-CodeGen.pdf]
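To make the three steps concrete, here is a hand-written Java sketch of what null-safe declarations for "commission" + 10 could look like. It is not the code Calcite generates: the class `NullSafeAdd` and the method `commissionPlusTen` are hypothetical, the variable names are borrowed from the description above, and the placeholder value 0 used on the null branch is an assumption.

```java
class NullSafeAdd {
  /** Mimics the per-node (isNull, value) declarations for "commission" + 10. */
  static Integer commissionPlusTen(Integer commission) {
    // RexInputRef (commission): read the input, then declare its null flag and value.
    final Integer input = commission;
    final boolean input_isNull = input == null;
    // Unbox only when non-null; 0 is a placeholder that is never used as a result.
    final int input_value = input_isNull ? 0 : input.intValue();

    // RexLiteral (10): a literal is never null.
    final boolean literal_isNull = false;
    final int literal_value = 10;

    // RexCall (Add): the call is null if any operand is null.
    final boolean binary_call_isNull = input_isNull || literal_isNull;
    final int binary_call_value = binary_call_isNull ? 0 : input_value + literal_value;

    // The final Expression is built from the (isNull, value) pair.
    return binary_call_isNull ? null : Integer.valueOf(binary_call_value);
  }

  public static void main(String[] args) {
    System.out.println("S=" + commissionPlusTen(1000));
    System.out.println("S=" + commissionPlusTen(null));
  }
}
```

Because every declaration is guarded by its own null flag, each line is safe to execute on its own, without relying on a later optimization pass to reorder or inline it.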




[jira] [Commented] (CALCITE-1581) UDTF like in hive

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040656#comment-17040656
 ] 

Danny Chen commented on CALCITE-1581:
-

Removed the fix version tag because it seems impossible to resolve this in 
release 1.22.

> UDTF like in hive
> -
>
> Key: CALCITE-1581
> URL: https://issues.apache.org/jira/browse/CALCITE-1581
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Xiaoyong Deng
>Assignee: pengzhiwei
>Priority: Major
>  Labels: pull-request-available, udtf
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Support one row in and multi-column/multi-row out (one-to-many mapping), just 
> like UDTFs in Hive.
> The query would look like this:
> {code}
> select
>   func(c0, c1) as (f0, f1, f2)
> from table_name;
> {code}
> c0 and c1 are columns of 'table_name'. f0, f1 and f2 are newly generated columns.




[jira] [Commented] (CALCITE-3592) Implement BITNOT scalar function

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040657#comment-17040657
 ] 

Danny Chen commented on CALCITE-3592:
-

Is this issue going to be resolved in release 1.22?

> Implement BITNOT scalar  function
> -
>
> Key: CALCITE-3592
> URL: https://issues.apache.org/jira/browse/CALCITE-3592
> Project: Calcite
>  Issue Type: Sub-task
>  Components: babel
>Affects Versions: 1.21.0
>Reporter: hailong wang
>Assignee: hailong wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement the BITNOT scalar function, which supports the tinyint, smallint, int, 
> bigint, binary and varbinary types.
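As a rough illustration of the intended semantics, the Java sketch below maps each listed SQL type to a Java type and applies bitwise complement. It is not the proposed Calcite implementation: the class `BitNot` is hypothetical, the per-byte complement for binary and varbinary is an assumption about how the function would treat byte strings, and SQL NULL handling is left out.

```java
class BitNot {
  // tinyint, smallint, int and bigint map to byte, short, int and long.
  static byte bitNot(byte v) { return (byte) ~v; }
  static short bitNot(short v) { return (short) ~v; }
  static int bitNot(int v) { return ~v; }
  static long bitNot(long v) { return ~v; }

  /** binary / varbinary: complement every byte (assumed semantics). */
  static byte[] bitNot(byte[] v) {
    byte[] out = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = (byte) ~v[i];
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(bitNot(5));
    System.out.println(bitNot(0L));
  }
}
```

The narrowing casts on the `byte` and `short` overloads are needed because Java promotes both to `int` before applying `~`.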




[jira] [Updated] (CALCITE-1581) UDTF like in hive

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-1581:

Fix Version/s: (was: 1.22.0)


[jira] [Commented] (CALCITE-3142) An NPE when rounding a nullable numeric

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040654#comment-17040654
 ] 

Danny Chen commented on CALCITE-3142:
-

Removed the fix version tag because it seems impossible to resolve this in 
release 1.22.

> An NPE when rounding a nullable numeric
> ---
>
> Key: CALCITE-3142
> URL: https://issues.apache.org/jira/browse/CALCITE-3142
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Mohamed Mohsen
>Assignee: Feng Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: newcodegen.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The following query throws an NPE in the generated code because it assumes the 
> divided value to be an initialized Java object (not null), which is fine for 
> the first row, but not for the second.
> {code:sql}
> SELECT ROUND(CAST((X/Y) AS NUMERIC), 2) FROM (VALUES (1, 2), (NULLIF(5, 5), 
> NULLIF(5, 5))) A(X, Y){code}
> If I modify the query a little bit, it runs ok:
>  – No casting
> {code:sql}
> SELECT ROUND((X/Y), 2) FROM (VALUES (1, 2), (NULLIF(5, 5), NULLIF(5, 5))) 
> A(X, Y){code}
> – No rounding
> {code:sql}
> SELECT (X/Y)::NUMERIC FROM (VALUES (1, 2), (NULLIF(5, 5), NULLIF(5, 5))) A(X, 
> Y){code}
> +This is the optimized generated code+
> {code:java}
> final Object[] current = (Object[]) inputEnumerator.current();
> final Integer inp0_ = (Integer) current[0];
> final Integer inp1_ = (Integer) current[1];
> final java.math.BigDecimal v1 = new java.math.BigDecimal(
>   inp0_.intValue() / inp1_.intValue()); // <<< NPE
> return inp0_ == null || inp1_ == null ? (java.math.BigDecimal) null : 
> org.apache.calcite.runtime.SqlFunctions.sround(v1, 2);{code}
> +This is the non-optimized one+
> {code:java}
> final Object[] current = (Object[]) inputEnumerator.current();
> final Integer inp0_ = (Integer) current[0];
> final boolean inp0__unboxed = inp0_ == null;
> final Integer inp1_ = (Integer) current[1];
> final boolean inp1__unboxed = inp1_ == null;
> final boolean v = inp0__unboxed || inp1__unboxed;
> final int inp0__unboxed0 = inp0_.intValue(); // <<< NPE
> final int inp1__unboxed0 = inp1_.intValue(); // <<< NPE
> final int v0 = inp0__unboxed0 / inp1__unboxed0;
> final java.math.BigDecimal v1 = new java.math.BigDecimal(
>   v0);
> final java.math.BigDecimal v2 = v ? (java.math.BigDecimal) null : 
> org.apache.calcite.runtime.SqlFunctions.sround(v1, 2);
> return v2;{code}
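Both generated variants unbox `inp0_` and `inp1_` before the null check takes effect. A minimal hand-written sketch of the same computation with the check moved first is shown below. It is not Calcite output: the class `NullSafeRound` and the method `roundedQuotient` are hypothetical, and `BigDecimal.setScale` with `HALF_UP` stands in for `SqlFunctions.sround`, whose exact rounding behaviour is not asserted here.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class NullSafeRound {
  /** ROUND(CAST((X/Y) AS NUMERIC), 2) for nullable integer inputs. */
  static BigDecimal roundedQuotient(Integer x, Integer y) {
    // Decide nullness first; unbox only on the non-null path.
    if (x == null || y == null) {
      return null;
    }
    // Integer division, as in the generated code above (1 / 2 is 0).
    final BigDecimal quotient = new BigDecimal(x.intValue() / y.intValue());
    // Stand-in for sround(v1, 2); the rounding mode is an assumption.
    return quotient.setScale(2, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    System.out.println(roundedQuotient(1, 2));
    System.out.println(roundedQuotient(null, null));
  }
}
```

Division by zero is not handled in this sketch; it would still throw an `ArithmeticException`.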




[jira] [Updated] (CALCITE-3142) An NPE when rounding a nullable numeric

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3142:

Fix Version/s: (was: 1.22.0)


[jira] [Updated] (CALCITE-3369) In LatticeSuggester, recommend lattices based on UNION queries

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3369:

Fix Version/s: (was: 1.22.0)

> In LatticeSuggester, recommend lattices based on UNION queries
> --
>
> Key: CALCITE-3369
> URL: https://issues.apache.org/jira/browse/CALCITE-3369
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>
> In LatticeSuggester, recommend lattices based on UNION, EXCEPT and INTERSECT 
> queries. Currently such queries are ignored.
> Given the query
> {code:sql}
> select * from t1 join t2
> union
> select * from t2 join t3;{code}
> suggester should generate the same lattice(s) as if it had been given two 
> separate queries
> {code:sql}
> select * from t1 join t2;
> select * from t2 join t3; {code}
> Which may be a single lattice t1 - t2 - t3, or might be two lattices t1 - t2, 
> t2 - t3.
> Same for EXCEPT (MINUS), INTERSECT, UNION ALL, etc.
> If the set-op is internal, I'm not sure what to do, e.g.
> {code:sql}
> select *
> from sales
> join (select * from good_product
>   union
>   select * from bad_product) using (product_id){code}
>  




[jira] [Updated] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3128:

Fix Version/s: (was: 1.22.0)

> Joining two tables producing only NULLs will return 0 rows
> --
>
> Key: CALCITE-3128
> URL: https://issues.apache.org/jira/browse/CALCITE-3128
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Mohamed Mohsen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The following queries will return 0 rows while they're expected to return 
> rows with NULLs in them.
> {code:sql}
> SELECT *
> FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b
> {code}
> {code:sql}
> SELECT *
> FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), 
> (NULLIF(5, 5))) b
> {code}




[jira] [Resolved] (CALCITE-3672) Support implicit type coercion for insert and update

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved CALCITE-3672.
-
Resolution: Fixed

Fixed in 
[5e4d396|https://github.com/apache/calcite/commit/5e4d3967fc2f1c14ccf59c521054cfd4fe1e5b54]
 !

> Support implicit type coercion for insert and update
> 
>
> Key: CALCITE-3672
> URL: https://issues.apache.org/jira/browse/CALCITE-3672
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Closed] (CALCITE-3412) FLOOR(timestamp TO WEEK) gives wrong result

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen closed CALCITE-3412.
---
Resolution: Fixed

Closing because it is already fixed.

> FLOOR(timestamp TO WEEK) gives wrong result
> ---
>
> Key: CALCITE-3412
> URL: https://issues.apache.org/jira/browse/CALCITE-3412
> Project: Calcite
>  Issue Type: Bug
>  Components: avatica, core
>Reporter: huaicui
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.22.0, avatica-1.17.0
>
> Attachments: image-2019-10-15-13-33-34-896.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> FLOOR(timestamp TO WEEK) returns the wrong result:
> the week should start on Sunday or Monday, but Calcite uses Thursday as the 
> week boundary.
> Example:
> sql: select FLOOR(CAST('2017-01-28' AS TIMESTAMP) TO WEEK);
> Response:
> 2017-01-26 00:00:00.0
>  
> 2017-01-26 is a Thursday, which is not the result we expect. The expected 
> result is 2017-01-22, the first day of that week.
>  
>  
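A Thursday boundary is what one gets when days are grouped into 7-day blocks counted from the Unix epoch, because 1970-01-01 was a Thursday. Whether this is the actual cause inside Calcite or Avatica is an assumption; the report does not say. The sketch below uses only java.time, and the class and method names are hypothetical. It contrasts epoch-based flooring with flooring to an explicit week start for the date in the report.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

/** Hypothetical illustration; not code from Calcite or Avatica. */
final class WeekFloorSketch {
  /** Floors by whole 7-day blocks counted from 1970-01-01 (a Thursday). */
  static LocalDate floorByEpochWeeks(LocalDate date) {
    long epochDay = date.toEpochDay();
    return LocalDate.ofEpochDay(epochDay - Math.floorMod(epochDay, 7L));
  }

  /** Floors to the most recent given weekday (the date itself if it matches). */
  static LocalDate floorToWeekStart(LocalDate date, DayOfWeek weekStart) {
    return date.with(TemporalAdjusters.previousOrSame(weekStart));
  }
}
```

For 2017-01-28, `floorByEpochWeeks` yields 2017-01-26 (the Thursday from the report) and `floorToWeekStart(date, DayOfWeek.SUNDAY)` yields 2017-01-22.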




[jira] [Updated] (CALCITE-3659) Optimize EnumerableMergeJoin: avoid creating Linq4j.product for each matching group

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3659:

Fix Version/s: (was: 1.22.0)

> Optimize EnumerableMergeJoin: avoid creating Linq4j.product for each matching 
> group
> ---
>
> Key: CALCITE-3659
> URL: https://issues.apache.org/jira/browse/CALCITE-3659
> Project: Calcite
>  Issue Type: Improvement
>  Components: linq4j
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Assignee: Vladimir Sitnikov
>Priority: Major
>





[jira] [Updated] (CALCITE-3656) EnumerableNestedLoopJoin cost should account for cost of inner restarts

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3656:

Fix Version/s: (was: 1.22.0)

> EnumerableNestedLoopJoin cost should account for cost of inner restarts
> ---
>
> Key: CALCITE-3656
> URL: https://issues.apache.org/jira/browse/CALCITE-3656
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Vladimir Sitnikov
>Assignee: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> EnumerableNestedLoopJoin uses makeCost(rowCount, 0, 0), which does not account 
> for the effort spent in restarting the nested loops.
> For instance, select * from emp, dept where false produces 0 rows; however, 
> it still has a non-trivial execution cost.




[jira] [Commented] (CALCITE-3150) NPE in UPPER when repeated and combine with LIKE

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040649#comment-17040649
 ] 

Danny Chen commented on CALCITE-3150:
-

Is this issue going to be resolved in release 1.22?

> NPE in UPPER when repeated and combine with LIKE 
> -
>
> Key: CALCITE-3150
> URL: https://issues.apache.org/jira/browse/CALCITE-3150
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: next
>Reporter: Mickaël Sauvée
>Assignee: Feng Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>
> When a query uses the same UPPER call twice together with LIKE, the generated 
> code does not guard the UPPER call.
> If the input is null, UPPER is still called and throws an NPE.
> I used the following test (in JdbcTest.java):
>  
> {code:java}
> @Test
> public void testNPEInUpper() {
>   CalciteAssert.hr()
>       .query("select e.\"name\" from \"hr\".\"emps\" as e WHERE "
>           + "(UPPER(e.\"name\") LIKE 'B%' AND UPPER(e.\"name\") LIKE '%L')")
>       .returnsUnordered("name=Bill;");
> }
> {code}
> Then modify the data to have NULL for a name:
>  
>  
> {code:java}
> public final Employee[] emps = {
>   new Employee(100, 10, "Bill", 1, 1000),
>   new Employee(200, 20, "Eric", 8000, 500),
>   new Employee(150, 10, null, 7000, null),
>   new Employee(110, 10, "Theodore", 11500, 250),
> };
> {code}
> This generates this code:
>  
>  
> {code:java}
> public boolean moveNext() {
>   while (inputEnumerator.moveNext()) {
>     final String inp2_ =
>         ((org.apache.calcite.test.JdbcTest.Employee) inputEnumerator.current()).name;
>     final String v = org.apache.calcite.runtime.SqlFunctions.upper(inp2_);
>     if (inp2_ != null
>         && org.apache.calcite.runtime.SqlFunctions.like(v, "B%")
>         && (inp2_ != null
>             && org.apache.calcite.runtime.SqlFunctions.like(v, "%L"))) {
>       return true;
>     }
>   }
>   return false;
> }
> {code}
>  
> The variable v is computed even when inp2_ is null. My guess is that v should 
> not be inlined, or that the function 
> org.apache.calcite.runtime.SqlFunctions.upper should support null as a 
> parameter (and return null).
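The hazard described above can be reproduced outside Calcite. In this minimal sketch, plain `String.toUpperCase` stands in for `SqlFunctions.upper`, and `startsWith`/`endsWith` stand in for `LIKE 'B%'` and `LIKE '%L'`. The class and method names are hypothetical and none of this is Calcite's generated code. The eager variant computes the shared value before the null check, as in the reported output; the guarded variant computes it only after the check.

```java
import java.util.Locale;

/** Hypothetical illustration; not code generated by Calcite. */
final class GuardedUpperSketch {
  /** Shared value is computed before the null check: throws NPE for null. */
  static boolean matchesEager(String name) {
    final String v = name.toUpperCase(Locale.ROOT); // evaluated unconditionally
    return name != null && v.startsWith("B") && v.endsWith("L");
  }

  /** Shared value is computed only once the input is known to be non-null. */
  static boolean matchesGuarded(String name) {
    if (name == null) {
      return false; // UPPER(NULL) LIKE ... is unknown, so the row is filtered out
    }
    final String v = name.toUpperCase(Locale.ROOT);
    return v.startsWith("B") && v.endsWith("L");
  }
}
```

With the data from the report, `matchesGuarded` accepts "Bill", rejects "Eric", and rejects the null name without throwing.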




[jira] [Updated] (CALCITE-3782) Bitwise operator Bit_And, Bit_OR and Bit_XOR support binary and varbinary type

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3782:

Fix Version/s: (was: 1.22.0)

> Bitwise operator Bit_And, Bit_OR and Bit_XOR support binary and varbinary type
> --
>
> Key: CALCITE-3782
> URL: https://issues.apache.org/jira/browse/CALCITE-3782
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.21.0
>Reporter: hailong wang
>Assignee: hailong wang
>Priority: Major
>





[jira] [Updated] (CALCITE-2816) PsTableFunction fails in Russian locale

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-2816:

Fix Version/s: (was: 1.22.0)

> PsTableFunction fails in Russian locale
> ---
>
> Key: CALCITE-2816
> URL: https://issues.apache.org/jira/browse/CALCITE-2816
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> The issue has been present for quite a long time.
> ps output is locale-dependent, but the parsing is not:
> {noformat}
> [ERROR] testPs(org.apache.calcite.adapter.os.OsAdapterTest) Time elapsed: 
> 3.018 s <<< ERROR!
> java.lang.RuntimeException: exception while executing [select * from ps]
> at org.apache.calcite.adapter.os.OsAdapterTest.testPs(OsAdapterTest.java:155)
> Caused by: java.lang.RuntimeException: while parsing value [0,1] of field 
> [pcpu] in line [ 0 1 1 0 Ss root 0,1 0,0 4367200 5420 ?? 18янв19 106:03.36 0 
> 0 0 /sbin/launchd]
> at 
> org.apache.calcite.adapter.os.OsAdapterTest.lambda$testPs$3(OsAdapterTest.java:157)
> at 
> org.apache.calcite.adapter.os.OsAdapterTest.testPs(OsAdapterTest.java:155){noformat}
>  
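The failing value in the trace is `0,1`, a decimal written with a comma. The sketch below shows the mismatch in isolation: `Double.parseDouble` rejects the comma form, while a `DecimalFormat` configured with a comma as decimal separator accepts it. The class and method names are hypothetical, and passing the separator explicitly is a simplification made for this example; it is not a description of how PsTableFunction parses its input or how the issue should be fixed.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

/** Hypothetical illustration; not code from Calcite's OS adapter. */
final class PsNumberParseSketch {
  /** Parses a decimal number written with the given decimal separator. */
  static double parse(String text, char decimalSeparator) {
    DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
    symbols.setDecimalSeparator(decimalSeparator);
    symbols.setGroupingSeparator('\u00A0'); // keep ',' unambiguous
    DecimalFormat format = new DecimalFormat("0.###", symbols);
    ParsePosition pos = new ParsePosition(0);
    Number n = format.parse(text, pos);
    if (n == null || pos.getIndex() != text.length()) {
      throw new IllegalArgumentException("cannot parse [" + text + "]");
    }
    return n.doubleValue();
  }

  /** Returns whether locale-independent parsing accepts the text. */
  static boolean isPlainDouble(String text) {
    try {
      Double.parseDouble(text);
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```

Here `isPlainDouble("0,1")` is false, while `parse("0,1", ',')` returns a value of one tenth.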




[jira] [Commented] (CALCITE-2816) PsTableFunction fails in Russian locale

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040648#comment-17040648
 ] 

Danny Chen commented on CALCITE-2816:
-

Removed the fix version tag because there is no PR yet.


[jira] [Commented] (CALCITE-3224) New RexNode-to-Expression CodeGen Implementation

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040647#comment-17040647
 ] 

Danny Chen commented on CALCITE-3224:
-

Is this issue going to be resolved in release 1.22?

> New RexNode-to-Expression CodeGen Implementation
> 
>
> Key: CALCITE-3224
> URL: https://issues.apache.org/jira/browse/CALCITE-3224
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Feng Zhu
>Assignee: Feng Zhu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.22.0
>
> Attachments: RexNode-CodeGen.pdf, codegen.png
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> h3. *Background*
> The current RexNode-to-Expression implementation relies on BlockBuilder's 
> incorrect “optimizations” to inline unsafe operations. As illustrated in 
> CALCITE-3173, when this cooperation breaks down in some special cases, it 
> causes exceptions such as NPEs; see CALCITE-3142, CALCITE-3143 and 
> CALCITE-3150.
> Although we could fix these problems within the current implementation 
> framework with some effort, like the PR in CALCITE-3142, the logic would 
> become more and more complex. To pursue a thorough and elegant solution, we 
> implement a new one. Moreover, it also ensures correctness for non-optimized 
> code.
> h3. *Major Features*
>  * *Visitor Pattern*: Each RexNode is visited only once, bottom-up, rather 
> than being visited recursively many times with different NullAs settings.
>  * *Conditional Semantics*: Correctness is guaranteed even without 
> BlockBuilder’s “optimizations”. Each line of code generated for a RexNode is 
> null-safe.
>  * *Interface Compatibility*: The implementation only updates 
> _RexToLixTranslator_ and _RexImpTable_. Interfaces such as CallImplementor 
> remain unchanged.
> h3. *Implementation*
> For each RexNode, the visitor generally generates two declaration 
> statements, one for the value and one for the null flag. The code snippet 
> looks like:
> {code:java}
> {valueVariable} = {valueExpression}
> {isNullVariable} = {isNullExpression}
> {code}
> The visitor’s result is the variable pair (*_isNullVariable_*, 
> *_valueVariable_*).
> *Other changes:*
> (1) Reimplement the different RexCall implementations (e.g., CastImplementor, 
> BinaryImplementor, etc.) as separate files, move them into the newly created 
> package _org.apache.calcite.adapter.enumerable.rex_, and organize them in 
> RexCallImpTable.
> (2) Move some utility functions into EnumUtils.
> h3. *Example Demonstration*
> Take a simple test case as an example, in which the "commission" column is 
> nullable.
> {code:java}
> @Test public void testNPE() {
>   CalciteAssert.hr()
>     .query("select \"commission\" + 10 as s\n"
>   + "from \"hr\".\"emps\"")
>     .returns("S=1010\nS=510\nS=null\nS=260\n");
> }
> {code}
> The codegen process and the non-optimized code are shown in the figure 
> below.
> !codegen.png!
>  # When visiting *RexInputRef (commission)*, the visitor generates three 
> lines of code; the result is a pair of ParameterExpressions 
> (*_input_isNull_*, *_input_value_*).
>  # Then the visitor visits *RexLiteral (10)* and generates two lines of code. 
> The result is (*_literal_isNull_*, *_literal_value_*).
>  # After that, when visiting *RexCall(Add)*, (_*input_isNull*_, 
> _*input_value*_) and (_*literal_isNull*_, _*literal_value*_) can be used to 
> implement the logic. The visitor also generates two lines of code and returns 
> the variable pair.
> In the end, the result Expression is constructed based on 
> (_*binary_call_isNull*_, _*binary_call_value*_)
> [^RexNode-CodeGen.pdf]
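The bottom-up scheme described above can be sketched with a toy expression tree. Everything in this example is hypothetical: the class, the node factories, the variable naming, and the choice to emit exactly two declarations per node (the description mentions three lines for the input reference) are assumptions made for illustration, and none of it is the actual RexToLixTranslator or RexImpTable code. Each node is visited once, emits an isNull declaration and a value declaration, and hands the pair of variable names to its parent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the (isNull, value) pair idea; not Calcite code. */
final class NullSafeCodegenSketch {
  /** Names of the two variables declared for one visited node. */
  static final class Result {
    final String isNullVar;
    final String valueVar;

    Result(String isNullVar, String valueVar) {
      this.isNullVar = isNullVar;
      this.valueVar = valueVar;
    }
  }

  /** A node of the toy expression tree. */
  interface Node {
    Result accept(NullSafeCodegenSketch gen);
  }

  final List<String> lines = new ArrayList<>();
  private int counter = 0;

  /** Emits the null flag first, then a value that is only computed when not null. */
  Result declare(String prefix, String isNullExpr, String valueExpr) {
    int id = counter++;
    String isNullVar = prefix + "_isNull" + id;
    String valueVar = prefix + "_value" + id;
    lines.add("final boolean " + isNullVar + " = " + isNullExpr + ";");
    lines.add("final int " + valueVar + " = " + isNullVar + " ? 0 : (" + valueExpr + ");");
    return new Result(isNullVar, valueVar);
  }

  static Node input(String field) {
    return gen -> gen.declare("input", field + " == null", field + ".intValue()");
  }

  static Node literal(int value) {
    return gen -> gen.declare("literal", "false", Integer.toString(value));
  }

  static Node add(Node left, Node right) {
    return gen -> {
      Result l = left.accept(gen);   // each child is visited exactly once
      Result r = right.accept(gen);
      return gen.declare("add",
          l.isNullVar + " || " + r.isNullVar,
          l.valueVar + " + " + r.valueVar);
    };
  }
}
```

Visiting `add(input("commission"), literal(10))` emits six declarations. Because every value declaration is guarded by its own null flag, the emitted code stays null-safe even if no later optimization pass inlines or reorders anything.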




[jira] [Commented] (CALCITE-3143) Dividing NULLIF clause may cause Division by zero error

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040646#comment-17040646
 ] 

Danny Chen commented on CALCITE-3143:
-

Moved forward to CALCITE-3142 and removed the fix version of this issue.

> Dividing NULLIF clause may cause Division by zero error
> ---
>
> Key: CALCITE-3143
> URL: https://issues.apache.org/jira/browse/CALCITE-3143
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Li Xian
>Assignee: Feng Zhu
>Priority: Critical
>  Labels: pull-request-available
>
> Executing a query with a pattern like the one below (column COL_A of type 
> BigDecimal) 
> {code:java}
> select case when a < 77 and a > 0 then 99 else 88 end from
> (
>   select SUM(COL_A) / nullif(SUM(0),0) as a
>   from SOME_TABLE group by COL_B
> )
> {code}
>  
> causes the error java.lang.ArithmeticException: Division by zero.
> The generated code is shown below. The division is executed on inp2_ (which 
> is 0):
> {code:java}
> public Object current() {
>   final Object[] current = (Object[]) inputEnumerator.current();
>   final java.math.BigDecimal inp1_ = current[1] == null
>       ? (java.math.BigDecimal) null
>       : org.apache.calcite.runtime.SqlFunctions.toBigDecimal(current[1]);
>   final int inp2_ = org.apache.calcite.runtime.SqlFunctions.toInt(current[2]);
>   final boolean v0 = inp2_ != 0;
>   final java.math.BigDecimal v2 =
>       org.apache.calcite.runtime.SqlFunctions.divide(inp1_,
>           new java.math.BigDecimal(inp2_));
>   return new Object[] {
>       inp1_ != null && v0
>           && org.apache.calcite.runtime.SqlFunctions.lt(v2,
>               $L4J$C$new_java_math_BigDecimal_77_)
>           && (inp1_ != null && v0
>               && org.apache.calcite.runtime.SqlFunctions.gt(v2,
>                   $L4J$C$new_java_math_BigDecimal_0_)) ? 99 : 88,
>       current[2]};
> }
> {code}
>  
>  
> And by tracing the code generation, I found that in 
> org.apache.calcite.adapter.enumerable.RexImpTable#implementNullSemantics
> {code:java}
> case FALSE:
>   // v0 != null && v1 != null && f(v0, v1)
>   for (Ord<RexNode> operand : Ord.zip(call.getOperands())) {
>     if (translator.isNullable(operand.e)) {
>       list.add(
>           translator.translate(
>               operand.e, NullAs.IS_NOT_NULL));
>       translator = translator.setNullable(operand.e, false);
>     }
>   }
>   list.add(implementCall(translator, call, implementor, nullAs));
>   return Expressions.foldAnd(list);
> {code}
> The operand "SUM(COL_A) / nullif(SUM(0),0)" is set to nullable=false, which 
> causes this error.
> My understanding is that since operands are translated as NullAs.IS_NOT_NULL, 
> it is then safe to evaluate them as nullable=false. But in my case, the 
> evaluation with nullable=false makes "nullif(SUM(0), 0)" translate to 0, 
> which causes the problem in the division. 
> After commenting out the line below
> {code:java}
> translator = translator.setNullable(operand.e, false);{code}
> the query works. May I ask whether it is OK to comment out that line? It 
> appears to solve my problem, at least temporarily. 
>  
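The ordering problem in the generated code can be shown in isolation. In the sketch below, `BigDecimal.divide` stands in for `SqlFunctions.divide`, and a zero divisor plays the role of `nullif(SUM(0), 0)` evaluating to NULL. The class and method names are hypothetical and this is not Calcite's code. The eager variant divides before consulting the guard, as in the reported output; the guarded variant divides only once the divisor is known to be usable.

```java
import java.math.BigDecimal;
import java.math.MathContext;

/** Hypothetical illustration; not code generated by Calcite. */
final class GuardedDivideSketch {
  private static final BigDecimal LOWER = BigDecimal.ZERO;
  private static final BigDecimal UPPER = BigDecimal.valueOf(77);

  /** Divides first and checks afterwards: throws when the divisor is 0. */
  static int eager(BigDecimal sum, int divisor) {
    final boolean divisorNotNull = divisor != 0; // NULLIF(divisor, 0) IS NOT NULL
    final BigDecimal a =
        sum.divide(BigDecimal.valueOf(divisor), MathContext.DECIMAL64);
    return divisorNotNull && a.compareTo(UPPER) < 0 && a.compareTo(LOWER) > 0
        ? 99 : 88;
  }

  /** Checks first and divides only when the quotient is defined. */
  static int guarded(BigDecimal sum, int divisor) {
    if (sum == null || divisor == 0) {
      return 88; // quotient is NULL, so the WHEN condition is not true
    }
    final BigDecimal a =
        sum.divide(BigDecimal.valueOf(divisor), MathContext.DECIMAL64);
    return a.compareTo(UPPER) < 0 && a.compareTo(LOWER) > 0 ? 99 : 88;
  }
}
```

For a sum of 10 and a divisor of 0, `eager` throws `ArithmeticException`, while `guarded` takes the ELSE branch and returns 88.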




[jira] [Updated] (CALCITE-2166) Cumulative cost of RelSubset.best RelNode is increased after calling RelSubset.propagateCostImprovements() for input RelNodes

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-2166:

Fix Version/s: (was: 1.22.0)

> Cumulative cost of RelSubset.best RelNode is increased after calling 
> RelSubset.propagateCostImprovements() for input RelNodes
> -
>
> Key: CALCITE-2166
> URL: https://issues.apache.org/jira/browse/CALCITE-2166
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Vova Vysotskyi
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> After calling {{RelSubset.propagateCostImprovements()}}, the cumulative cost 
> of the {{RelSubset.best}} {{RelNode}} may increase, because its 
> non-cumulative cost increases when the best {{RelNode}} of an input changes.
> To observe this issue, add this code:
> {code:java}
> if (subset.best != null) {
>   RelOptCost bestCost = getCost(subset.best, RelMetadataQuery.instance());
>   if (!subset.bestCost.equals(bestCost)) {
>     throw new AssertionError(
>         "relSubset [" + subset.getDescription()
>             + "] has wrong best cost "
>             + subset.bestCost + ". Correct cost is " + bestCost);
>   }
> }
> {code}
> into {{VolcanoPlanner.validate()}} method (line 907).
> List of unit tests which fail with this check:
> {noformat}
> Failed tests: 
>   
> MaterializationTest.testJoinMaterializationUKFK9:1823->checkMaterialize:198->checkMaterialize:205->checkThatMaterialize:233
>  relSubset [rel#226287:Subset#8.ENUMERABLE.[]] has wrong best cost {221.5 
> rows, 128.25 cpu, 0.0 io}. Correct cost is {233.0 rows, 178.0 cpu, 0.0 io}
>   ScannableTableTest.testPFPushDownProjectFilterAggregateNested:279 relSubset 
> [rel#12950:Subset#5.ENUMERABLE.[]] has wrong best cost {63.8 rows, 62.308 
> cpu, 0.0 io}. Correct cost is {70.4 rows, 60.404 cpu, 0.0 io}
>   ScannableTableTest.testPFTableRefusesFilterCooperative:221 relSubset 
> [rel#13382:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 
> cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
>   ScannableTableTest.testProjectableFilterableCooperative:148 relSubset 
> [rel#13611:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 
> cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
>   ScannableTableTest.testProjectableFilterableNonCooperative:165 relSubset 
> [rel#13754:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 
> cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
>   FrameworksTest.testUpdate:336->executeQuery:367 relSubset 
> [rel#22533:Subset#2.ENUMERABLE.any] has wrong best cost {19.5 rows, 37.75 
> cpu, 0.0 io}. Correct cost is {22.575 rows, 52.58 cpu, 0.0 io}
> {noformat}
> For the test {{MaterializationTest.testJoinMaterializationUKFK9}} initial 
> best plan was:
> {noformat}
> EnumerableProject(empid0=[$5], empid00=[$5], deptno0=[$7]): rowcount = 15.0, 
> cumulative cost = {15.0 rows, 45.0 cpu, 0.0 io}, id = 3989
>   EnumerableJoin(subset=[rel#3988:Subset#34.ENUMERABLE.[]], condition=[=($1, 
> $7)], joinType=[inner]): rowcount = 15.0, cumulative cost = {116.0 rows, 0.0 
> cpu, 0.0 io}, id = 4797
> EnumerableFilter(subset=[rel#4274:Subset#47.ENUMERABLE.[0]], 
> condition=[=(CAST($2):VARCHAR CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary", 'Bill')]): rowcount = 1.0, cumulative cost = {1.0 
> rows, 1.0 cpu, 0.0 io}, id = 16522
>   EnumerableTableScan(subset=[rel#158:Subset#11.ENUMERABLE.[0]], 
> table=[[hr, m0]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 
> io}, id = 79
> EnumerableTableScan(subset=[rel#115:Subset#5.ENUMERABLE.[]], table=[[hr, 
> depts]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 
> io}, id = 62
> {noformat}
> Its cumulative cost is {221.5 rows, 123.75 cpu, 0.0 io}
> After applying some rules it became:
> {noformat}
> EnumerableProject(empid0=[$3], empid00=[$3], deptno0=[$0]): rowcount = 2.25, 
> cumulative cost = {2.25 rows, 6.75 cpu, 0.0 io}, id = 4012
>   EnumerableFilter(subset=[rel#4007:Subset#41.ENUMERABLE.[]], 
> condition=[=(CAST($2):VARCHAR CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary", 'Bill')]): rowcount = 2.25, cumulative cost = 
> {2.25 rows, 15.0 cpu, 0.0 io}, id = 4811
> EnumerableProject(subset=[rel#4203:Subset#61.ENUMERABLE.[]], deptno=[$7], 
> deptno0=[$1], name0=[$2], empid0=[$5]): rowcount = 15.0, cumulative cost = 
> {15.0 rows, 60.0 cpu, 0.0 io}, id = 4206
>   EnumerableJoin(subset=[rel#4204:Subset#52.ENUMERABLE.[]], 
> condition=[=($1, $7)], joinType=[inner]): rowcount

[jira] [Updated] (CALCITE-3143) Dividing NULLIF clause may cause Division by zero error

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3143:

Fix Version/s: (was: 1.22.0)


[jira] [Updated] (CALCITE-3173) RexNode Code Generation Problem

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3173:

Fix Version/s: (was: 1.22.0)

> RexNode Code Generation Problem
> ---
>
> Key: CALCITE-3173
> URL: https://issues.apache.org/jira/browse/CALCITE-3173
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Feng Zhu
>Assignee: Feng Zhu
>Priority: Critical
>  Labels: pull-request-available
> Attachments: code.png, codegen.png
>
>
> *Abstract:* Both RexImpTable and BlockBuilder have codegen issues, but 
> interestingly they work well together in most cases.
> We can illustrate the problem with a simple test case in JdbcTest, in 
> which the "commission" column is nullable.
> {code:java}
> @Test public void testNullSafeCheck() {
>     CalciteAssert.hr()
>   .query("select \"commission\" + 10 as s from \"hr\".\"emps\"")
>   .returns("S=1010\n"
>+ "S=510\n"
>+ "S=null\n"
>+ "S=260\n");
> }
> {code}
> This test case passes because the BlockBuilder is in its default 
> optimization mode. However, when we switch it to non-optimizing mode in 
> _EnumerableCalc_, the test fails with an NPE. The following picture shows the 
> differences.
> !code.png!
> *1. RexImpTable generates unsafe code*
> Before translating the RexCall (_*Add*_), RexImpTable first translates its 
> operands with (nullAs=*IS_NULL*) [1] to detect whether each is null (i.e., 
> _inp4_unboxed_). Then RexImpTable sets this operand's nullability in 
> RexToLixTranslator to false [2]. After that, the operand is translated again 
> with *NOT_POSSIBLE* [3] to get the value (i.e., _inp4_0_unboxed_). In the 
> end, the RexCall is implemented by NotNullImplementor. However, it is not 
> safe to perform operations like unboxing in the second translation phase. 
> *2. BlockBuilder optimization's semantic issue hides the NPE*
> BlockBuilder.optimize() changes the semantics of the code in this case. For 
> conditional-like clauses (if...else, ?:, etc.), InlineVariableVisitor wrongly 
> inlines variables.
> In general, the two work together in most cases. However, when some special 
> branch is triggered by a query, the problem is exposed. For example, the 
> NewExpression (_new java.math.BigDecimal_) in CALCITE-3143 breaks the inline 
> optimization phase.
>  
> *How to fix?*
> I have dug into this problem for a couple of days and tried many approaches 
> to fix it. In the process, I found a limitation in the current 
> implementation. The whole recursive framework essentially performs 
> sequential codegen, and may visit a RexNode again and again with different 
> NullAs settings.
> Due to this limitation, it is difficult to implement null-safe codegen 
> semantics with branching logic. There are also many branches for special 
> cases in the current implementation. Even if we handle each potential issue 
> as it arises, the logic will become more and more complex and harder to 
> maintain.
>  
> Therefore, I propose to reconsider this part, starting from a few initial 
> points.
>  (1) _Visitor Pattern_ (RexVisitor). In theory, a RexNode can be translated 
> into an Expression by visiting the node only once. We can implement 
> RexVisitor rather than the current recursive translation.
>  (2) The Result consists of three items (code: BuilderStatement, isNull: 
> ParameterExpression, value: Expression). So it is easy to decide how to 
> implement a RexNode according to its children.
>  
> Please correct me if I have gotten something wrong. I look forward to 
> suggestions!
>  
> [1][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1062]
> [2][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1064]
> [3][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1113]
>  
>  




[jira] [Commented] (CALCITE-3173) RexNode Code Generation Problem

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040645#comment-17040645
 ] 

Danny Chen commented on CALCITE-3173:
-

This seems to be an umbrella issue that cannot be resolved in release 1.22, 
so I removed the fix version tag.

> RexNode Code Generation Problem
> ---
>
> Key: CALCITE-3173
> URL: https://issues.apache.org/jira/browse/CALCITE-3173
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Feng Zhu
>Assignee: Feng Zhu
>Priority: Critical
>  Labels: pull-request-available
> Attachments: code.png, codegen.png
>
>
> *Abstract:* Both RexImpTable and BlockBuilder have codegen issues, but it is 
> interesting that they can work together well for most cases.
>     We can illustrate the problem with a simple test case in JdbcTest, in 
> which the "commission" column is nullable.
> {code:java}
> @Test public void testNullSafeCheck() {
>     CalciteAssert.hr()
>   .query("select \"commission\" + 10 as s from \"hr\".\"emps\"")
>   .returns("S=1010\n"
>+ "S=510\n"
>+ "S=null\n"
>+ "S=260\n");
> }
> {code}
>     This test case can pass as the BlockBuilder is in default optimization 
> mode. However, when we set it into un-optimization mode in _EnumerableCalc_, 
> this test will fail due to NPE. The following picture demonstrates their 
> differences.
> !code.png!
> *1.RexImpTable generates unsafe code*
>      Before translating the RexCall (_*Add*_), RexImpTable firstly translate 
> its operands with (nullAs=*IS_NULL*) [1] to detect whether it is null (i.e., 
> {color:#ff}_inp4_unboxed_{color}). Then RexImpTable sets this operand's 
> nullable in RexToLixTranslator as {color:#FF}false{color} [2]. After 
> that, the operand will be translated again with *NOT_POSSIBLE* [3] to get the 
> value (i.e., {color:#ff}_inp4_0_unboxed_{color}). In the end, the RexCall 
> is implemented by NotNullImplementor.However, it is not safe to conduct 
> operations like unboxing in the second translation phase. 
>  *2. BlockBuilder optimization's semantic issue hides the NPE*
>      BlockBuilder.optimize() changes the semantics of the code in this case. For 
> conditional-like clauses (if...else, ?:, etc.), InlineVariableVisitor 
> wrongly inlines variables.
>     In general, the two work together for most cases. However, when a 
> special branch is triggered by a query, the problem is exposed. For 
> example, the NewExpression (_new java.math.BigDecimal_) in CALCITE-3143 
> breaks the inline optimization phase.
>  
> *How to fix?*
>      I have dug into this problem for a couple of days and tried many 
> approaches to fix it. In the process, I found a limitation in the current 
> implementation. The whole recursive framework essentially performs 
> sequential codegen, and may visit a RexNode again and again with 
> different NullAs settings.
>     Due to this limitation, it is difficult to implement null-safe codegen 
> semantics with branching logic. There are also many branches 
> for special cases in the current implementation. Even if we handle each 
> potential issue as it arises, the logic will become more and more complex and 
> harder to maintain.
>  
> Therefore, I propose to reconsider this part, starting from a few initial 
> points.
>  (1) _Visitor Pattern_ (RexVisitor). 
> Theoretically, a RexNode can be translated into an Expression by visiting the 
> node only once. We can implement a RexVisitor rather than the current 
> recursive translation.
>  (2) The Result consists of three items (code: 
> BuilderStatement, isNull: ParameterExpression, value: Expression). So it is 
> easy to decide how to implement a RexNode according to its children.
>  
> Please correct me if I got something wrong. I look forward to suggestions!
>  
> [1][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1062]
> [2][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1064]
> [3][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1113]
>  
>  
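
The unsafe pattern described in point 1 above can be sketched in plain Java. This is only an illustration of the shape of the problem, not Calcite's actual generated code; the class and method names are hypothetical. The first method unboxes the operand before the null flag is consulted, the second unboxes only on the non-null branch.

```java
/** Sketch only: mimics the shape of the generated code, not Calcite's real output. */
final class NullSafeAddSketch {
  /** Unsafe shape: the operand is unboxed before the null flag is consulted. */
  static Integer unsafeAdd(Integer commission) {
    boolean isNull = commission == null;
    int unboxed = commission; // throws NullPointerException when commission is null
    return isNull ? null : Integer.valueOf(unboxed + 10);
  }

  /** Null-safe shape: the value is unboxed only on the non-null branch. */
  static Integer safeAdd(Integer commission) {
    boolean isNull = commission == null;
    return isNull ? null : Integer.valueOf(commission.intValue() + 10);
  }

  public static void main(String[] args) {
    System.out.println(safeAdd(1000));
    System.out.println(safeAdd(null));
  }
}
```

An optimizer that inlines `unboxed` into the non-null branch would turn the first shape into the second, which is one way the NPE can stay hidden while optimization is enabled.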



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3737) HOP Table-valued Function

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040643#comment-17040643
 ] 

Danny Chen commented on CALCITE-3737:
-

Is this issue going to be resolved in release 1.22 ?

> HOP Table-valued Function
> -
>
> Key: CALCITE-3737
> URL: https://issues.apache.org/jira/browse/CALCITE-3737
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Hopping windows place intervals of a fixed size evenly spaced across event 
> time. Most importantly, in the most common use a given event time timestamp 
> will generally fall into more than one window.
> The table-valued function Hop may produce zero, one, or multiple rows 
> corresponding to each row of input.  Hop takes four required parameters and 
> one optional parameter. All parameters are analogous to those for Tumble 
> except for hopsize, which specifies the duration between the starting points 
> (and endpoints) of the hopping windows, allowing for overlapping windows 
> (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
> {code:java}
> Hop (data , timecol , dur, hopsize)
> {code}
> The return value of Hop is a relation that includes all columns of data as 
> well as additional event time columns wstart and wend. Here is an example 
> (from https://s.apache.org/streaming-beam-sql ):
> {code:sql}
> SELECT *
>   FROM Hop (
>     data    => TABLE Bids ,
>     timecol => DESCRIPTOR ( bidtime ) ,
>     dur     => INTERVAL '10' MINUTES ,
>     hopsize => INTERVAL '5' MINUTES );
> ------------------------------------------
> | wstart | wend | bidtime | price | item |
> ------------------------------------------
> | 8:00   | 8:10 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:11    | $3    | B    |
> | 8:10   | 8:20 | 8:11    | $3    | B    |
> | 8:00   | 8:10 | 8:05    | $4    | C    |
> | 8:05   | 8:15 | 8:05    | $4    | C    |
> | 8:00   | 8:10 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:17    | $6    | F    |
> | 8:15   | 8:25 | 8:17    | $6    | F    |
> ------------------------------------------
> {code}
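
The window assignment described above can be sketched as a small Java routine. This is a minimal illustration, not the Calcite implementation: it assumes timestamps and durations are plain minute counts and that window starts are aligned to multiples of hopsize counted from zero; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: assigns a timestamp to every hopping window that contains it. */
final class HopWindowSketch {
  /**
   * Returns {wstart, wend} pairs, in ascending order, for every window with
   * wstart <= t < wend. Assumes dur > 0 and hopSize > 0, and that window
   * starts are multiples of hopSize (an assumption of this sketch).
   */
  static List<long[]> hopWindows(long t, long dur, long hopSize) {
    List<long[]> windows = new ArrayList<>();
    long latestStart = Math.floorDiv(t, hopSize) * hopSize; // last start <= t
    for (long start = latestStart; start + dur > t; start -= hopSize) {
      windows.add(0, new long[] {start, start + dur});
    }
    return windows;
  }

  /** Formats minutes since midnight as h:mm. */
  static String hhmm(long minutes) {
    return String.format("%d:%02d", minutes / 60, minutes % 60);
  }

  public static void main(String[] args) {
    long bidtime = 8 * 60 + 7; // 8:07
    for (long[] w : hopWindows(bidtime, 10, 5)) {
      System.out.println(hhmm(w[0]) + " - " + hhmm(w[1]));
    }
  }
}
```

With dur = 10 and hopsize = 5, the 8:07 bid lands in the 8:00 and 8:05 windows, as in the table above. When hopsize > dur, a timestamp that falls into a gap yields no windows at all.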




[jira] [Updated] (CALCITE-3800) FileReaderTest#testFileReaderUrlNoPath() timeout for AppVeyor test

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3800:

Fix Version/s: (was: 1.22.0)

> FileReaderTest#testFileReaderUrlNoPath() timeout for AppVeyor test
> --
>
> Key: CALCITE-3800
> URL: https://issues.apache.org/jira/browse/CALCITE-3800
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>
> The timeout is annoying, so I would disable it until we find a solution.




[jira] [Commented] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040639#comment-17040639
 ] 

Danny Chen commented on CALCITE-3716:
-

Is this issue planned to be resolved in release 1.22?

> ResultSetMetaData.getTableName should return empty string, not null, when 
> column does not map to a table
> 
>
> Key: CALCITE-3716
> URL: https://issues.apache.org/jira/browse/CALCITE-3716
> Project: Calcite
>  Issue Type: Bug
>  Components: jdbc-driver
>Affects Versions: 1.21.0
>Reporter: Julian Hyde
>Assignee: Jin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Per the [JDBC 
> spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-],
>  {{ResultSetMetaData.getTableName}} should return empty string, not null, 
> when column does not map to a table. Similarly getCatalogName, getSchemaName, 
> getColumnName.
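
The requested behavior amounts to normalizing a missing name to the empty string before it leaves the driver. A minimal sketch, assuming a hypothetical helper (the name `orEmpty` is not from the Calcite code base):

```java
/** Sketch only: hypothetical helper for metadata getters that must not return null. */
final class MetaDataNames {
  /** Returns the name unchanged, or "" when the column does not map to a table. */
  static String orEmpty(String name) {
    return name == null ? "" : name;
  }

  public static void main(String[] args) {
    System.out.println("[" + orEmpty(null) + "]");
    System.out.println("[" + orEmpty("EMPS") + "]");
  }
}
```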




[jira] [Commented] (CALCITE-3802) Calcite Elasticsearch adapter should encode the URI before send the request

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040636#comment-17040636
 ] 

Danny Chen commented on CALCITE-3802:
-

Is this issue planned to be included in 1.22?

> Calcite Elasticsearch adapter should encode the URI before send the request
> ---
>
> Key: CALCITE-3802
> URL: https://issues.apache.org/jira/browse/CALCITE-3802
> Project: Calcite
>  Issue Type: Bug
>  Components: elasticsearch-adapter
>Affects Versions: 1.21.0
>Reporter: jerryleooo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
> Attachments: image-2020-02-17-16-09-20-794.png
>
>   Original Estimate: 1h
>  Time Spent: 10m
>  Remaining Estimate: 50m
>
> In 
> [https://github.com/apache/calcite/blob/master/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/ElasticsearchTransport.java#L121]
>  when the indexName contains special characters, the request fails because 
> the Elasticsearch server returns an HTTP 400 error.
>  
> {code:java}
> val connection = DriverManager.getConnection("jdbc:calcite:") 
> val calciteConnection = connection.asInstanceOf[CalciteConnection] 
> val rootSchema = calciteConnection.getRootSchema()
> val esProperties = new util.HashMap[String, AnyRef]()
> esProperties.put("coordinates", "{'elasticsearch url': 80}") 
> rootSchema.add("es", new ElasticsearchSchemaFactory().create(rootSchema, 
> "es", esProperties))
> {code}
> !image-2020-02-17-16-09-20-794.png!
>  
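
One way to percent-encode an index name before it is placed in a request path is the multi-argument `java.net.URI` constructor, which quotes characters that are illegal in a path. The sketch below is not the fix that was applied in Calcite; the class name, the method name and the `_mapping` endpoint are assumptions made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch only: builds an encoded request path for a given index name. */
final class IndexPathSketch {
  /**
   * Returns "/<index>/_mapping" with characters that are illegal in a URI path
   * percent-encoded. Legal path characters such as '/' are left as they are.
   */
  static String mappingPath(String indexName) {
    try {
      return new URI(null, null, "/" + indexName + "/_mapping", null).getRawPath();
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot encode index name: " + indexName, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(mappingPath("my index"));
  }
}
```

`URLEncoder.encode` is not a drop-in alternative here, because it targets form data and turns a space into `+` rather than `%20`.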




[jira] [Commented] (CALCITE-3541) Avoid transformations to Enumerable nodes for custom SqlOperators

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040634#comment-17040634
 ] 

Danny Chen commented on CALCITE-3541:
-

Is this issue planned to be included in 1.22?

> Avoid transformations to Enumerable nodes for custom SqlOperators
> -
>
> Key: CALCITE-3541
> URL: https://issues.apache.org/jira/browse/CALCITE-3541
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
> Fix For: 1.22.0
>
>
> Most Enumerable converter rules apply a transformation as soon as the 
> {{RelNode}} class and its convention match those defined by the rule. 
> However, there are use cases where we would like to restrict the matches even 
> further to avoid generating unimplementable plans that will fail at runtime. 
> The most prominent example comes from extending the standard operator set 
> with new {{SqlOperator}}s that appear in filters and projections as part of 
> a row expression ({{RexNode}}). If we use the default instance of the 
> {{EnumerableCalcRule}} we might end up with a plan that will fail at runtime 
> since the new operator is not handled by the Enumerable convention. Most 
> likely there is a {{RelNode}} in another convention that can handle this new 
> operator. 
> We could avoid such undesirable transformations by allowing instantiations of 
> the Enumerable converter rules with user-defined predicates. This also means 
> adding public constructors to the rules that currently do not have one.
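
The idea of a rule instantiated with a user-defined predicate can be sketched without any Calcite types. Everything below (`CalcNode`, `CalcRule`, `ruleSupporting`) is a hypothetical stand-in and does not reflect the real rule classes or their constructors.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Sketch only: a converter-style rule that fires only when a predicate accepts the node. */
final class PredicateRuleSketch {
  /** Stand-in for a Calc node: just the operator names used by its expressions. */
  static final class CalcNode {
    final Set<String> operators;

    CalcNode(String... operators) {
      this.operators = new HashSet<>(Arrays.asList(operators));
    }
  }

  /** Stand-in for a converter rule with a public, predicate-taking constructor. */
  static final class CalcRule {
    private final Predicate<CalcNode> predicate;

    CalcRule(Predicate<CalcNode> predicate) {
      this.predicate = predicate;
    }

    boolean matches(CalcNode node) {
      return predicate.test(node);
    }
  }

  /** Builds a rule that rejects nodes using any operator outside the supported set. */
  static CalcRule ruleSupporting(Set<String> supported) {
    return new CalcRule(node -> supported.containsAll(node.operators));
  }
}
```

A node that uses a custom operator is then never converted by this rule instance, so the planner has to find an implementation in another convention instead of producing a plan that fails at runtime.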




[jira] [Updated] (CALCITE-3805) Add a new method to control the agg input prune with explicit flag

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3805:

Fix Version/s: (was: 1.22.0)

> Add a new method to control the agg input prune with explicit flag
> --
>
> Key: CALCITE-3805
> URL: https://issues.apache.org/jira/browse/CALCITE-3805
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>
> This feature is introduced by CALCITE-3763, which is cool for normal group 
> aggregations.
> But in Flink, we have window group aggregation: we invoke the normal 
> aggregate first, then construct our LogicalWindowAggregate, and the window may 
> have some attributes that reference the pruned columns.
> I thought about how I can control the pruning flexibly, but this behavior is 
> configured by the whole RelBuilder.Config. What I want is to forbid this 
> behavior only when I construct the window aggregate; I still want this 
> feature for normal aggregations.
> So, I propose to add a new method:
> {code:java}
> RelBuilder aggregate(
>   GroupKey groupKey,
>   Iterable<AggCall> aggCalls,
>   boolean pruneInputOfAggregate)
> {code}




[jira] [Resolved] (CALCITE-3771) Support of TRIM function for SPARK dialect and improvement in HIVE Dialect

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved CALCITE-3771.
-
Fix Version/s: 1.22.0
 Assignee: Danny Chen
   Resolution: Fixed

Fixed in 
[5fa4160|https://github.com/apache/calcite/commit/5fa41609cb0fe310a0a11d86319d861423850a36],
 thanks for your PR [~dhirenda.gautam] !

> Support of TRIM function for SPARK dialect and improvement in HIVE Dialect
> --
>
> Key: CALCITE-3771
> URL: https://issues.apache.org/jira/browse/CALCITE-3771
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Dhirenda Gautam
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In the current Calcite implementation, the query SELECT TRIM('ABC') is 
> translated for the SPARK dialect into SELECT TRIM(BOTH ' ' FROM 'ABC').
> But the proper query for SPARK is SELECT TRIM('ABC').
> Unparse logic for TRIM has been added to the Spark dialect to convert the 
> source TRIM query into a valid SPARK query.
>  
> Also, the HIVE/SPARK dialects do not support TRIM with two operands,
> e.g. SELECT TRIM(BOTH 'a' FROM 'ABC'). Its equivalent is REGEXP_REPLACE, 
> which is handled in the unparseTrim function.
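
The equivalence between a two-operand TRIM and a regular-expression replacement can be checked with a few lines of Java. This is a sketch of the semantics only; the pattern below is an assumption for illustration and is not necessarily the one that unparseTrim emits.

```java
import java.util.regex.Pattern;

/** Sketch only: TRIM(BOTH c FROM value) expressed as a regex replacement. */
final class TrimRewriteSketch {
  /** Removes every leading and trailing occurrence of trimChar from value. */
  static String trimBoth(String value, String trimChar) {
    String c = Pattern.quote(trimChar); // treat the trim character literally
    return value.replaceAll("^(?:" + c + ")+|(?:" + c + ")+$", "");
  }

  public static void main(String[] args) {
    System.out.println(trimBoth("aaABCaa", "a"));
  }
}
```

Leading matches are anchored with `^` and trailing ones with `$`, so occurrences in the middle of the string are left alone.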




[jira] [Updated] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread Danny Chen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3807:

Fix Version/s: 1.22.0

> checkForSatisfiedConverters() is unnecessary 
> -
>
> Key: CALCITE-3807
> URL: https://issues.apache.org/jira/browse/CALCITE-3807
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When VolcanoPlanner registers an abstract converter, it adds the converter 
> into set.abstractConverters list, then calls checkForSatisfiedConverters() to 
> see if any converter is satisfied and can be removed from the list. But for every 
> abstract converter, it always satisfies itself (changeTraitsUsingConverters() 
> returns itself). Basically the converter would be removed from the list right 
> after it's added. So this check is completely unnecessary and it slows down 
> the planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3679) Allow lambda expressions in SQL queries

2020-02-19 Thread Ritesh (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040591#comment-17040591
 ] 

Ritesh commented on CALCITE-3679:
-

It would be really nice if we could roll out this feature in the 1.22 release.

> Allow lambda expressions in SQL queries
> ---
>
> Key: CALCITE-3679
> URL: https://issues.apache.org/jira/browse/CALCITE-3679
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Ritesh
>Assignee: Ritesh
>Priority: Major
>  Labels: pull-request-available
> Attachments: [CALCITE-3679]_Basic_implementation.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> [https://teradata.github.io/presto/docs/0.167-t/functions/lambda.html]




[jira] [Commented] (CALCITE-3805) Add a new method to control the agg input prune with explicit flag

2020-02-19 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040574#comment-17040574
 ] 

Danny Chen commented on CALCITE-3805:
-

The RelBuilder.transform(UnaryOperator<RelBuilder.Config>) is good at setting the 
config options, but it creates a fresh RelBuilder and the whole stack is lost. In 
order to use it, I have to push the built node from the original builder.
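
The behavior described in this comment can be illustrated with a toy builder. The classes below are hypothetical stand-ins, not the real RelBuilder API: `transform` returns a new builder with an adjusted config and an empty stack, so the node built so far has to be pushed across by hand.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.UnaryOperator;

/** Sketch only: a toy builder whose transform() starts with an empty stack. */
final class BuilderTransformSketch {
  /** Immutable stand-in for a builder config with a single flag. */
  static final class Config {
    final boolean pruneInputOfAggregate;

    Config(boolean pruneInputOfAggregate) {
      this.pruneInputOfAggregate = pruneInputOfAggregate;
    }

    Config withPruneInputOfAggregate(boolean prune) {
      return new Config(prune);
    }
  }

  /** Stand-in for a stack-based builder; nodes are plain strings here. */
  static final class Builder {
    final Config config;
    final Deque<String> stack = new ArrayDeque<>();

    Builder(Config config) {
      this.config = config;
    }

    Builder push(String node) {
      stack.push(node);
      return this;
    }

    String build() {
      return stack.pop();
    }

    /** Returns a fresh builder with the transformed config; the stack is not copied. */
    Builder transform(UnaryOperator<Config> f) {
      return new Builder(f.apply(config));
    }
  }

  public static void main(String[] args) {
    Builder original = new Builder(new Config(true)).push("scan");
    Builder noPrune = original.transform(c -> c.withPruneInputOfAggregate(false));
    noPrune.push(original.build()); // carry the built node over manually
    System.out.println(noPrune.stack.peek() + " prune=" + noPrune.config.pruneInputOfAggregate);
  }
}
```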

> Add a new method to control the agg input prune with explicit flag
> --
>
> Key: CALCITE-3805
> URL: https://issues.apache.org/jira/browse/CALCITE-3805
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
> Fix For: 1.22.0
>
>
> This feature is introduced by CALCITE-3763, which is cool for normal group 
> aggregations.
> But in Flink, we have window group aggregation: we invoke the normal 
> aggregate first, then construct our LogicalWindowAggregate, and the window may 
> have some attributes that reference the pruned columns.
> I thought about how I can control the pruning flexibly, but this behavior is 
> configured by the whole RelBuilder.Config. What I want is to forbid this 
> behavior only when I construct the window aggregate; I still want this 
> feature for normal aggregations.
> So, I propose to add a new method:
> {code:java}
> RelBuilder aggregate(
>   GroupKey groupKey,
>   Iterable<AggCall> aggCalls,
>   boolean pruneInputOfAggregate)
> {code}




[jira] [Updated] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3807:

Labels: pull-request-available  (was: )

> checkForSatisfiedConverters() is unnecessary 
> -
>
> Key: CALCITE-3807
> URL: https://issues.apache.org/jira/browse/CALCITE-3807
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Priority: Major
>  Labels: pull-request-available
>
> When VolcanoPlanner registers an abstract converter, it adds the converter 
> into set.abstractConverters list, then calls checkForSatisfiedConverters() to 
> see if any converter is satisfied and can be removed from the list. But for every 
> abstract converter, it always satisfies itself (changeTraitsUsingConverters() 
> returns itself). Basically the converter would be removed from the list right 
> after it's added. So this check is completely unnecessary and it slows down 
> the planner.




[jira] [Created] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread Xiening Dai (Jira)

Xiening Dai created CALCITE-3807:


 Summary: checkForSatisfiedConverters() is unnecessary 
 Key: CALCITE-3807
 URL: https://issues.apache.org/jira/browse/CALCITE-3807
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


When VolcanoPlanner registers an abstract converter, it adds the converter into 
set.abstractConverters list, then calls checkForSatisfiedConverters() to see if any 
converter is satisfied and can be removed from the list. But for every abstract 
converter, it always satisfies itself (changeTraitsUsingConverters() returns 
itself). Basically the converter would be removed from the list right after 
it's added. So this check is completely unnecessary and it slows down the 
planner.




[jira] [Commented] (CALCITE-3805) Add a new method to control the agg input prune with explicit flag

2020-02-19 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040490#comment-17040490
 ] 

Julian Hyde commented on CALCITE-3805:
--

Did you notice that I added {{RelBuilder.transform(UnaryOperator)}} 
recently? It makes it easy to create a {{RelBuilder}} with a slightly different 
config.

> Add a new method to control the agg input prune with explicit flag
> --
>
> Key: CALCITE-3805
> URL: https://issues.apache.org/jira/browse/CALCITE-3805
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
> Fix For: 1.22.0
>
>
> This feature is introduced by CALCITE-3763, which is cool for normal group 
> aggregations.
> But in Flink, we have window group aggregation: we invoke the normal 
> aggregate first, then construct our LogicalWindowAggregate, and the window may 
> have some attributes that reference the pruned columns.
> I thought about how I can control the pruning flexibly, but this behavior is 
> configured by the whole RelBuilder.Config. What I want is to forbid this 
> behavior only when I construct the window aggregate; I still want this 
> feature for normal aggregations.
> So, I propose to add a new method:
> {code:java}
> RelBuilder aggregate(
>   GroupKey groupKey,
>   Iterable<AggCall> aggCalls,
>   boolean pruneInputOfAggregate)
> {code}




[jira] [Commented] (CALCITE-2348) Handling non-deterministic operator in rules

2020-02-19 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040298#comment-17040298
 ] 

Julian Hyde commented on CALCITE-2348:
--

I added some suggestions to CALCITE-3760 - namely to ensure that 
non-deterministic function calls only occur at the top of an expression in a 
Project - that I think would be useful here.

For the record, I think this PR is pretty good. It could be re-worked to be 
consistent with the 'non-deterministic always on top' rule.

I'm still not sure whether non-deterministic functions can be pushed down. But 
I am inclined to believe [~godfreyhe] that they can.

> Handling non-deterministic operator in rules
> 
>
> Key: CALCITE-2348
> URL: https://issues.apache.org/jira/browse/CALCITE-2348
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.17.0
>Reporter: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Currently, rules do not handle non-deterministic operators,
> e.g. FilterAggregateTransposeRule can't push down a non-deterministic filter 
> through an aggregate.
> {code:java}
> // rand_substr is a non-deterministic udf
> @Test public void testPushFilterPastAggWithNondeterministicFilter() {
>   final String sql = "select ename, empno, c from\n"
>       + " (select ename, empno, count(*) as c from emp group by ename, empno) t\n"
>       + " where rand_substr(ename, 1, 3) = 'Tom' and empno = 10";
>   checkPlanning(FilterAggregateTransposeRule.INSTANCE, sql);
> }{code}
>  




[jira] [Commented] (CALCITE-3760) Rewriting non-deterministic function can break query semantics

2020-02-19 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040289#comment-17040289
 ] 

Julian Hyde commented on CALCITE-3760:
--

I don't think this PR approaches the problem the right way:
* I don't yet see a need to add {{boolean isDeterministic()}} method to 
{{org.apache.calcite.schema.Function}}. We already have {{boolean 
SqlOperator.isDeterministic()}} and we have the annotation 
{{org.apache.calcite.linq4j.function.Deterministic}} to be applied to Java UDFs.
* Figuring out whether an expression is deterministic by tree-walking into the 
expression, as {{SqlUtil.isDeterminstic}} does, is the wrong approach. It will 
work on the original query, but the determinism will get broken during rewrite.
* I think the robust solution is to ensure that operators that are 
non-deterministic appear only as the topmost expression in a {{Project}}. That 
way they are computed only once, and any expression referencing them is using a 
field that has already been computed. Probably it would be up to 
{{SqlToRelConverter}} to carve up expressions so that non-deterministic 
operators are on top. And up to {{ProjectMergeRule}} and similar rules to 
ensure that they stay on top.
* The definition of 'deterministic' is ambiguous. The documentation should 
state explicitly whether functions that return the same result for the duration 
of a query (e.g. {{CURRENT_TIMESTAMP}}) or the same result for the current row 
(e.g. {{NEXT_VALUE}} of a sequence) are considered to be deterministic. Really 
we need deterministic to be an enum with 4 values, not a boolean. Maybe there's 
a fifth: a function that must be evaluated even if its result is not used.

> Rewriting non-deterministic function can break query semantics
> --
>
> Key: CALCITE-3760
> URL: https://issues.apache.org/jira/browse/CALCITE-3760
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Calcite rewrites some *SqlFunctions* during validation, but does not consider 
> whether the function is deterministic. For a non-deterministic 
> operator, the rewriting can break semantics. Additionally, there is no 
> interface for users to specify the determinism of a UDF/UDAF. 
> Say I have a non-deterministic UDF and UDAF and run SQL like the following:
> {code:java}
> select coalesce(udf(col0), 100) from foo;
> select nullif(udaf(col0), 1024) from foo;{code}
> They will be rewritten as
> {code:java}
> select case when udf(col0) is not null then udf(col0) else 100 end
> from foo;
> select case when udaf(col0)=1024 then null else udaf(col0) end
> from foo{code}
> As we can see, the non-deterministic UDF and UDAF are called multiple times 
> after rewriting. Thus the condition in the WHEN clause might NOT hold by the 
> time the result expression is evaluated.
> We need to provide an interface for users to specify the determinism of a 
> UDF/UDAF and consider whether a SqlNode is deterministic when rewriting.
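
The effect of the COALESCE rewrite on a non-deterministic function can be reproduced in plain Java. The sketch below is illustrative only: `alternating()` is a hypothetical stand-in for a non-deterministic UDF that returns a value on one call and null on the next.

```java
import java.util.function.Supplier;

/** Sketch only: evaluating a non-deterministic function once versus twice. */
final class CoalesceRewriteSketch {
  /** COALESCE(udf, fallback): the function is evaluated exactly once. */
  static Integer coalesce(Supplier<Integer> udf, int fallback) {
    Integer v = udf.get();
    return v != null ? v : Integer.valueOf(fallback);
  }

  /** CASE WHEN udf IS NOT NULL THEN udf ELSE fallback END: evaluated up to twice. */
  static Integer coalesceRewritten(Supplier<Integer> udf, int fallback) {
    return udf.get() != null ? udf.get() : Integer.valueOf(fallback);
  }

  /** Hypothetical non-deterministic function: returns 1, then null, then 1, and so on. */
  static Supplier<Integer> alternating() {
    boolean[] nextIsNonNull = {true};
    return () -> {
      boolean nonNull = nextIsNonNull[0];
      nextIsNonNull[0] = !nonNull;
      return nonNull ? Integer.valueOf(1) : null;
    };
  }

  public static void main(String[] args) {
    System.out.println(coalesce(alternating(), 100));
    System.out.println(coalesceRewritten(alternating(), 100));
  }
}
```

The rewritten form can return null even though COALESCE with a non-null fallback never should, because the second call no longer agrees with the one that was tested.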




[jira] [Closed] (CALCITE-3806) How to optimize repeated RelNode Structures ?

2020-02-19 Thread Julian Hyde (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde closed CALCITE-3806.

Resolution: Invalid

Closing as "invalid" because this is a question, not a bug or a feature 
request. It's a good question, but please ask it on the dev list.

> How to optimize repeated RelNode Structures ?
> -
>
> Key: CALCITE-3806
> URL: https://issues.apache.org/jira/browse/CALCITE-3806
> Project: Calcite
>  Issue Type: Wish
>  Components: core
>Reporter: anjali shrishrimal
>Priority: Minor
>
> Let's say input structure looks like this :
> {noformat}
> LogicalUnion(all=[true])
>   LogicalProject(EMPNO=[$0])
>     LogicalFilter(condition=[>=($0, 7369)])
>       LogicalTableScan(table=[[scott, EMP]])
>   LogicalProject(EMPNO=[$0])
>     LogicalFilter(condition=[>=($0, 7369)])
>       LogicalTableScan(table=[[scott, EMP]]){noformat}
>  
> In this case,
> {noformat}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[>=($0, 7369)])
>     LogicalTableScan(table=[[scott, EMP]]){noformat}
> is repeated. It is going to fetch same data twice.
> Can we save one fetch? Can we somehow tell 2nd input of union to make use of 
> union's 1st input. Is there any way to express that in plan?
>  
> Also,
> If the structure was like this :
> {noformat}
> LogicalUnion(all=[true])
>   LogicalProject(EMPNO=[$0])
>     LogicalFilter(condition=[>=($0, 7369)])
>       LogicalTableScan(table=[[scott, EMP]])
>   LogicalProject(EMPNO=[$0])
>     LogicalFilter(condition=[>=($0, 8000)])
>       LogicalTableScan(table=[[scott, EMP]]){noformat}
> Second part of union can perform filtering on fetched data of 1st part. (As 
> second's output is subset of first's output)
>  
> Does calcite provide such kind of optimizations ?
> If not, what are the challenges to do so?
>  
>  
> I would appreciate some comments on this. Thank you.




[jira] [Commented] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040257#comment-17040257
 ] 

Julian Hyde commented on CALCITE-3772:
--

Option (2) seems likely.

> Query returning bad results
> ---
>
> Key: CALCITE-3772
> URL: https://issues.apache.org/jira/browse/CALCITE-3772
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.21.0
> Environment: [^T1.csv]
>Reporter: Jacob Roldan
>Assignee: Feng Zhu
>Priority: Critical
> Attachments: T1.csv
>
>
> Inspired by the SQLite tests, I have a query that returns bad values.
> I've tested using CsvTest in 1.21.0 and the latest master.
>  
> {code:sql}
> SELECT a, (SELECT count(*) FROM t1 AS x WHERE x.b<t1.b) FROM t1
> WHERE (e>100)
> order by a
> {code}
>  
>  and the result is:
> {code:java}
>  104, 1
>  107, 2
>  111, 2
>  115, 3
>  121, 4 {code}
>  Testing the same query in mysql, derby and postgres, the result is:
> {code:java}
> 104 0 
> 107 1 
> 111 2 
> 115 3 
> 121 4{code}
>  I've attached the csv file I've put in 
> ??calcite/example/csv/src/test/resources/bug/T1.csv??
> and the query test in CsvTest.java
> {code:java}
> @Test public void testQuery() throws SQLException {
>  sql("bug", "SELECT a, (SELECT count(*) FROM t1 AS x WHERE x.b<t1.b) FROM t1 where e>100 order by a").ok();
> }
> {code}
> The explain plan:
> {code:java}
> EnumerableCalc(expr#0..3=[{inputs}], expr#4=[IS NULL($t3)], 
> expr#5=[0:BIGINT], expr#6=[CASE($t4, $t5, $t3)], A=[$t0], EXPR$1=[$t6])
>   EnumerableHashJoin(condition=[=($1, $2)], joinType=[left])
> EnumerableCalc(expr#0..4=[{inputs}], A=[$t0], E=[$t4])
>   EnumerableSort(sort0=[$0], dir0=[ASC])
> EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t4, 
> $t5)], proj#0..4=[{exprs}], $condition=[$t6])
>   EnumerableInterpreter
> BindableTableScan(table=[[BUG, T1]])
> EnumerableAggregate(group=[{1}], EXPR$0=[COUNT()])
>   EnumerableNestedLoopJoin(condition=[<($0, $1)], joinType=[inner])
> EnumerableCalc(expr#0..4=[{inputs}], B=[$t1])
>   EnumerableInterpreter
> BindableTableScan(table=[[BUG, T1]])
> EnumerableAggregate(group=[{4}])
>   EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t4, 
> $t5)], proj#0..4=[{exprs}], $condition=[$t6])
> EnumerableInterpreter
>   BindableTableScan(table=[[BUG, T1]])
> {code}
> The T1.csv is very simple:
> {code:java}
> A:int,B:int,C:int,D:int,E:int
> 104,100,102,101,103
> 107,105,106,108,109
> 111,112,113,114,110
> 115,118,119,116,117
> 121,124,123,122,120
> {code}
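
The expected result can be cross-checked by evaluating the correlated subquery by brute force over the attached data. This sketch assumes the subquery predicate is x.b < t1.b, as the `<($0, $1)` join condition in the plan suggests; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch only: reference evaluation of the query over the T1.csv rows. */
final class CorrelatedCountSketch {
  /** Rows of T1.csv in column order A, B, C, D, E. */
  static final int[][] T1 = {
    {104, 100, 102, 101, 103},
    {107, 105, 106, 108, 109},
    {111, 112, 113, 114, 110},
    {115, 118, 119, 116, 117},
    {121, 124, 123, 122, 120},
  };

  /** Returns {a, count of rows x with x.b < outer.b} for rows with e > 100, ordered by a. */
  static List<int[]> evaluate(int[][] rows) {
    List<int[]> result = new ArrayList<>();
    for (int[] outer : rows) {
      if (outer[4] <= 100) {
        continue; // WHERE e > 100
      }
      int count = 0;
      for (int[] x : rows) {
        if (x[1] < outer[1]) {
          count++; // correlated predicate x.b < t1.b
        }
      }
      result.add(new int[] {outer[0], count});
    }
    result.sort(Comparator.comparingInt(r -> r[0])); // ORDER BY a
    return result;
  }

  public static void main(String[] args) {
    for (int[] row : evaluate(T1)) {
      System.out.println(row[0] + " " + row[1]);
    }
  }
}
```

Under that assumption the counts come out as 0, 1, 2, 3, 4, which matches the MySQL, Derby and Postgres output quoted above rather than the Calcite output.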




[jira] [Updated] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Julian Hyde (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-3772:
-
Affects Version/s: (was: next)

> Query returning bad results
> ---
>
> Key: CALCITE-3772
> URL: https://issues.apache.org/jira/browse/CALCITE-3772
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.21.0
> Environment: [^T1.csv]
>Reporter: Jacob Roldan
>Assignee: Feng Zhu
>Priority: Critical
> Attachments: T1.csv
>
>
> Inspired by the SQLite tests, I have a query that returns bad values.
> I've tested using CsvTest in 1.21.0 and the latest master.
>  
> {code:sql}
> SELECT a, (SELECT count(*) FROM t1 AS x WHERE x.b<t1.b) FROM t1
> WHERE (e>100)
> order by a
> {code}
>  
>  and the result is:
> {code:java}
>  104, 1
>  107, 2
>  111, 2
>  115, 3
>  121, 4 {code}
>  Testing the same query in mysql, derby and postgres, the result is:
> {code:java}
> 104 0 
> 107 1 
> 111 2 
> 115 3 
> 121 4{code}
>  I've attached the csv file I've put in 
> ??calcite/example/csv/src/test/resources/bug/T1.csv??
> and the query test in CsvTest.java
> {code:java}
> @Test public void testQuery() throws SQLException {
>  sql("bug", "SELECT a, (SELECT count(*) FROM t1 AS x WHERE x.b<t1.b) FROM t1 where e>100 order by a").ok();
> }
> {code}
> The explain plan:
> {code:java}
> EnumerableCalc(expr#0..3=[{inputs}], expr#4=[IS NULL($t3)], expr#5=[0:BIGINT], expr#6=[CASE($t4, $t5, $t3)], A=[$t0], EXPR$1=[$t6])
>   EnumerableHashJoin(condition=[=($1, $2)], joinType=[left])
>     EnumerableCalc(expr#0..4=[{inputs}], A=[$t0], E=[$t4])
>       EnumerableSort(sort0=[$0], dir0=[ASC])
>         EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t4, $t5)], proj#0..4=[{exprs}], $condition=[$t6])
>           EnumerableInterpreter
>             BindableTableScan(table=[[BUG, T1]])
>     EnumerableAggregate(group=[{1}], EXPR$0=[COUNT()])
>       EnumerableNestedLoopJoin(condition=[<($0, $1)], joinType=[inner])
>         EnumerableCalc(expr#0..4=[{inputs}], B=[$t1])
>           EnumerableInterpreter
>             BindableTableScan(table=[[BUG, T1]])
>         EnumerableAggregate(group=[{4}])
>           EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t4, $t5)], proj#0..4=[{exprs}], $condition=[$t6])
>             EnumerableInterpreter
>               BindableTableScan(table=[[BUG, T1]])
> {code}
> The T1.csv is very simple:
> {code:java}
> A:int,B:int,C:int,D:int,E:int
> 104,100,102,101,103
> 107,105,106,108,109
> 111,112,113,114,110
> 115,118,119,116,117
> 121,124,123,122,120
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Feng Zhu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040230#comment-17040230
 ] 

Feng Zhu commented on CALCITE-3772:
---

I think there are two general ways to fix this issue:

(1) Avoid trimming fields when we find a correlated variable in the plan.

(2) Enable _*RelFieldTrimmer*_ to handle correlated variables when trimming 
fields. But this may duplicate some of the logic and data structures (e.g., 
maintaining maps) in _*RelDecorrelator*_.

Hi [~julianhyde], [~vlsi], sorry to bother you. Since I don't know Calcite's 
initial design for correlated subqueries, could you give me some advice?


[jira] [Commented] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Feng Zhu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040212#comment-17040212
 ] 

Feng Zhu commented on CALCITE-3772:
---

It's not safe to trim fields [1] when the plan contains correlated variables.
*Before trim:*
{code:java}
LogicalProject(A=[$0], EXPR$1=[$SCALAR_QUERY({
LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
  LogicalProject($f0=[0])
    LogicalFilter(condition=[<($1, $cor0.B)])
      LogicalTableScan(table=[[BUG, T1]])
})])
  LogicalFilter(condition=[>($4, 100)])
    LogicalTableScan(table=[[BUG, T1]])
{code}
*After trim:*
{code:java}
LogicalProject(A=[$0], EXPR$1=[$SCALAR_QUERY({
LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
  LogicalProject($f0=[0])
    LogicalFilter(condition=[<($1, $cor0.B)])
      LogicalTableScan(table=[[BUG, T1]])
})])
  LogicalFilter(condition=[>($1, 100)])
    LogicalProject(A=[$0], E=[$4])
      LogicalTableScan(table=[[BUG, T1]])
{code}
Consequently, when decorrelating the plan, it produces the incorrect RelNode for 
_*$cor0.B*_:
{code:java}
LogicalAggregate(group=[{0}])
  LogicalProject(E=[$4])
    LogicalFilter(condition=[>($4, 100)])
      LogicalTableScan(table=[[BUG, T1]])
{code}
[1] https://github.com/apache/calcite/blob/2ffd74abb665f4119ff30926f3944070d8a9d0ac/core/src/main/java/org/apache/calcite/prepare/Prepare.java#L281
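
The effect of resolving the correlated variable against the wrong column can be reproduced without Calcite. The following is only a rough sketch in plain Java: the class and method names are made up for illustration and are not Calcite APIs. It assumes the subquery predicate is x.b < t1.b, as the filter condition <($1, $cor0.B) indicates, and evaluates the count once with the outer value taken from column B and once with it taken from column E, which appears to be what the decorrelated plan groups on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mimics the correlated COUNT(*) subquery on the T1.csv rows. */
class CorrelationDemo {
  // Rows of T1.csv as {A, B, C, D, E}; they are already ordered by A.
  static final int[][] T1 = {
      {104, 100, 102, 101, 103},
      {107, 105, 106, 108, 109},
      {111, 112, 113, 114, 110},
      {115, 118, 119, 116, 117},
      {121, 124, 123, 122, 120},
  };

  /**
   * For each outer row with E > 100, counts the inner rows whose B is less
   * than the outer row's value in column index correlatedColumn.
   */
  static long[] counts(int correlatedColumn) {
    List<Long> out = new ArrayList<>();
    for (int[] outer : T1) {
      if (outer[4] <= 100) {
        continue; // outer filter: e > 100
      }
      long count = 0;
      for (int[] inner : T1) {
        if (inner[1] < outer[correlatedColumn]) {
          count++;
        }
      }
      out.add(count);
    }
    return out.stream().mapToLong(Long::longValue).toArray();
  }

  public static void main(String[] args) {
    // Correlating on B (index 1), as the query is written.
    System.out.println(Arrays.toString(counts(1))); // [0, 1, 2, 3, 4]
    // Correlating on E (index 4), as the incorrect plan effectively does.
    System.out.println(Arrays.toString(counts(4))); // [1, 2, 2, 3, 4]
  }
}
```

The first line matches the result reported for mysql, derby and postgres, and the second matches the bad result reported for Calcite, which is consistent with the analysis above.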


[jira] [Resolved] (CALCITE-3803) Enhance RexSimplify to simplify 'a>1 or (a<3 and b)' to 'a>1 or b' if column a is not nullable

2020-02-19 Thread Chunwei Lei (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunwei Lei resolved CALCITE-3803.
--
Fix Version/s: 1.22.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/2ffd74abb665f4119ff30926f3944070d8a9d0ac].
 Thank you for your review, [~kgyrtkirk] and [~hyuan] .

> Enhance RexSimplify to simplify 'a>1 or (a<3 and b)' to 'a>1 or b' if column 
> a is not nullable
> --
>
> Key: CALCITE-3803
> URL: https://issues.apache.org/jira/browse/CALCITE-3803
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For {{a>1 or (a<3 and b)}}, with short-circuit, we know {{a<=1}} if {{a>1}} 
> is false when column a is not nullable. Then {{(a<3 and b)}} can be simplified 
> to {{b}}. Thus, {{a>1 or (a<3 and b)}} is simplified to {{a>1 or b}}.
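
The role of the "not nullable" condition can be checked with a small stand-alone sketch. This is not RexSimplify code; the class and helper names below are hypothetical, and SQL's three-valued logic is modelled with a Java Boolean where null stands for UNKNOWN.

```java
import java.util.Objects;

/** Illustrative only: SQL three-valued logic, with null meaning UNKNOWN. */
class SimplifyCheck {
  static Boolean or(Boolean x, Boolean y) {
    if (Boolean.TRUE.equals(x) || Boolean.TRUE.equals(y)) {
      return true;
    }
    return (x == null || y == null) ? null : Boolean.FALSE;
  }

  static Boolean and(Boolean x, Boolean y) {
    if (Boolean.FALSE.equals(x) || Boolean.FALSE.equals(y)) {
      return false;
    }
    return (x == null || y == null) ? null : Boolean.TRUE;
  }

  // Comparisons with a NULL operand evaluate to UNKNOWN.
  static Boolean gt(Integer a, int k) {
    return a == null ? null : Boolean.valueOf(a > k);
  }

  static Boolean lt(Integer a, int k) {
    return a == null ? null : Boolean.valueOf(a < k);
  }

  /** a>1 or (a<3 and b) */
  static Boolean original(Integer a, Boolean b) {
    return or(gt(a, 1), and(lt(a, 3), b));
  }

  /** a>1 or b */
  static Boolean simplified(Integer a, Boolean b) {
    return or(gt(a, 1), b);
  }

  /** Compares both forms for a range of non-null a and every value of b. */
  static boolean equivalentForNonNullA() {
    Boolean[] bs = {Boolean.TRUE, Boolean.FALSE, null};
    for (int a = -5; a <= 5; a++) {
      for (Boolean b : bs) {
        if (!Objects.equals(original(a, b), simplified(a, b))) {
          return false;
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(equivalentForNonNullA());  // true
    System.out.println(original(null, true));     // null (UNKNOWN)
    System.out.println(simplified(null, true));   // true
  }
}
```

With a NULL value for a and b true, the original expression is UNKNOWN while the simplified one is TRUE, which is why the rewrite is restricted to non-nullable columns.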




[jira] [Created] (CALCITE-3806) How to optimize repeated RelNode Structures ?

2020-02-19 Thread anjali shrishrimal (Jira)

anjali shrishrimal created CALCITE-3806:
---

 Summary: How to optimize repeated RelNode Structures ?
 Key: CALCITE-3806
 URL: https://issues.apache.org/jira/browse/CALCITE-3806
 Project: Calcite
  Issue Type: Wish
  Components: core
Reporter: anjali shrishrimal


Let's say the input structure looks like this:
{noformat}
LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]]){noformat}

In this case,
{noformat}
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]]){noformat}
is repeated, so the same data is fetched twice.

Can we save one fetch? Can we somehow tell the union's 2nd input to reuse the 
union's 1st input? Is there any way to express that in the plan?

Also, if the structure were like this:
{noformat}
LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 8000)])
      LogicalTableScan(table=[[scott, EMP]]){noformat}

the second part of the union could filter the data already fetched by the 1st 
part (as the second's output is a subset of the first's output).
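
The idea can be sketched outside Calcite as follows. This is only an illustration in plain Java with made-up names and sample EMPNO values; it says nothing about whether or how Calcite's planner can express such sharing, which is the open question here. The first branch is evaluated once, and the second branch is derived by filtering that result, which is valid only when the second bound is at least the first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: reuse the first UNION ALL branch to compute the second. */
class SharedScanDemo {
  /**
   * Emulates "EMPNO >= lo UNION ALL EMPNO >= hi" with a single pass over the
   * table. Requires hi >= lo so that the second branch is a subset of the first.
   */
  static List<Integer> unionAllShared(List<Integer> empnos, int lo, int hi) {
    if (hi < lo) {
      throw new IllegalArgumentException("second filter must be at least as strict");
    }
    // First branch: the only place the "table" is read.
    List<Integer> first = empnos.stream()
        .filter(e -> e >= lo)
        .collect(Collectors.toList());
    // Second branch: filter the already fetched rows instead of scanning again.
    List<Integer> second = first.stream()
        .filter(e -> e >= hi)
        .collect(Collectors.toList());
    List<Integer> result = new ArrayList<>(first);
    result.addAll(second);
    return result;
  }

  public static void main(String[] args) {
    // Sample values chosen for the example, not the real scott.EMP contents.
    List<Integer> empnos = Arrays.asList(7000, 7369, 7499, 8001, 8100);
    System.out.println(unionAllShared(empnos, 7369, 8000));
    // [7369, 7499, 8001, 8100, 8001, 8100]
  }
}
```

When both bounds are equal, as in the first plan, the second branch is simply a copy of the first.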

Does Calcite provide this kind of optimization?

If not, what are the challenges in doing so?

I would appreciate some comments on this. Thank you.




[jira] [Commented] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Feng Zhu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039846#comment-17039846
 ] 

Feng Zhu commented on CALCITE-3772:
---

If we remove the predicate (e>100), the query runs well. There may be some 
issues when decorrelating the correlated-subquery. The plan generated is not 
correct.


[jira] [Assigned] (CALCITE-3772) Query returning bad results

2020-02-19 Thread Feng Zhu (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu reassigned CALCITE-3772:
-

Assignee: Feng Zhu

