[jira] [Created] (CALCITE-4222) Clarify if null elements are allowed in RelTraitSet

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4222:
--

 Summary: Clarify if null elements are allowed in RelTraitSet
 Key: CALCITE-4222
 URL: https://issues.apache.org/jira/browse/CALCITE-4222
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


Currently, one-half of the methods in {{RelTraitSet}} fails with NPE, and 
another half does have some null-checks. Then there are {{getDefault()}} and 
{{isDefault()}}

AFAIK the only way to create a traitset with null element currently is to 
implement {{org.apache.calcite.plan.RelTraitDef#getDefault}} to return {{null}} 
(which never happens in Calcite code).
However, such a traitset would likely fail with NPE in 
{{org.apache.calcite.plan.RelTraitSet#satisfies}} quite fast.

{{RelCollationTraitDef}} and {{RelDistributionTraitDef}} have non-null default 
value which might interfere with collation and distribution metadata. For 
instance, {{distribution}} metadata might return null, and then it would fail 
with NPE when replacing distribution trait.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4221) Update stale integration tests in Druid adapter

2020-09-02 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-4221:


 Summary: Update stale integration tests in Druid adapter
 Key: CALCITE-4221
 URL: https://issues.apache.org/jira/browse/CALCITE-4221
 Project: Calcite
  Issue Type: Task
  Components: druid-adapter
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
 Fix For: 1.26.0


The integration tests in Druid adapter (DruidAdapterIT, DruidAdapter2IT) have 
not run for a while since various things seem to be broken in 
[calcite-test-dataset|https://github.com/vlsi/calcite-test-dataset|http://example.com]
 that is a prerequisite for running those tests. 

The goal of this issue is to run the integration tests against a new dockerized 
[Druid project|https://github.com/zabetak/calcite-druid-dataset] updating the 
plans and properly skipping the failing tests.   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189874#comment-17189874
 ] 

Danny Chen commented on CALCITE-4215:
-

I agree

"This is interesting to know because numeric nulls should be relatively easy to 
detect with IDE."

But we can not rely on the IDE to keep good code quality. The code itself 
should be defensive somehow.

I also like the idea

"RelMetadataQuery users don't need to deal with null distributions."

It is much better if we can return a default value when we can not infer the 
metadata, especially if with such a default value, the engine have a 
deterministic behavior (it knows how to go ahead with it).

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189872#comment-17189872
 ] 

Vladimir Sitnikov commented on CALCITE-4215:


{quote}In order to avoid the NPEs, we return -1 when we can not infer the 
specific row count number.{quote}
This is interesting to know because numeric nulls should be relatively easy to 
detect with IDE.
For instance {{if (rowCount != 0)}} would result in {{Unboxing of 'rowCount' 
may produce 'NullPointerException'}} warning.

I think {{null}} for {{Double}} and {{Boolean}} metatada looks reasonable. It 


On the other hand, all the current uses of 
{{org.apache.calcite.rel.metadata.RelMetadataQuery#distribution}} would result 
in NPE if we allow {{distribution}} to return null.

That is why I'm inlined that we should use something like the following:

{code:java}
RelDistribution distribution = distributionHandler.distribution(rel, 
this);
if (distribution == null) {
  return RelDistributions.ANY;
}
return distribution;
{code}

In other words, allow handlers to return {{null}}, however, convert it to the 
default value, so {{RelMetadataQuery}} users don't need to deal with null 
distributions.

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4203) RelMdUniqueKeys should not return empty when meeting Intersect and Minus if its input has unique keys

2020-09-02 Thread Chunwei Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunwei Lei resolved CALCITE-4203.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/103430bcc76ed1a7cbab6861ecdd39814cf4ef83.

> RelMdUniqueKeys should not return empty when meeting Intersect and Minus if 
> its input has unique keys
> -
>
> Key: CALCITE-4203
> URL: https://issues.apache.org/jira/browse/CALCITE-4203
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, we get an empty unique key for Intersect and Minus if its all is 
> true. 
>  
> {code:java}
> // code placeholder
> public Set getUniqueKeys(SetOp rel, RelMetadataQuery mq,
> boolean ignoreNulls) {
>   if (!rel.all) {
> return ImmutableSet.of(
> ImmutableBitSet.range(rel.getRowType().getFieldCount()));
>   }
>   return ImmutableSet.of();
> }
> {code}
> However, from the semantic of Intersect and Minus, we can get their unique 
> keys from its input even if its all is true. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4180) Support for Elasticsearch basic authentication

2020-09-02 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved CALCITE-4180.
-
Fix Version/s: 1.26.0
 Assignee: Danny Chen
   Resolution: Fixed

Fixed in 
[e9f9261|https://github.com/apache/calcite/commit/e9f926165c035ff62942dea318c5820ed69c1124],
 thanks for the PR, [~geiguanbing] !

> Support for Elasticsearch basic authentication
> --
>
> Key: CALCITE-4180
> URL: https://issues.apache.org/jira/browse/CALCITE-4180
> Project: Calcite
>  Issue Type: Bug
>  Components: elasticsearch-adapter
>Reporter: fa
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> Properties config = new Properties();
> config.put("lex", "JAVA");
> String sql = "select * from index";
> try (Connection con = 
> DriverManager.getConnection("jdbc:calcite:model=src/main/resources/es.json", 
> config)) {
> try (Statement stmt = con.createStatement()) {
> try (ResultSet rs = stmt.executeQuery(sql)) {
> printRs(rs);
> }
> }
> }
> {code}
>  
>  
> es.json
> {code:java}
> {
>   "version": "1.0",
>   "defaultSchema": "elasticsearch",
>   "schemas": [
> {
>   "type": "custom",
>   "name": "elasticsearch",
>   "factory": 
> "org.apache.calcite.adapter.elasticsearch.ElasticsearchSchemaFactory",
>   "operand": {
> "coordinates": "{'192.168.133.104': 9200}",
> "jdbcUser": "elastic",
> "jdbcPassword": "elastic"
>   }
> }
>   ]
> }
> {code}
> and throw Exception
> {code:java}
> {
> "error":{
> "root_cause":[
> {
> "type":"security_exception",
> "reason":"missing authentication token for REST request 
> [/_alias]",
> "header":{
> "WWW-Authenticate":"Basic realm="security" 
> charset="UTF-8""
> }
> }
> ],
> "type":"security_exception",
> "reason":"missing authentication token for REST request [/_alias]",
> "header":{
> "WWW-Authenticate":"Basic realm="security" charset="UTF-8""
> }
> },
> "status":401
> }
> {code}
> Where to set Elasticsearch username/password?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Liya Fan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189838#comment-17189838
 ] 

Liya Fan commented on CALCITE-4197:
---

[~julianhyde][~danny0405] Thanks for your feedback.

Your comments sound mostly reasonable to me. 

I want to discuss this one: "Could we remove the arc labels? (0, 1)"

IMO, the arc labels contain important information. For example, given a Join, 
it helps us easily identify which input is the left side, and which one is the 
right side. 

Please note that, GraphViz has an algorithm to make the graph representations 
look compact, so it may paint the left input on the right and vice versa. 

So can we keep the arc labels?


> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4206) RelDecorrelator outputs wrong plan for correlate sort with fetch limit

2020-09-02 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189821#comment-17189821
 ] 

Danny Chen commented on CALCITE-4206:
-

I'm going to merge the PR if there are no more comments in 24 hours.

> RelDecorrelator outputs wrong plan for correlate sort with fetch limit
> --
>
> Key: CALCITE-4206
> URL: https://issues.apache.org/jira/browse/CALCITE-4206
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For query
> {code:sql}
> SELECT deptno, ename
>   FROM
> (SELECT DISTINCT deptno FROM emp) t1,
>   LATERAL (
> SELECT ename, sal
> FROM emp
> WHERE deptno = t1.deptno
> ORDER BY sal
> DESC LIMIT 3
>   )
> {code}
> The current plan after decorrelation is
> {code:xml}
> LogicalProject(DEPTNO=[$0], ENAME=[$1])
>   LogicalJoin(condition=[=($0, $3)], joinType=[inner])
> LogicalAggregate(group=[{0}])
>   LogicalProject(DEPTNO=[$7])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalSort(sort0=[$1], dir0=[DESC], fetch=[3])
>   LogicalProject(ENAME=[$1], SAL=[$5], DEPTNO=[$7])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> which is wrong, because the partition sort(on sal) changes to global.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189806#comment-17189806
 ] 

Danny Chen commented on CALCITE-4197:
-

+1 to "LogicalProject JOB=$2 SAL=$5", and we can add newline after each 
attribute item.

> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189798#comment-17189798
 ] 

Danny Chen commented on CALCITE-4215:
-

I want to share the case of row count from the Flink side, because it can 
return null, NPEs often happens from time to time. In order to avoid the NPEs, 
we return -1 when we can not infer the specific row count number.

Thus, the codes querying it needs to identify if the row count is -1 every time 
we need to handle it.

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189794#comment-17189794
 ] 

Danny Chen commented on CALCITE-4215:
-

Just like you said

"At the end of the day, if the collation is not known, it is unlikely we could 
figure it out. Then it is indistinguishable from *table is completely 
unsorted*".

It doesn't make much sense to distinguish between known and empty, if we return 
nulls, all the codes that query these metadata need to identify the difference, 
which complicates logic but gains little. From this point of view, adding 
enumerations doesn't really resolve the problem.

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhishui updated CALCITE-4211:
-
Issue Type: New Feature  (was: Bug)
  Priority: Major  (was: Critical)

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: New Feature
>Reporter: zhishui
>Assignee: zhishui
>Priority: Major
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189754#comment-17189754
 ] 

zhishui commented on CALCITE-4211:
--

yeah, thanks for your idea, I will fix it as a feature.

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Assignee: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189748#comment-17189748
 ] 

Vladimir Sitnikov commented on CALCITE-4215:


In other words: I understand how "unknown collation" differs from "known empty 
collation", however, I think that distinction won't improve our planning, and 
features like "allow collation to be null" would make the code significantly 
harder to maintain.

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189745#comment-17189745
 ] 

Vladimir Sitnikov commented on CALCITE-4215:


[~julianhyde], I've prepared a PR that makes {{Statistic}} methods to {{return 
null}} by default (especially for 
{{org.apache.calcite.schema.Statistics#UNKNOWN}} instance), and it rippled in 
quite a few of NPEs.

Some observations:
 1) RelTrait does not support null values. Both collation and distribution 
traits exist in core, and they fail with NPE when rules attempt to replace 
"collation trait with null". I don't think we want to support {{null}} trait 
values yet, do we?

2) There is code that treats 
{{org.apache.calcite.sql.validate.SqlMonotonicity}} to be an exhaustive enum. 
This change 
[https://github.com/apache/calcite/pull/2136/commits/bce22a73f3f244f82facd495ecac9c91c9bb0bb9]
 was hard to figure out. Without such a change, OVER expressions are converted 
with an extra order column.

3) There are non-trivial {{collation}} usages: 
[https://github.com/apache/calcite/pull/2136/commits/cb2e93ba8acb510303ed36e6aec1dbb8ce68bea1]

Frankly speaking, I don't think there's much difference between "I don't know 
if the table has any collation" and "I know the table has no collation". At the 
end of the day, if the collation is not known, it is unlikely we could figure 
it out. Then it is indistinguishable from "table is completely unsorted".

Tracking {{null}} value for {{SqlMonotonicity}} would be hard for both Calcite 
core (unless we go with Kotlin or checkerframework) and Calcite users.
 What if we add {{SqlMonotonicity.UNKNOWN}} enum, {{RelDistributions.UNKNOWN}}?

Of course, adding explicit {{@nullable}} annotations to Calcite would help to 
detect such NPEs, however, it won't help us to detect cases like {{if 
(monotonicity != NOT_MONOTONIC)}}.

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CALCITE-4220) In SqlToRelConverter, use RelBuilder for creating Aggregate

2020-09-02 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde reassigned CALCITE-4220:


Assignee: Julian Hyde

> In SqlToRelConverter, use RelBuilder for creating Aggregate
> ---
>
> Key: CALCITE-4220
> URL: https://issues.apache.org/jira/browse/CALCITE-4220
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>
> In {{SqlToRelConverter}} use {{RelBuilder}} for creating {{Aggregate}}. 
> Currently we call LogicalAggregate.create directly, and that misses some 
> optimizations.
> Add config option {{RelBuilder.Config.skipAggOnUniqueKey()}} to disable the 
> optimization that we don't create an {{Aggregate}} if the input is already 
> unique on the GROUP BY key. Without it, some tests in 
> {{SqlToRelConverterTest}} become trivial.
> Also, that optimization is not valid if there are multiple group sets. 
> (Because we should output multiple rows, and Project only returns one.) So 
> disable it in that case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4220) In SqlToRelConverter, use RelBuilder for creating Aggregate

2020-09-02 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-4220:


 Summary: In SqlToRelConverter, use RelBuilder for creating 
Aggregate
 Key: CALCITE-4220
 URL: https://issues.apache.org/jira/browse/CALCITE-4220
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde


In {{SqlToRelConverter}} use {{RelBuilder}} for creating {{Aggregate}}. 
Currently we call LogicalAggregate.create directly, and that misses some 
optimizations.

Add config option {{RelBuilder.Config.skipAggOnUniqueKey()}} to disable the 
optimization that we don't create an {{Aggregate}} if the input is already 
unique on the GROUP BY key. Without it, some tests in {{SqlToRelConverterTest}} 
become trivial.

Also, that optimization is not valid if there are multiple group sets. (Because 
we should output multiple rows, and Project only returns one.) So disable it in 
that case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4215:

Labels: pull-request-available  (was: )

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> ---
>
> Key: CALCITE-4215
> URL: https://issues.apache.org/jira/browse/CALCITE-4215
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4214) Make RelDataType.getSqlTypeName non-nullable

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189692#comment-17189692
 ] 

Vladimir Sitnikov commented on CALCITE-4214:


Just a note: I think I would keep the existing 
{{org.apache.calcite.rel.type.RelDataTypeImpl#getSqlTypeName}} method even 
though it is {{return null}}.

For now, I add javadoc that sub-classes must override the method (which is 
always the case for the current core)

> Make RelDataType.getSqlTypeName non-nullable
> 
>
> Key: CALCITE-4214
> URL: https://issues.apache.org/jira/browse/CALCITE-4214
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The code treated RelDataType.getSqlTypeName as non-nullable, so we need to 
> update the javadoc and deal with abstract implementation that returns null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4214) Make RelDataType.getSqlTypeName non-nullable

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4214:

Labels: pull-request-available  (was: )

> Make RelDataType.getSqlTypeName non-nullable
> 
>
> Key: CALCITE-4214
> URL: https://issues.apache.org/jira/browse/CALCITE-4214
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The code treated RelDataType.getSqlTypeName as non-nullable, so we need to 
> update the javadoc and deal with abstract implementation that returns null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4217) Unlock RelCrossType#getFieldCount()

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4217:

Labels: pull-request-available  (was: )

> Unlock RelCrossType#getFieldCount()
> ---
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently RelCrossType#getFieldCount throws an assertion error because 
> RelCrossType is not struct. 
> However, RelCrossType does return proper values for getFieldList and 
> getFieldNames, so it makes sense to unlock getFieldCount.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-2040) Create adapter for Apache Arrow

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-2040:

Labels: pull-request-available  (was: )

> Create adapter for Apache Arrow
> ---
>
> Key: CALCITE-2040
> URL: https://issues.apache.org/jira/browse/CALCITE-2040
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Create an adapter for [Apache Arrow|http://arrow.apache.org/]. This would 
> allow people to execute SQL statements, via JDBC or ODBC, on data stored in 
> Arrow in-memory format.
> Since Arrow is an in-memory format, it is not as straightforward as reading, 
> say, CSV files using the file adapter: an Arrow data set does not have a URL. 
> (Unless we use Arrow's 
> [Feather|https://blog.cloudera.com/blog/2016/03/feather-a-fast-on-disk-format-for-data-frames-for-r-and-python-powered-by-apache-arrow/]
>  format, or use an in-memory file system such as Alluxio.) So we would need 
> to devise a way of addressing Arrow data sets.
> Also, since Arrow is an extremely efficient format for processing data, it 
> would also be good to have Arrow as a calling convention. That is, 
> implementations of relational operators such as Filter, Project, Aggregate in 
> addition to just TableScan.
> Lastly, when we have an Arrow convention, if we build adapters for file 
> formats (for instance the bioinformatics formats SAM, VCF, FASTQ discussed in 
> CALCITE-2025) it would make a lot of sense to translate those formats 
> directly into Arrow (applying simple projects and filters first if 
> applicable). Those adapters would belong as a "contrib" module in the Arrow 
> project better than in Calcite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4217) Unlock RelCrossType#getFieldCount()

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4217:
---
Description: 
Currently RelCrossType#getFieldCount throws an assertion error because 
RelCrossType is not struct. 
However, RelCrossType does return proper values for getFieldList and 
getFieldNames, so it makes sense to unlock getFieldCount.

  was:
Currently RelCrossType#isStruct returns false even though the default 
implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
{{return fieldList != null}}

That results in {{getFieldList}} throwing assertion error (because it asserts 
{{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
{{fieldList != null}}

I think it should be OK if we remove {{isStruct() { return false; } }} override 
and use {{isStruct}} assertion in both {{getFieldList}} and {{getFieldNames}}


> Unlock RelCrossType#getFieldCount()
> ---
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#getFieldCount throws an assertion error because 
> RelCrossType is not struct. 
> However, RelCrossType does return proper values for getFieldList and 
> getFieldNames, so it makes sense to unlock getFieldCount.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4217) Unlock RelCrossType#getFieldCount()

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4217:
---
Summary: Unlock RelCrossType#getFieldCount()  (was: RelCrossType#isStruct 
should be true rather than false)

> Unlock RelCrossType#getFieldCount()
> ---
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189670#comment-17189670
 ] 

Vladimir Sitnikov edited comment on CALCITE-4217 at 9/2/20, 7:57 PM:
-

{quote}People who are trying to treat a cross type as a struct need to know 
that they are doing the wrong thing.{quote}

Just to double-check, my current proposal is:
1) Leave RelCrossType#isStruct as is: return false
2) Replace {{assert isStruct}} in {{RelDataTypeImpl#getFieldList}} and 
{{RelDataTypeImpl#getFieldCount}} with {{assert fieldList != null : "fieldList 
is null for " + this}}

That would make getFieldCount work for CrossType, and CrossType.isStruct() 
would still be {{false}}

WDYT?


was (Author: vladimirsitnikov):
{quote}People who are trying to treat a cross type as a struct need to know 
that they are doing the wrong thing.{quote}

Just to double-check, my current proposal is:
1) Leave RelCrossType#isStruct as is: return false
2) Replace {{assert isStruct}} in {{RelDataTypeImpl#getFieldList}} and 
{{RelDataTypeImpl#getFieldCount}} with {{assert fieldList != null : "fieldList 
is null for " + this}}

That would make getFieldCount and getFieldList work for CrossType, and 
CrossType.isStruct() would still be {{false}}

WDYT?

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189683#comment-17189683
 ] 

Vladimir Sitnikov commented on CALCITE-4217:


I fail to realize why is it important that CrossType had {{getFieldList()}}  
and {{getFieldNames()}} while it must throw assertion error from 
{{getFieldCount()}}.
It looks like a non-sense.

Just in case, currently getFieldList and getFieldNames do already work for 
CrossType. The missing piece is getFieldCount

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189674#comment-17189674
 ] 

Julian Hyde commented on CALCITE-4217:
--

People shouldn't be calling getFieldList on a CrossType. The field names are 
not unique. They do not have contiguous indexes. Do you understand what a 
CrossType is?

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189670#comment-17189670
 ] 

Vladimir Sitnikov commented on CALCITE-4217:


{quote}People who are trying to treat a cross type as a struct need to know 
that they are doing the wrong thing.{quote}

Just to double-check, my current proposal is:
1) Leave RelCrossType#isStruct as is: return false
2) Replace {{assert isStruct}} in {{RelDataTypeImpl#getFieldList}} and 
{{RelDataTypeImpl#getFieldCount}} with {{assert fieldList != null : "fieldList 
is null for " + this}}

That would make getFieldCount and getFieldList work for CrossType, and 
CrossType.isStruct() would still be {{false}}

WDYT?

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189672#comment-17189672
 ] 

Julian Hyde commented on CALCITE-4219:
--

Your mistake is trying to use a verification framework. Reality is too complex 
to model. The assumptions made in the code are valid, in the vast majority of 
cases, because the code is written by people who understand the actual 
constraints.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4218) Clarify RelMetadataQuery#getDistribution nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189669#comment-17189669
 ] 

Vladimir Sitnikov commented on CALCITE-4218:


Ok, let me see what happens if I specify {{@Nullable}} for all the 
metadata-like methods, including distribution ones. I just thought that 
{{RelDistributions.ANY}} is exactly the same as {{null}}, and we could teach 
{{RelMetadataQuery}} facade to make {{null -> ANY}} conversion, so the 
resulting value is easier to use in Java.

> Clarify RelMetadataQuery#getDistribution nullability
> 
>
> Key: CALCITE-4218
> URL: https://issues.apache.org/jira/browse/CALCITE-4218
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
> want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189667#comment-17189667
 ] 

Julian Hyde commented on CALCITE-4217:
--

Yes it will do harm. People who are trying to treat a cross type as a struct 
need to know that they are doing the wrong thing.

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189665#comment-17189665
 ] 

Vladimir Sitnikov commented on CALCITE-4217:


Well, field names and field count information is present for cross type, and 
I'm sure it won't harm if we make it available via **existing** API methods.

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189654#comment-17189654
 ] 

Vladimir Sitnikov edited comment on CALCITE-4219 at 9/2/20, 7:33 PM:
-

{quote}I propose that we return lists of nullable values, and consumers just 
know which operands may be null{quote}

I tried that and it produces a LOT of false-positives. The current code has 
extreme amount of assumptions that the list contains non-nullable values only.

{quote}And also add javadoc to say that the list is never null but, depending 
on the operator, some of the operands may be null.{quote}
That is my current intention: no modifications to the logic, but the 
documentation should clarify the nature of the contents.


was (Author: vladimirsitnikov):
{quote}I propose that we return lists of nullable values, and consumers just 
know which operands may be null{quote}

I tried that and it produces a LOT of false-positives. Current code has extreme 
amount of assumptions that the list contains non-nullable values only.

{quote}And also add javadoc to say that the list is never null but, depending 
on the operator, some of the operands may be null.{quote}
That is my current intention: no modifications to the logic, but the 
documentation should clarify the nature of the contents.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189653#comment-17189653
 ] 

Julian Hyde commented on CALCITE-4217:
--

I don't know. It's working, it's not widely used, don't mess with it. I don't 
have time to try to resolve the inconsistencies, but let's not make things 
worse.

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189654#comment-17189654
 ] 

Vladimir Sitnikov commented on CALCITE-4219:


{quote}I propose that we return lists of nullable values, and consumers just 
know which operands may be null{quote}

I tried that and it produces a LOT of false-positives. Current code has extreme 
amount of assumptions that the list contains non-nullable values only.

{quote}And also add javadoc to say that the list is never null but, depending 
on the operator, some of the operands may be null.{quote}
That is my current intention: no modifications to the logic, but the 
documentation should clarify the nature of the contents.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189651#comment-17189651
 ] 

Vladimir Sitnikov commented on CALCITE-4217:


An alternative option is to remove {{assert isStruct()}} from 
{{RelDataTypeImpl#getFieldList}} and {{getFieldCount()}}

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4218) Clarify RelMetadataQuery#getDistribution nullability

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189649#comment-17189649
 ] 

Julian Hyde commented on CALCITE-4218:
--

All {{RelMetadataQuery}} methods are allowed to return null. 
{{RelMetadataQuery}} is a facade on top of providers, and they may have 
incomplete and imperfect information.

I don't know whether CALCITE-3969 broke this.

> Clarify RelMetadataQuery#getDistribution nullability
> 
>
> Key: CALCITE-4218
> URL: https://issues.apache.org/jira/browse/CALCITE-4218
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
> want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189650#comment-17189650
 ] 

Vladimir Sitnikov commented on CALCITE-4217:


Should {{RelCrossType}} override {{getFieldNames}} and {{getFieldCount}} then?

It is strange that {{getFieldList}} works, while both {{getFieldNames}} and 
{{getFieldCount}} always throw. Do you really intend that should be like that?

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4218) Clarify RelMetadataQuery#getDistribution nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189648#comment-17189648
 ] 

Vladimir Sitnikov commented on CALCITE-4218:


I mean methods like 
{{org.apache.calcite.rel.metadata.RelMetadataQuery#getDistribution}} and 
methods like 
{{org.apache.calcite.rel.metadata.RelMdDistribution#distribution(org.apache.calcite.rel.core.TableScan,
 org.apache.calcite.rel.metadata.RelMetadataQuery)}}

I know there was a fix recently by [~rkondakov]: 
https://issues.apache.org/jira/browse/CALCITE-3969 which goes for ANY as a 
default distribution.

If you think we should keep ANY vs null distinction for getDistribution, then 
we should probably revert CALCITE-3969 or do something with it.

> Clarify RelMetadataQuery#getDistribution nullability
> 
>
> Key: CALCITE-4218
> URL: https://issues.apache.org/jira/browse/CALCITE-4218
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
> want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189646#comment-17189646
 ] 

Julian Hyde commented on CALCITE-4219:
--

I propose that we return lists of nullable values, and consumers just know 
which operands may be null. That's what we do today.

And also add javadoc to say that the list is never null but, depending on the 
operator, some of the operands may be null.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4218) Clarify RelMetadataQuery#getDistribution nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4218:
---
Summary: Clarify RelMetadataQuery#getDistribution nullability  (was: 
Clarify RelMetadataQuery#RelDistribution nullability)

> Clarify RelMetadataQuery#getDistribution nullability
> 
>
> Key: CALCITE-4218
> URL: https://issues.apache.org/jira/browse/CALCITE-4218
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
> want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189643#comment-17189643
 ] 

Vladimir Sitnikov commented on CALCITE-4219:


{quote} The other option is to accept that the rules are different for 
different operators{quote}

Which is "workaround" I mention: we could document that {{getOperandList()}} 
could have null values in the list, we could document that {{operand(int)}} 
could return {{null}} even though the method.

The methods are very popular, however, their documentation is empty (no javadoc 
at all), and it is not obvious if nulls are possible there or not.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189643#comment-17189643
 ] 

Vladimir Sitnikov edited comment on CALCITE-4219 at 9/2/20, 7:17 PM:
-

{quote} The other option is to accept that the rules are different for 
different operators{quote}

Which is "workaround" I mention: we could document that {{getOperandList()}} 
could have null values in the list, we could document that {{operand(int)}} 
could return {{null}} even though the methods look like non-nullable.

The methods are very popular, however, their documentation is empty (no javadoc 
at all), and it is not obvious if nulls are possible there or not.


was (Author: vladimirsitnikov):
{quote} The other option is to accept that the rules are different for 
different operators{quote}

Which is "workaround" I mention: we could document that {{getOperandList()}} 
could have null values in the list, we could document that {{operand(int)}} 
could return {{null}} even though the method.

The methods are very popular, however, their documentation is empty (no javadoc 
at all), and it is not obvious if nulls are possible there or not.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189641#comment-17189641
 ] 

Julian Hyde commented on CALCITE-4217:
--

No, a cross type is not a struct.

> RelCrossType#isStruct should be true rather than false
> --
>
> Key: CALCITE-4217
> URL: https://issues.apache.org/jira/browse/CALCITE-4217
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Currently RelCrossType#isStruct returns false even though the default 
> implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
> {{return fieldList != null}}
> That results in {{getFieldList}} throwing assertion error (because it asserts 
> {{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
> {{fieldList != null}}
> I think it should be OK if we remove {{isStruct() { return false; } }} 
> override and use {{isStruct}} assertion in both {{getFieldList}} and 
> {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4218) Clarify RelMetadataQuery#RelDistribution nullability

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189640#comment-17189640
 ] 

Julian Hyde commented on CALCITE-4218:
--

You can't make a class non-nullable. I guess you mean a particular method. 
Metadata methods generally return null to mean 'not known'.

> Clarify RelMetadataQuery#RelDistribution nullability
> 
>
> Key: CALCITE-4218
> URL: https://issues.apache.org/jira/browse/CALCITE-4218
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
> want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189639#comment-17189639
 ] 

Julian Hyde commented on CALCITE-4219:
--

The other option is to accept that the rules are different for different 
operators. It's not easy to model. Which is why we should not try to model it.

> Clarify SqlCall#getOperandList() nullability
> 
>
> Key: CALCITE-4219
> URL: https://issues.apache.org/jira/browse/CALCITE-4219
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getOperandList()}} is implemented by a number of different {{SqlCall}} 
> sub-classes, and the list often includes null values.
> The implementation is typically {{return ImmutableNullableList.of(...}}
> However, {{getOperandList()}} is used a lot, and the code assumes that the 
> resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.
> The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
> return non-nullable values (even though nulls are possible), and we could add 
> suppressions at the side which returns {{ImmutableNullableList}}.
> The solution might be behind the lines of adding 
> {{nonNullableOperand(int)}}-like methods that would return non-nullable 
> values and throw meaningful errors in case the value turns out to be null.
> Another option might be to ensure all the fields are non-nullable (e.g. 
> create dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4219) Clarify SqlCall#getOperandList() nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4219:
--

 Summary: Clarify SqlCall#getOperandList() nullability
 Key: CALCITE-4219
 URL: https://issues.apache.org/jira/browse/CALCITE-4219
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


{{getOperandList()}} is implemented by a number of different {{SqlCall}} 
sub-classes, and the list often includes null values.
The implementation is typically {{return ImmutableNullableList.of(...}}

However, {{getOperandList()}} is used a lot, and the code assumes that the 
resulting values are non-null. The same happens for {{SqlCall#operand(int)}}.

The workaround is to declare  {{getOperandList()}} and {{operand(int)}} to 
return non-nullable values (even though nulls are possible), and we could add 
suppressions at the side which returns {{ImmutableNullableList}}.

The solution might be behind the lines of adding 
{{nonNullableOperand(int)}}-like methods that would return non-nullable values 
and throw meaningful errors in case the value turns out to be null.

Another option might be to ensure all the fields are non-nullable (e.g. create 
dummy SqlNode for empty values).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4218) Clarify RelMetadataQuery#RelDistribution nullability

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4218:
--

 Summary: Clarify RelMetadataQuery#RelDistribution nullability
 Key: CALCITE-4218
 URL: https://issues.apache.org/jira/browse/CALCITE-4218
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


It looks like {{null}} is the same as {{RelDistributions.ANY}}, so we might 
want to make {{RelDistribution}} non-nullable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4217) RelCrossType#isStruct should be true rather than false

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4217:
--

 Summary: RelCrossType#isStruct should be true rather than false
 Key: CALCITE-4217
 URL: https://issues.apache.org/jira/browse/CALCITE-4217
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


Currently RelCrossType#isStruct returns false even though the default 
implementation in {{org.apache.calcite.rel.type.RelDataTypeImpl#isStruct}} is 
{{return fieldList != null}}

That results in {{getFieldList}} throwing assertion error (because it asserts 
{{isStruct}}), however RelCrossType#getFieldNames works OK because it asserts 
{{fieldList != null}}

I think it should be OK if we remove {{isStruct() { return false; } }} override 
and use {{isStruct}} assertion in both {{getFieldList}} and {{getFieldNames}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4216) Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4216:
---
Description: {{getFamily}} is almost always non-nullable, so we should make 
that explicit  (was: It looks like )

> Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable
> ---
>
> Key: CALCITE-4216
> URL: https://issues.apache.org/jira/browse/CALCITE-4216
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> {{getFamily}} is almost always non-nullable, so we should make that explicit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4216) Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4216:
---
Description: It looks like 

> Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable
> ---
>
> Key: CALCITE-4216
> URL: https://issues.apache.org/jira/browse/CALCITE-4216
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> It looks like 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4216) Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4216:
---
Summary: Make org.apache.calcite.rel.type.RelDataType#getFamily 
non-nullable  (was: TBD: will be updated later)

> Make org.apache.calcite.rel.type.RelDataType#getFamily non-nullable
> ---
>
> Key: CALCITE-4216
> URL: https://issues.apache.org/jira/browse/CALCITE-4216
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4216) TBD: will be updated later

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4216:
---
Description: (was: Existing code treats RelDataType#getSqlTypeName as 
non-nullable, and org.apache.calcite.rel.type.RelDataTypeImpl#getSqlTypeName 
implements the method as {{return null}}, howeve, RelDataTypeImpl is abstract.

We should make it explicit that getSqlTypeName should be non-null.
)

> TBD: will be updated later
> --
>
> Key: CALCITE-4216
> URL: https://issues.apache.org/jira/browse/CALCITE-4216
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4216) TBD: will be updated later

2020-09-02 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CALCITE-4216:
---
Summary: TBD: will be updated later  (was: Ensure 
org.apache.calcite.rel.type.RelDataType#getSqlTypeName is non-null)

> TBD: will be updated later
> --
>
> Key: CALCITE-4216
> URL: https://issues.apache.org/jira/browse/CALCITE-4216
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> Existing code treats RelDataType#getSqlTypeName as non-nullable, and 
> org.apache.calcite.rel.type.RelDataTypeImpl#getSqlTypeName implements the 
> method as {{return null}}, howeve, RelDataTypeImpl is abstract.
> We should make it explicit that getSqlTypeName should be non-null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189623#comment-17189623
 ] 

Julian Hyde commented on CALCITE-4211:
--

bq. I have an idea to resolve it if this is quite a serious issue

It would be great if you could fix this. Please also change the description so 
that it looks like a new feature.

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Assignee: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4216) Ensure org.apache.calcite.rel.type.RelDataType#getSqlTypeName is non-null

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4216:
--

 Summary: Ensure 
org.apache.calcite.rel.type.RelDataType#getSqlTypeName is non-null
 Key: CALCITE-4216
 URL: https://issues.apache.org/jira/browse/CALCITE-4216
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


Existing code treats RelDataType#getSqlTypeName as non-nullable, and 
org.apache.calcite.rel.type.RelDataTypeImpl#getSqlTypeName implements the 
method as {{return null}}, howeve, RelDataTypeImpl is abstract.

We should make it explicit that getSqlTypeName should be non-null.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4215) Ensure org.apache.calcite.schema.Statistic uses null vs emptyList appropriately

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4215:
--

 Summary: Ensure org.apache.calcite.schema.Statistic uses null vs 
emptyList appropriately
 Key: CALCITE-4215
 URL: https://issues.apache.org/jira/browse/CALCITE-4215
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


null: statistic is *not* *known*
emptyList: statistic is *known*, and the value is *empty* (e.g. no unique keys 
in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4214) Make RelDataType.getSqlTypeName non-nullable

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189586#comment-17189586
 ] 

Julian Hyde commented on CALCITE-4214:
--

Makes sense.

> Make RelDataType.getSqlTypeName non-nullable
> 
>
> Key: CALCITE-4214
> URL: https://issues.apache.org/jira/browse/CALCITE-4214
> Project: Calcite
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> The code treated RelDataType.getSqlTypeName as non-nullable, so we need to 
> update the javadoc and deal with abstract implementation that returns null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189585#comment-17189585
 ] 

Julian Hyde commented on CALCITE-4211:
--

Your bug implies that there has been a regression. There has not. I did the 
research that you should have done:
* The test method is called {{testOrderByQueryOnOrderByView()}}.
* It is in {{class MaterializedViewSubstitutionVisitorTest}}.
* The test is disabled.
* The test was moved from another class in CALCITE-3478 but was disabled before 
then.

I didn't look back any further, but I suspect that this test never worked. If 
so, this should not be a bug but a feature request.

You logged an alarming bug, and it wasted a bunch of my time. Please do better.

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Assignee: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189576#comment-17189576
 ] 

Julian Hyde commented on CALCITE-4197:
--

Looks good. How about reversing the arrows, so that they match data flow?

Since real estate is at a premium:
* Could we remove the arc labels? (0, 1)
* I think you could remove the parentheses and commas. E.g. 
"LogicalProject(JOB=[$2], SAL=[$5])" becomes "LogicalProject JOB=[$2] SAL=[$5]".
* Maybe we can remove the brackets in many cases, so it becomes "LogicalProject 
JOB=$2 SAL=$5"

> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4214) Make RelDataType.getSqlTypeName non-nullable

2020-09-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4214:
--

 Summary: Make RelDataType.getSqlTypeName non-nullable
 Key: CALCITE-4214
 URL: https://issues.apache.org/jira/browse/CALCITE-4214
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.25.0
Reporter: Vladimir Sitnikov


The code treated RelDataType.getSqlTypeName as non-nullable, so we need to 
update the javadoc and deal with abstract implementation that returns null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4199) Add nullability annotations to the methods and fields, ensure consistency with checkerframework

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189374#comment-17189374
 ] 

Julian Hyde commented on CALCITE-4199:
--

Yes, we should update the javadoc for RelDataType.getSqlTypeName(). It is not 
null, and we have been treating it as such for years.

Statistic.getKeys() should return null if the keys are not known. It should 
return an empty list if the keys are known but there aren't any. The javadoc 
"Returns a list of unique keys, or null if no key exist" is wrong and should be 
fixed. (There was a tradition in C to return null for empty collections, but 
that tradition has no place in Java.)

> Add nullability annotations to the methods and fields, ensure consistency 
> with checkerframework
> ---
>
> Key: CALCITE-4199
> URL: https://issues.apache.org/jira/browse/CALCITE-4199
> Project: Calcite
>  Issue Type: New Feature
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The current codebase uses jsr305 javax.annotation.Nullable / NonNull which 
> helps to catch bugs while developing Calcite and libraries.
> Unfortunately, the current annotation is not enforced, and it lacks support 
> for generics.
> jsr305.jar is probably abandoned (see 
> https://github.com/google/guava/issues/2960), so we should probably migrate 
> to something else.
> https://checkerframework.org/ is a solid framework for machine verification 
> which is tailored to Java.
> The releases are quite frequent: 
> https://github.com/typetools/checker-framework/releases
> Their annotations are recognized by major IDEs.
> So I see the following options:
> a) Leave the code as is
> b) Annotate (gradually?) the code with checkerframework annotations
> c) Migrate (gradually?) the code to Kotlin
> I've created a PR to verify which changes would be needed, verify if CI is 
> able to type check in a reasonable time, and so on: 
> https://github.com/apache/calcite/pull/1929
> My findings regarding checkerframework so far are:
> 0) It does detect NPEs (which were hidden in the current code), and it does 
> detect initialized files in the constructors
> 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} 
> in GitHub Actions CI
> 2) "non-nullable by default" is quite close to the current Calcite 
> conventions.
> 3) There are cases when javadoc comments says "or null", however, the code 
> reads much easier if {{@Nullable}} nullable appears in the signature
> 4) If a third-party library supplies invalid type annotations, there's a way 
> to fix that. For instance, Guava's Function is annotated as "always 
> nullable", and we can overrule that (so the nullability is taken from generic 
> signature rather than "always nullable"). The override files are placed to 
> src/main/config/checkerframework
> 5) Generic-heavy code might be challenging (they are always like that), 
> however, in the most obscure cases there's always a way to suppress the 
> warning
> 6) I've filed a Gradle improvement so it schedules recently modified files 
> first for the compilation: https://github.com/gradle/gradle/issues/14332
> ---
> h2. Option a: leave the code as is
> Typically, Calcite code does have decent documentation if {{null}} is 
> permissible, however, I often need to check the existing calls in order to 
> figure out if {{null}} is expected.
> Note: nullability does not mean just "null vs non-null" value, but it is 
> related to other common pitfalls like "failing to initialize the value in 
> constructor", "calling non-final method in the constructor"
> h2. Option b: annotate (gradually?) the code with checkerframework annotations
> Pros:
> * Checkerframework verifies one method at a time, so the resulting code is 
> easy to reason about. For instance, it requires that {{boolean 
> equals(@Nullable Object other)}} overloads must explicitly declare 
> {{@Nulable}} annotation. That might sound too verbose at first, however, it 
> helps to read the code: you don't need to analyze class/interface 
> declarations in order to figure out if nulls are expected or not.
> * Checkerframework supports nullability with generics. For instance, {{class 
> Wrapper}} means T can be used as either non-nullable or nullable. {{class 
> Wrapper}} means T is always nullable (all implementations can assume T is 
> always nullable), and {{class Wrapper}} (nonnull 
> is redundant in this case and can be omitted) means T can't be null.
> * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} 
> is non-nullable array with nullable elements, {{Object @Nullable []}} is 
> nullable array with non-nullable eleme

[jira] [Commented] (CALCITE-4198) Query contains AVG function, materialized view contains SUM and COUNT functions, and materialized recognition fails

2020-09-02 Thread zhishui (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189331#comment-17189331
 ] 

zhishui commented on CALCITE-4198:
--

I have resolve this problem in [https://github.com/apache/calcite/pull/2132 
|https://github.com/apache/calcite/pull/2132] and I add your test in 
MaterializedViewSubstitutionVisitorTest.If you think there are some problem, 
please tell me.

 

> Query contains AVG function, materialized view contains SUM and COUNT 
> functions, and materialized recognition fails
> ---
>
> Key: CALCITE-4198
> URL: https://issues.apache.org/jira/browse/CALCITE-4198
> Project: Calcite
>  Issue Type: Wish
>Reporter: xzh_dz
>Assignee: zhishui
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a semantic equivalent case, which should be able to materialize and 
> identify successfully, materialized view can express the calculation logic of 
> query,but failed. This case can be reproduced as below.
> {code:java}
> // MaterializedViewSubstitutionVisitorTest
> @Test void testAvgMvMatch() {
>   sql(
>   "select \"empid\", sum(\"salary\"), count(\"salary\") from \"emps\" 
> where \"empid\" > 10 group by \"empid\"",
>   "select \"empid\", avg(\"salary\") from \"emps\" where \"empid\" > 10 
> group by \"empid\"")
>   .ok();
> }
> {code}
> The error message:
> {code:java}
> java.lang.AssertionError: Materialized view failed to be matched by optimized 
> results:java.lang.AssertionError: Materialized view failed to be matched by 
> optimized results: at 
> org.apache.calcite.test.AbstractMaterializedViewTest.checkMaterialize(AbstractMaterializedViewTest.java:112)
>  at 
> org.apache.calcite.test.AbstractMaterializedViewTest.access$000(AbstractMaterializedViewTest.java:64)
>  at 
> org.apache.calcite.test.AbstractMaterializedViewTest$Sql.ok(AbstractMaterializedViewTest.java:227)
>  at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627) 
> at 
> org.apache.calcite.util.ImmutableBeans.lambda$makeDef$3(ImmutableBeans.java:275)
>  at 
> org.apache.calcite.util.ImmutableBeans$BeanImpl.invoke(ImmutableBeans.java:446)
>  at com.sun.proxy.$Proxy12.ok(Unknown Source) at 
> org.apache.calcite.test.MaterializedViewSubstitutionVisitorTest.testAvgMvMatch(MaterializedViewSubstitutionVisitorTest.java:101)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4198) Query contains AVG function, materialized view contains SUM and COUNT functions, and materialized recognition fails

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4198:

Labels: pull-request-available  (was: )

> Query contains AVG function, materialized view contains SUM and COUNT 
> functions, and materialized recognition fails
> ---
>
> Key: CALCITE-4198
> URL: https://issues.apache.org/jira/browse/CALCITE-4198
> Project: Calcite
>  Issue Type: Wish
>Reporter: xzh_dz
>Assignee: zhishui
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a semantic equivalent case, which should be able to materialize and 
> identify successfully, materialized view can express the calculation logic of 
> query,but failed. This case can be reproduced as below.
> {code:java}
> // MaterializedViewSubstitutionVisitorTest
> @Test void testAvgMvMatch() {
>   sql(
>   "select \"empid\", sum(\"salary\"), count(\"salary\") from \"emps\" 
> where \"empid\" > 10 group by \"empid\"",
>   "select \"empid\", avg(\"salary\") from \"emps\" where \"empid\" > 10 
> group by \"empid\"")
>   .ok();
> }
> {code}
> The error message:
> {code:java}
> java.lang.AssertionError: Materialized view failed to be matched by optimized 
> results:java.lang.AssertionError: Materialized view failed to be matched by 
> optimized results: at 
> org.apache.calcite.test.AbstractMaterializedViewTest.checkMaterialize(AbstractMaterializedViewTest.java:112)
>  at 
> org.apache.calcite.test.AbstractMaterializedViewTest.access$000(AbstractMaterializedViewTest.java:64)
>  at 
> org.apache.calcite.test.AbstractMaterializedViewTest$Sql.ok(AbstractMaterializedViewTest.java:227)
>  at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627) 
> at 
> org.apache.calcite.util.ImmutableBeans.lambda$makeDef$3(ImmutableBeans.java:275)
>  at 
> org.apache.calcite.util.ImmutableBeans$BeanImpl.invoke(ImmutableBeans.java:446)
>  at com.sun.proxy.$Proxy12.ok(Unknown Source) at 
> org.apache.calcite.test.MaterializedViewSubstitutionVisitorTest.testAvgMvMatch(MaterializedViewSubstitutionVisitorTest.java:101)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3920) Improve ORDER BY computation in Enumerable convention by exploiting LIMIT

2020-09-02 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-3920:
---
Fix Version/s: 1.26.0

> Improve ORDER BY computation in Enumerable convention by exploiting LIMIT
> -
>
> Key: CALCITE-3920
> URL: https://issues.apache.org/jira/browse/CALCITE-3920
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Stamatis Zampetakis
>Assignee: Thomas Rebele
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> There are many use-cases (pagination, top-k) relying on queries with an ORDER 
> BY clause followed by a LIMIT. 
> At the moment, the two operations are implemented independently the one from 
> the other 
>  in the Enumerable convention. Even when we know that consumer needs only the 
> top-10 results the sort operation will try to maintain its entire input 
> sorted. The complexity of the sorting operation is O( n ) space and O( nlogn 
> ) time, where n is the size of the input. 
> By implementing ORDER BY and LIMIT together there are various optimizations 
> that can be applied to reduce the space and time complexity of the sorting 
> algorithm.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4208) Improve metadata row count for Join

2020-09-02 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-4208:
---
Fix Version/s: 1.26.0

> Improve metadata row count for Join
> ---
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4199) Add nullability annotations to the methods and fields, ensure consistency with checkerframework

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4199:

Labels: pull-request-available  (was: )

> Add nullability annotations to the methods and fields, ensure consistency 
> with checkerframework
> ---
>
> Key: CALCITE-4199
> URL: https://issues.apache.org/jira/browse/CALCITE-4199
> Project: Calcite
>  Issue Type: New Feature
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current codebase uses jsr305 javax.annotation.Nullable / NonNull which 
> helps to catch bugs while developing Calcite and libraries.
> Unfortunately, the current annotation is not enforced, and it lacks support 
> for generics.
> jsr305.jar is probably abandoned (see 
> https://github.com/google/guava/issues/2960), so we should probably migrate 
> to something else.
> https://checkerframework.org/ is a solid framework for machine verification 
> which is tailored to Java.
> The releases are quite frequent: 
> https://github.com/typetools/checker-framework/releases
> Their annotations are recognized by major IDEs.
> So I see the following options:
> a) Leave the code as is
> b) Annotate (gradually?) the code with checkerframework annotations
> c) Migrate (gradually?) the code to Kotlin
> I've created a PR to verify which changes would be needed, verify if CI is 
> able to type check in a reasonable time, and so on: 
> https://github.com/apache/calcite/pull/1929
> My findings regarding checkerframework so far are:
> 0) It does detect NPEs (which were hidden in the current code), and it does 
> detect initialized files in the constructors
> 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} 
> in GitHub Actions CI
> 2) "non-nullable by default" is quite close to the current Calcite 
> conventions.
> 3) There are cases when javadoc comments says "or null", however, the code 
> reads much easier if {{@Nullable}} nullable appears in the signature
> 4) If a third-party library supplies invalid type annotations, there's a way 
> to fix that. For instance, Guava's Function is annotated as "always 
> nullable", and we can overrule that (so the nullability is taken from generic 
> signature rather than "always nullable"). The override files are placed to 
> src/main/config/checkerframework
> 5) Generic-heavy code might be challenging (they are always like that), 
> however, in the most obscure cases there's always a way to suppress the 
> warning
> 6) I've filed a Gradle improvement so it schedules recently modified files 
> first for the compilation: https://github.com/gradle/gradle/issues/14332
> ---
> h2. Option a: leave the code as is
> Typically, Calcite code does have decent documentation if {{null}} is 
> permissible, however, I often need to check the existing calls in order to 
> figure out if {{null}} is expected.
> Note: nullability does not mean just "null vs non-null" value, but it is 
> related to other common pitfalls like "failing to initialize the value in 
> constructor", "calling non-final method in the constructor"
> h2. Option b: annotate (gradually?) the code with checkerframework annotations
> Pros:
> * Checkerframework verifies one method at a time, so the resulting code is 
> easy to reason about. For instance, it requires that {{boolean 
> equals(@Nullable Object other)}} overloads must explicitly declare 
> {{@Nulable}} annotation. That might sound too verbose at first, however, it 
> helps to read the code: you don't need to analyze class/interface 
> declarations in order to figure out if nulls are expected or not.
> * Checkerframework supports nullability with generics. For instance, {{class 
> Wrapper}} means T can be used as either non-nullable or nullable. {{class 
> Wrapper}} means T is always nullable (all implementations can assume T is 
> always nullable), and {{class Wrapper}} (nonnull 
> is redundant in this case and can be omitted) means T can't be null.
> * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} 
> is non-nullable array with nullable elements, {{Object @Nullable []}} is 
> nullable array with non-nullable elements.
> * The verifier sees untested code
> * checkerframework has a coherent set of verifiers, and it is pluggable
> * checkerframework compiler understand lots of different nullability 
> annotations, so we could use jsr305 nullability annotations instead of the 
> ones from checkerframework
> * Nullability annotations might make it easier to contribute: one has to 
> reason about nulls anyway, and the annotations make it simpler.
> Cons:
> * Nullability annotations add an extra burden on de

[jira] [Updated] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Liya Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liya Fan updated CALCITE-4197:
--
Attachment: screenshot-5.png

> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Liya Fan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189151#comment-17189151
 ] 

Liya Fan commented on CALCITE-4197:
---

New screen shots after accepting [~danny0405] and [~Chunwei Lei]'s suggestions

 !screenshot-4.png! 

 !screenshot-5.png! 

> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4197) Provide utility to visualize RelNode plans

2020-09-02 Thread Liya Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liya Fan updated CALCITE-4197:
--
Attachment: screenshot-4.png

> Provide utility to visualize RelNode plans
> --
>
> Key: CALCITE-4197
> URL: https://issues.apache.org/jira/browse/CALCITE-4197
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> From time to time, we want to visualize the plan generated the optimizer as a 
> DAG (Directed Acyclic Graph). This gives us an overview of the plan, and 
> helps us to find problems with the plan quickly.
> A common way to visualize a DAG is to dump it in the {{dot}} format, and then 
> display the DAG through GraphViz. Currently, we already have a utility to 
> dump the Volcano planner in dot format. In this issue, we want to dump a 
> RelNode plan (a RelNode DAG) in the dot format, which will help users to see 
> their plans more clearly. 
> The utility should have some options that allow users to specify the 
> visuzalization details, for example:
> 1. the max length of a node description.
> 2. whether RelSubset/HepRelVertex should be displayed.
> 3. Nodes that should be highlighted. 
> Could you please give some feedback?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4213) Druid plans with small intervals should be chosen over full interval scan plus filter

2020-09-02 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189143#comment-17189143
 ] 

Stamatis Zampetakis commented on CALCITE-4213:
--

>From a quick look it seems again a result of plans having exactly the same 
>cost but one dominates the other due to the order that rules are applied. I 
>mark it as a bug and I guess as a regression since the test used to work at 
>some point. 

I assume that fixing this requires again tweaking the cost model in Druid.

> Druid plans with small intervals should be chosen over full interval scan 
> plus filter
> -
>
> Key: CALCITE-4213
> URL: https://issues.apache.org/jira/browse/CALCITE-4213
> Project: Calcite
>  Issue Type: Bug
>  Components: druid-adapter
>Reporter: Stamatis Zampetakis
>Priority: Major
>
> The problem was observed due to the failure of 
> DruidAdapterIT#testFilterTimestamp.
> {code:sql}
>  select count(*) as c
> from "foodmart"
> where extract(year from "timestamp") = 1997
> and extract(month from "timestamp") in (4, 6)
> {code}
> +Expected+
> {noformat}
> EnumerableInterpreter
>  DruidQuery(table=[[foodmart, foodmart]], 
> intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 
> 1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], 
> groups=[{}], aggs=[[COUNT()]])
> {noformat}
> +Actual+
> {noformat}
> EnumerableInterpreter
>   DruidQuery(table=[[foodmart, foodmart]], 
> intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], 
> filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 
> 4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])
> {noformat}
> Observe that the actual plan has an interval that basically touches all data 
> so in most cases it is less efficient than the expected one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4213) Druid plans with small intervals should be chosen over full interval scan plus filter

2020-09-02 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-4213:


 Summary: Druid plans with small intervals should be chosen over 
full interval scan plus filter
 Key: CALCITE-4213
 URL: https://issues.apache.org/jira/browse/CALCITE-4213
 Project: Calcite
  Issue Type: Bug
  Components: druid-adapter
Reporter: Stamatis Zampetakis


The problem was observed due to the failure of 
DruidAdapterIT#testFilterTimestamp.
{code:sql}
 select count(*) as c
from "foodmart"
where extract(year from "timestamp") = 1997
and extract(month from "timestamp") in (4, 6)
{code}
+Expected+
{noformat}
EnumerableInterpreter
 DruidQuery(table=[[foodmart, foodmart]], 
intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 
1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], 
groups=[{}], aggs=[[COUNT()]])
{noformat}
+Actual+
{noformat}
EnumerableInterpreter
  DruidQuery(table=[[foodmart, foodmart]], 
intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], 
filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 
4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])
{noformat}

Observe that the actual plan has an interval that basically touches all data so 
in most cases it is less efficient than the expected one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CALCITE-4208) Improve metadata row count for Join

2020-09-02 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reassigned CALCITE-4208:
--

Assignee: Ruben Q L

> Improve metadata row count for Join
> ---
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4207) Validation fails for positional aggregate with current_date in 'case' expression

2020-09-02 Thread Igor Guzenko (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189115#comment-17189115
 ] 

Igor Guzenko commented on CALCITE-4207:
---

[~julianhyde] I've updated the test case here and in 
[PR|https://github.com/apache/calcite/pull/2130], please take a look. 

> Validation fails for positional aggregate with current_date in 'case' 
> expression
> 
>
> Key: CALCITE-4207
> URL: https://issues.apache.org/jira/browse/CALCITE-4207
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce: *
>  Add test to SqlValidatorTest.java
> {code:java}
>   @Test void testPositionalAggregateWithExpandedCurrentDateFunction() {
> SqlConformance defaultPlusOrdinalGroupBy =
> new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
>   @Override public boolean isGroupByOrdinal() {
> return true;
>   }
> };
> sql("SELECT HIREDATE >= CURRENT_DATE, COUNT(*) "
> + "FROM EMP GROUP BY 1")
> .withConformance(defaultPlusOrdinalGroupBy)
> .ok();
>   }
> {code}
> *Expected result*
> Validation passed successfully.
> *Actual result*
> {code:none}org.apache.calcite.sql.validate.SqlValidatorException: Expression 
> 'EMP.HIREDATE' is not being grouped
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>   at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5003)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:113)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:321)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor.visit(SqlBasicVisitor.java:52)
>   at org.apache.calcite.sql.SqlNodeList.accept(SqlNodeList.java:145)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at 
> org.apache.calcite.sql.SqlAsOperator.acceptCall(SqlAsOperator.java:121)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:212)
>   
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4207) Validation fails for positional aggregate with current_date in 'case' expression

2020-09-02 Thread Igor Guzenko (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko updated CALCITE-4207:
--
Description: 
*Steps to reproduce: *
 Add test to SqlValidatorTest.java
{code:java}
  @Test void testPositionalAggregateWithExpandedCurrentDateFunction() {
SqlConformance defaultPlusOrdinalGroupBy =
new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
  @Override public boolean isGroupByOrdinal() {
return true;
  }
};
sql("SELECT HIREDATE >= CURRENT_DATE, COUNT(*) "
+ "FROM EMP GROUP BY 1")
.withConformance(defaultPlusOrdinalGroupBy)
.ok();
  }
{code}

*Expected result*
Validation passed successfully.

*Actual result*

{code:none}org.apache.calcite.sql.validate.SqlValidatorException: Expression 
'EMP.HIREDATE' is not being grouped
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5003)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:113)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:321)
at 
org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql.util.SqlBasicVisitor.visit(SqlBasicVisitor.java:52)
at org.apache.calcite.sql.SqlNodeList.accept(SqlNodeList.java:145)
at 
org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
at 
org.apache.calcite.sql.SqlAsOperator.acceptCall(SqlAsOperator.java:121)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:212)

{code}

  was:
*Steps to reproduce: *
 Add test to SqlValidatorTest.java
{code:java}
  @Test void testPositionalAggregateWithExpandedCurrentDateFunction() {
SqlConformance defaultPlusOrdinalGroupBy =
new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
  @Override
  public boolean isGroupByOrdinal() {
return true;
  }
};
sql("SELECT "
+ "CASE WHEN HIREDATE >= CURRENT_DATE "
+ "THEN 'Forward' "
+ "ELSE 'Actual' END AS fa,"
+ "COUNT(*) "
+ "FROM EMP "
+ "GROUP BY 1")
.withConformance(defaultPlusOrdinalGroupBy)
.ok();
  }
{code}

*Expected result*
Validation passed successfully.

*Actual result*

{code:none}org.apache.calcite.sql.validate.SqlValidatorException: Expression 
'EMP.HIREDATE' is not being grouped
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
at org.apach

[jira] [Created] (CALCITE-4212) Revisit cost-model to break ties between Enumerable and Bindable expressions

2020-09-02 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-4212:


 Summary: Revisit cost-model to break ties between Enumerable and 
Bindable expressions
 Key: CALCITE-4212
 URL: https://issues.apache.org/jira/browse/CALCITE-4212
 Project: Calcite
  Issue Type: Improvement
Reporter: Stamatis Zampetakis


Most Enumerable and Bindable expressions use exactly the same cost function to 
compute cost. Depending on the query this may lead to different equivalent 
(sub) plans with exactly the same cost. This makes the plans dependent on the 
order that the rules are applied. 

Let's consider for example the following query present in 
{{DruidAdapterIT#testProject}}

{code:sql}
select "product_name", 0 as zero
from "foodmart"
order by "product_name";
{code}

At some point during planning the optimizer needs to decide between the 
following plans:

+Choice 1+
{noformat}
EnumerableSort(sort0=[$0], dir0=[ASC]), id = 37
  EnumerableInterpreter(subset=[rel#23:RelSubset#1.ENUMERABLE.[]]), id = 43
DruidQuery(subset=[rel#26:RelSubset#1.BINDABLE.[]], table=[[foodmart, 
foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], 
projects=[[$3, 0]]), id = 25
{noformat}

+Choice 2+
{noformat}
EnumerableInterpreter, id = 61
  BindableSort(subset=[rel#40:RelSubset#1.BINDABLE.[0]], sort0=[$0], 
dir0=[ASC]), id = 41
DruidQuery(subset=[rel#26:RelSubset#1.BINDABLE.[]], table=[[foodmart, 
foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], 
projects=[[$3, 0]]), id = 25
{noformat}

Both choices have exactly the same cost since {{BindableSort}} and 
{{EnumerableSort}} use the same cost function ({{Sort#computeSelfCost}}, 
{{RelMdRowCount#getRowCount(Sort, RelMetadataQuery)}}).

The issue can appear with various other expressions such as Project, SetOp, 
etc. 

Although the example is taken from the Druid adapter the same can happen if 
both Bindable and Enumerable conventions are used during planning in other 
use-cases. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4199) Add nullability annotations to the methods and fields, ensure consistency with checkerframework

2020-09-02 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189089#comment-17189089
 ] 

Vladimir Sitnikov commented on CALCITE-4199:


{quote}Statistic is an SPI. Users can provide their own implementation, and we 
have to deal with it. You can't tighten the contract.{quote}
My point is the current implementation (even without changes) would fail with 
NPE.

The commit was in December: 
https://github.com/apache/calcite/commit/ed181a4671b9941f2852c581daa043d8a2d6dec2#diff-e6287df1618ee07f6944872fdb751628R232-R233

There are two possible NPEs:
1) ImmutableList.copyOf(keys); in getKeys() method
2) {{for (ImmutableBitSet key : keys)}} in getUniqueKeys method

> Add nullability annotations to the methods and fields, ensure consistency 
> with checkerframework
> ---
>
> Key: CALCITE-4199
> URL: https://issues.apache.org/jira/browse/CALCITE-4199
> Project: Calcite
>  Issue Type: New Feature
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> The current codebase uses jsr305 javax.annotation.Nullable / NonNull which 
> helps to catch bugs while developing Calcite and libraries.
> Unfortunately, the current annotation is not enforced, and it lacks support 
> for generics.
> jsr305.jar is probably abandoned (see 
> https://github.com/google/guava/issues/2960), so we should probably migrate 
> to something else.
> https://checkerframework.org/ is a solid framework for machine verification 
> which is tailored to Java.
> The releases are quite frequent: 
> https://github.com/typetools/checker-framework/releases
> Their annotations are recognized by major IDEs.
> So I see the following options:
> a) Leave the code as is
> b) Annotate (gradually?) the code with checkerframework annotations
> c) Migrate (gradually?) the code to Kotlin
> I've created a PR to verify which changes would be needed, verify if CI is 
> able to type check in a reasonable time, and so on: 
> https://github.com/apache/calcite/pull/1929
> My findings regarding checkerframework so far are:
> 0) It does detect NPEs (which were hidden in the current code), and it does 
> detect initialized files in the constructors
> 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} 
> in GitHub Actions CI
> 2) "non-nullable by default" is quite close to the current Calcite 
> conventions.
> 3) There are cases when javadoc comments says "or null", however, the code 
> reads much easier if {{@Nullable}} nullable appears in the signature
> 4) If a third-party library supplies invalid type annotations, there's a way 
> to fix that. For instance, Guava's Function is annotated as "always 
> nullable", and we can overrule that (so the nullability is taken from generic 
> signature rather than "always nullable"). The override files are placed to 
> src/main/config/checkerframework
> 5) Generic-heavy code might be challenging (they are always like that), 
> however, in the most obscure cases there's always a way to suppress the 
> warning
> 6) I've filed a Gradle improvement so it schedules recently modified files 
> first for the compilation: https://github.com/gradle/gradle/issues/14332
> ---
> h2. Option a: leave the code as is
> Typically, Calcite code does have decent documentation if {{null}} is 
> permissible, however, I often need to check the existing calls in order to 
> figure out if {{null}} is expected.
> Note: nullability does not mean just "null vs non-null" value, but it is 
> related to other common pitfalls like "failing to initialize the value in 
> constructor", "calling non-final method in the constructor"
> h2. Option b: annotate (gradually?) the code with checkerframework annotations
> Pros:
> * Checkerframework verifies one method at a time, so the resulting code is 
> easy to reason about. For instance, it requires that {{boolean 
> equals(@Nullable Object other)}} overloads must explicitly declare 
> {{@Nulable}} annotation. That might sound too verbose at first, however, it 
> helps to read the code: you don't need to analyze class/interface 
> declarations in order to figure out if nulls are expected or not.
> * Checkerframework supports nullability with generics. For instance, {{class 
> Wrapper}} means T can be used as either non-nullable or nullable. {{class 
> Wrapper}} means T is always nullable (all implementations can assume T is 
> always nullable), and {{class Wrapper}} (nonnull 
> is redundant in this case and can be omitted) means T can't be null.
> * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} 
> is non-nullable array with nullable elements, {{Object @Nullable []}} is 
> nullable array with non-nullable elements.
> * The verifier sees untested code
> * ch

[jira] [Updated] (CALCITE-4208) Improve metadata row count for Join

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4208:

Labels: pull-request-available  (was: )

> Improve metadata row count for Join
> ---
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhishui reassigned CALCITE-4211:


Assignee: zhishui

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Assignee: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189072#comment-17189072
 ] 

Julian Hyde commented on CALCITE-4211:
--

Sorry I have no time to investigate this.

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4199) Add nullability annotations to the methods and fields, ensure consistency with checkerframework

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189069#comment-17189069
 ] 

Julian Hyde commented on CALCITE-4199:
--

Please don't change anything without review.

Statistic is an SPI. Users can provide their own implementation, and we have to 
deal with it. You can't tighten the contract.

> Add nullability annotations to the methods and fields, ensure consistency 
> with checkerframework
> ---
>
> Key: CALCITE-4199
> URL: https://issues.apache.org/jira/browse/CALCITE-4199
> Project: Calcite
>  Issue Type: New Feature
>Affects Versions: 1.25.0
>Reporter: Vladimir Sitnikov
>Priority: Major
>
> The current codebase uses jsr305 javax.annotation.Nullable / NonNull which 
> helps to catch bugs while developing Calcite and libraries.
> Unfortunately, the current annotation is not enforced, and it lacks support 
> for generics.
> jsr305.jar is probably abandoned (see 
> https://github.com/google/guava/issues/2960), so we should probably migrate 
> to something else.
> https://checkerframework.org/ is a solid framework for machine verification 
> which is tailored to Java.
> The releases are quite frequent: 
> https://github.com/typetools/checker-framework/releases
> Their annotations are recognized by major IDEs.
> So I see the following options:
> a) Leave the code as is
> b) Annotate (gradually?) the code with checkerframework annotations
> c) Migrate (gradually?) the code to Kotlin
> I've created a PR to verify which changes would be needed, verify if CI is 
> able to type check in a reasonable time, and so on: 
> https://github.com/apache/calcite/pull/1929
> My findings regarding checkerframework so far are:
> 0) It does detect NPEs (which were hidden in the current code), and it does 
> detect initialized files in the constructors
> 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} 
> in GitHub Actions CI
> 2) "non-nullable by default" is quite close to the current Calcite 
> conventions.
> 3) There are cases when javadoc comments says "or null", however, the code 
> reads much easier if {{@Nullable}} nullable appears in the signature
> 4) If a third-party library supplies invalid type annotations, there's a way 
> to fix that. For instance, Guava's Function is annotated as "always 
> nullable", and we can overrule that (so the nullability is taken from generic 
> signature rather than "always nullable"). The override files are placed to 
> src/main/config/checkerframework
> 5) Generic-heavy code might be challenging (they are always like that), 
> however, in the most obscure cases there's always a way to suppress the 
> warning
> 6) I've filed a Gradle improvement so it schedules recently modified files 
> first for the compilation: https://github.com/gradle/gradle/issues/14332
> ---
> h2. Option a: leave the code as is
> Typically, Calcite code does have decent documentation if {{null}} is 
> permissible, however, I often need to check the existing calls in order to 
> figure out if {{null}} is expected.
> Note: nullability does not mean just "null vs non-null" value, but it is 
> related to other common pitfalls like "failing to initialize the value in 
> constructor", "calling non-final method in the constructor"
> h2. Option b: annotate (gradually?) the code with checkerframework annotations
> Pros:
> * Checkerframework verifies one method at a time, so the resulting code is 
> easy to reason about. For instance, it requires that {{boolean 
> equals(@Nullable Object other)}} overloads must explicitly declare 
> {{@Nulable}} annotation. That might sound too verbose at first, however, it 
> helps to read the code: you don't need to analyze class/interface 
> declarations in order to figure out if nulls are expected or not.
> * Checkerframework supports nullability with generics. For instance, {{class 
> Wrapper}} means T can be used as either non-nullable or nullable. {{class 
> Wrapper}} means T is always nullable (all implementations can assume T is 
> always nullable), and {{class Wrapper}} (nonnull 
> is redundant in this case and can be omitted) means T can't be null.
> * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} 
> is non-nullable array with nullable elements, {{Object @Nullable []}} is 
> nullable array with non-nullable elements.
> * The verifier sees untested code
> * checkerframework has a coherent set of verifiers, and it is pluggable
> * checkerframework compiler understand lots of different nullability 
> annotations, so we could use jsr305 nullability annotations instead of the 
> ones from checkerframework
> * Nullability annotations might make it easier to contribute: one has to 
> reason about nulls anyway, and the annotati

[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189065#comment-17189065
 ] 

zhishui commented on CALCITE-4211:
--

I hope to get your confirm and I have an idea to resolve it if this is quite a 
serious issue

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189062#comment-17189062
 ] 

zhishui commented on CALCITE-4211:
--

when I run this test, it's failing, I this there is no rule about MutaralSort 

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4207) Validation fails for positional aggregate with current_date in 'case' expression

2020-09-02 Thread Igor Guzenko (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189063#comment-17189063
 ] 

Igor Guzenko commented on CALCITE-4207:
---

In one place CURRENT_DATE was represented as SqlBasicCall while in other it was 
instance of SqlIdentifier, that's why the keyword is necessary for the test. 

> Validation fails for positional aggregate with current_date in 'case' 
> expression
> 
>
> Key: CALCITE-4207
> URL: https://issues.apache.org/jira/browse/CALCITE-4207
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce: *
>  Add test to SqlValidatorTest.java
> {code:java}
>   @Test void testPositionalAggregateWithExpandedCurrentDateFunction() {
> SqlConformance defaultPlusOrdinalGroupBy =
> new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
>   @Override
>   public boolean isGroupByOrdinal() {
> return true;
>   }
> };
> sql("SELECT "
> + "CASE WHEN HIREDATE >= CURRENT_DATE "
> + "THEN 'Forward' "
> + "ELSE 'Actual' END AS fa,"
> + "COUNT(*) "
> + "FROM EMP "
> + "GROUP BY 1")
> .withConformance(defaultPlusOrdinalGroupBy)
> .ok();
>   }
> {code}
> *Expected result*
> Validation passed successfully.
> *Actual result*
> {code:none}org.apache.calcite.sql.validate.SqlValidatorException: Expression 
> 'EMP.HIREDATE' is not being grouped
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>   at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5003)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:113)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:321)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor.visit(SqlBasicVisitor.java:52)
>   at org.apache.calcite.sql.SqlNodeList.accept(SqlNodeList.java:145)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at 
> org.apache.calcite.sql.SqlAsOperator.acceptCall(SqlAsOperator.java:121)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:212)
>   
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4209) RelBuilder: ability to avoid generating empty() on limit 0

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189061#comment-17189061
 ] 

Julian Hyde commented on CALCITE-4209:
--

See comments in {{RelBuilder.empty()}}. We thought of this case.

> RelBuilder: ability to avoid generating empty() on limit 0
> --
>
> Key: CALCITE-4209
> URL: https://issues.apache.org/jira/browse/CALCITE-4209
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Steven Talbot
>Priority: Major
>
> When you call
> {code:java}
> relBuilder.limit(0, 0){code}
> It triggers a call to RelBuilder.empty at 
> [https://github.com/apache/calcite/blob/88d18185e6177c9df587bdd23dd4049f59adc2e4/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2531].
> This is fine as the default behavior, but for cases where we intend to 
> convert the relational algebra to SQL and display it somewhere, it would be 
> preferable to turn this behavior off, either as an extra flag to this method 
> or a configuration parameter. The call to empty() often results in a VALUES 
> or a select list of literal NULLs, which looks ugly and can confuse a user.
> Moreover, there are certain databases (like BigQuery) that can cheaply handle 
> a LIMIT 0 query as a form of validation, and if we munge the query with 
> 'empty()' we lose the ability to validate its correctness by going to the DB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189058#comment-17189058
 ] 

Julian Hyde commented on CALCITE-4211:
--

Is CI failing?

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4207) Validation fails for positional aggregate with current_date in 'case' expression

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189053#comment-17189053
 ] 

Julian Hyde commented on CALCITE-4207:
--

Can you devise a minimal test case? I doubt that {{CURRENT_DATE}} is essential 
to reproduce this problem.

> Validation fails for positional aggregate with current_date in 'case' 
> expression
> 
>
> Key: CALCITE-4207
> URL: https://issues.apache.org/jira/browse/CALCITE-4207
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce: *
>  Add test to SqlValidatorTest.java
> {code:java}
>   @Test void testPositionalAggregateWithExpandedCurrentDateFunction() {
> SqlConformance defaultPlusOrdinalGroupBy =
> new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
>   @Override
>   public boolean isGroupByOrdinal() {
> return true;
>   }
> };
> sql("SELECT "
> + "CASE WHEN HIREDATE >= CURRENT_DATE "
> + "THEN 'Forward' "
> + "ELSE 'Actual' END AS fa,"
> + "COUNT(*) "
> + "FROM EMP "
> + "GROUP BY 1")
> .withConformance(defaultPlusOrdinalGroupBy)
> .ok();
>   }
> {code}
> *Expected result*
> Validation passed successfully.
> *Actual result*
> {code:none}org.apache.calcite.sql.validate.SqlValidatorException: Expression 
> 'EMP.HIREDATE' is not being grouped
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>   at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5003)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:113)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:321)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor.visit(SqlBasicVisitor.java:52)
>   at org.apache.calcite.sql.SqlNodeList.accept(SqlNodeList.java:145)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
>   at 
> org.apache.calcite.sql.SqlAsOperator.acceptCall(SqlAsOperator.java:121)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
>   at org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:212)
>   
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4208) Improve metadata row count for Join

2020-09-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189054#comment-17189054
 ] 

Ruben Q L commented on CALCITE-4208:


I have this new proposal:
{code:java}
selectivity = mq.getSelectivity(join, condition);
innerRowCount = left * right * selectivity; // unchanged
leftRowCount = left * (1D - selectivity) + innerRowCount;
rightRowCount = right * (1D - selectivity) + innerRowCount;
fullRowCount = left * (1D - selectivity) + right * (1D - selectivity) + 
innerRowCount = (left + right) * (1D - selectivity) + innerRowCount;
{code}
which would seem more accurate than the current computations (but nothing too 
revolutionary). It does not cause much regression on our tests (just a couple 
of them would need to get adjusted)

> Improve metadata row count for Join
> ---
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189056#comment-17189056
 ] 

zhishui commented on CALCITE-4211:
--

yeah, it's MaterializedViewSubstitutionVisitorTest.

testOrderByQueryOnOrderByView

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189050#comment-17189050
 ] 

Julian Hyde commented on CALCITE-4211:
--

Is this test in Calcite or elsewhere?

> QueryOnOrderByView could not pass test
> --
>
> Key: CALCITE-4211
> URL: https://issues.apache.org/jira/browse/CALCITE-4211
> Project: Calcite
>  Issue Type: Bug
>Reporter: zhishui
>Priority: Critical
>
> sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
> "select \"empid\" from \"emps\" order by \"deptno\"")
> .ok();
> in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4211) QueryOnOrderByView could not pass test

2020-09-02 Thread zhishui (Jira)
zhishui created CALCITE-4211:


 Summary: QueryOnOrderByView could not pass test
 Key: CALCITE-4211
 URL: https://issues.apache.org/jira/browse/CALCITE-4211
 Project: Calcite
  Issue Type: Bug
Reporter: zhishui


sql("select \"deptno\", \"empid\" from \"emps\" order by \"deptno\"",
"select \"empid\" from \"emps\" order by \"deptno\"")
.ok();

in lastest verison, this test could not pass correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4208) Improve metadata row count for Join

2020-09-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188784#comment-17188784
 ] 

Ruben Q L edited comment on CALCITE-4208 at 9/2/20, 7:44 AM:
-

Thanks for your feedback [~amaliujia].
To be honest I am not quite sure about FULL/LEFT/RIGHT modifications. I am sure 
though that the ANTI computation must be fixed.


was (Author: rubenql):
Thanks for your feedback [~amaliujia].
To be honest I am not quite sure about INNER/LEFT/RIGHT modifications. I am 
sure though that the ANTI computation must be fixed.

> Improve metadata row count for Join
> ---
>
> Key: CALCITE-4208
> URL: https://issues.apache.org/jira/browse/CALCITE-4208
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>
> Currently, the default metadata row count for join 
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on 
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same 
> formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4131) ERROR message in SqlToRelConverterExtendedTest

2020-09-02 Thread Jiatao Tao (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189023#comment-17189023
 ] 

Jiatao Tao commented on CALCITE-4131:
-

Hi Haisheng, how do these errors occur? Can you give a reproduce code?

> ERROR message in SqlToRelConverterExtendedTest 
> ---
>
> Key: CALCITE-4131
> URL: https://issues.apache.org/jira/browse/CALCITE-4131
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> The error doesn't affect the proceeding of the test suite, and doesn't fail 
> the test. But the error message is not so pleasing:
> {code:java}
> SqlToRelConverterExtendedTest > testCrossJoinLateral1() STANDARD_ERROR
> [Fatal Error] :1:2: XML document structures must start and end within the 
> same entity.
> ERROR:  'Could not compile stylesheet'
> FATAL ERROR:  'XML document structures must start and end within the same 
> entity.'
>:XML document structures must start and end within the same 
> entity.
> SqlToRelConverterExtendedTest > testMatchRecognizeIn() STANDARD_ERROR
> ERROR:  'XML document structures must start and end within the same 
> entity.'
> ERROR:  'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: 
> XML document structures must start and end within the same entity.'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)