[jira] [Created] (CALCITE-6569) RelToSqlConverter support IGNORE NULLS for window functions
Jiajun Xie created CALCITE-6569: --- Summary: RelToSqlConverter support IGNORE NULLS for window functions Key: CALCITE-6569 URL: https://issues.apache.org/jira/browse/CALCITE-6569 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.20.0 Reporter: Jiajun Xie Assignee: Jiajun Xie CALCITE-883(https://issues.apache.org/jira/browse/CALCITE-883) Support IGNORE NULLS option for LEAD, LAG, FIRST_VALUE, LAST_VALUE and NTH_VALUE window functions. But RelToSqlConverter not support them, IGNORE NULLS will be lost. For example {code:java} SELECT LAG("employee_id", 2) IGNORE NULLS OVER (ORDER BY "hire_date") FROM "employee"{code} It lost `IGNORE NULLS ` {code:java} SELECT LAG("employee_id", 2) OVER (ORDER BY "hire_date") FROM "employee"{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6318) Add some rules that WeTune discovers and verifies
Jiajun Xie created CALCITE-6318: --- Summary: Add some rules that WeTune discovers and verifies Key: CALCITE-6318 URL: https://issues.apache.org/jira/browse/CALCITE-6318 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie [WeTune|https://ipads.se.sjtu.edu.cn/_media/publications/wtune_extend.pdf] is a rule generator that can automatically discover new query rewrite rules for SQL query optimization. The paper was published in 2022, and some optimization rules have recently been implemented in Calcite. e.g. CALCITE-6214 is rule No.2 in Table.7 of the paper. But there are still many rules that have not been implemented, which are worth discussing. Here are all the discovered rewrite [rules|https://ipads.se.sjtu.edu.cn:1312/opensource/wetune/-/blob/main/wtune_data/rules/rules.test.txt?ref_type=heads]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6240) Removed ROW_NUMBER if OVER PARTITION BY unique key
Jiajun Xie created CALCITE-6240: --- Summary: Removed ROW_NUMBER if OVER PARTITION BY unique key Key: CALCITE-6240 URL: https://issues.apache.org/jira/browse/CALCITE-6240 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie The SQL Only have ROW_NUMBER function. {code:java} SELECT empno, ename FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY empno) rn, empno, ename FROM emp ) WHERE rn <= 5{code} to {code:java} SELECT empno, ename FROM emp{code} If we know every partition count is less than filter count, we can optimize {code:java} SELECT deptno, ename FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY deptno) rn, deptno, ename FROM emp ) WHERE rn <= N -- Statistic let we know every department count is less than N{code} to {code:java} SELECT empno, ename FROM emp {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6214) Remove `DISTINCT` in `COUNT` if field is unique
Jiajun Xie created CALCITE-6214: --- Summary: Remove `DISTINCT` in `COUNT` if field is unique Key: CALCITE-6214 URL: https://issues.apache.org/jira/browse/CALCITE-6214 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie For the sql {code:java} select count(distinct x) cnt from( select distinct sal x from emp ) t {code} The distinct that in count can be removed. {code:java} LogicalAggregate(group=[{}], CNT=[COUNT($0)]) LogicalAggregate(group=[{0}]) LogicalProject(X=[$5]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} But `CoreRules#AGGREGATE_REMOVE` not support it, so there are two DISTINCT. {code:java} LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $0)]) LogicalAggregate(group=[{0}]) LogicalProject(X=[$5]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6201) Merge Aggregate node if count distinct column has been deduplicated
Jiajun Xie created CALCITE-6201: --- Summary: Merge Aggregate node if count distinct column has been deduplicated Key: CALCITE-6201 URL: https://issues.apache.org/jira/browse/CALCITE-6201 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie distinct column is redundant because count has distinct. {code:java} select count(distinct x) cnt from( select distinct sal x from emp ) t {code} PlanBefore is {noformat} LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $0)]) LogicalAggregate(group=[{5}]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]){noformat} PlanAfter should be {noformat} LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $5)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]){noformat} But `CoreRules.AGGREGATE_MERGE` not support it, we can improve it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6200) RelJson throw UnsupportedOperationException for RexDynamicParam
Jiajun Xie created CALCITE-6200: --- Summary: RelJson throw UnsupportedOperationException for RexDynamicParam Key: CALCITE-6200 URL: https://issues.apache.org/jira/browse/CALCITE-6200 Project: Calcite Issue Type: Bug Reporter: Jiajun Xie Assignee: Jiajun Xie Dump plan for the sql `select * from emp limit ?`. RelJson will throw the exception: {code:java} unknown rex ?0 java.lang.UnsupportedOperationException: unknown rex ?0 at org.apache.calcite.rel.externalize.RelJson.toJson(RelJson.java:649) at org.apache.calcite.rel.externalize.RelJson.toJson(RelJson.java:455) at org.apache.calcite.rel.externalize.RelJsonWriter.put(RelJsonWriter.java:102) at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:85) at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:135) at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:253) at org.apache.calcite.plan.RelOptUtil.dumpPlan(RelOptUtil.java:2109) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6199) Trim unused fields for SNAPSHOT and SAMPLE if table has VIRTUAL column
Jiajun Xie created CALCITE-6199: --- Summary: Trim unused fields for SNAPSHOT and SAMPLE if table has VIRTUAL column Key: CALCITE-6199 URL: https://issues.apache.org/jira/browse/CALCITE-6199 Project: Calcite Issue Type: Improvement Reporter: Jiajun Xie Assignee: Jiajun Xie RelTri not support SNAPSHOT( So do Sample) * Here is a SQL example {code:java} select D, E from VIRTUALCOLUMNS.VC_T1 for system_time as of TIMESTAMP '2011-01-02 00:00:00' {code} * Before Trimming {code:java} LogicalProject(D=[$3], E=[$4]) LogicalSnapshot(period=[2011-01-02 00:00:00]) LogicalProject(A=[$0], B=[$1], C=[$2], D=[$3], $f4=[+($0, 1)]) LogicalTableScan(table=[[CATALOG, VIRTUALCOLUMNS, VC_T1]]){code} * After Trimming {code:java} LogicalSnapshot(period=[2011-01-02 00:00:00]) LogicalProject(D=[$3], $f4=[+($0, 1)]) LogicalTableScan(table=[[CATALOG, VIRTUALCOLUMNS, VC_T1]]){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6191) Remove join if input is one row literal
Jiajun Xie created CALCITE-6191: --- Summary: Remove join if input is one row literal Key: CALCITE-6191 URL: https://issues.apache.org/jira/browse/CALCITE-6191 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie Here are some Join Rules {noformat} at CoreRules.java | grep ' #JOIN_' * @see #JOIN_REDUCE_EXPRESSIONS * @see #JOIN_SUB_QUERY_TO_CORRELATE */ * @see #JOIN_SUB_QUERY_TO_CORRELATE */ * @see #JOIN_TO_SEMI_JOIN */ /** As {@link #JOIN_COMMUTE} but swaps outer joins as well as inner joins. */ /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but only the left input is /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but only the right input is /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but match outer as well as /** As {@link #JOIN_PROJECT_LEFT_TRANSPOSE} but match outer as well as /** As {@link #JOIN_PROJECT_RIGHT_TRANSPOSE} but match outer as well as * @see #JOIN_TO_MULTI_JOIN {noformat} I cannot optimize the SQL by using them. {noformat} SELECT EMPNO FROM emp, (SELECT 1 as c) temp WHERE EMPNO = temp.c{noformat} Here is optimized RelNode that I want get. {code:java} LogicalProject(EMPNO=[$0]) LogicalFilter(condition=[=($0, 1)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]){code} But now it is {noformat} LogicalProject(EMPNO=[$0]) LogicalFilter(condition=[=($0, $9)]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalValues(tuples=[[{ 1 }]]){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6093) Unstable unit tes in Druid tests
Jiajun Xie created CALCITE-6093: --- Summary: Unstable unit tes in Druid tests Key: CALCITE-6093 URL: https://issues.apache.org/jira/browse/CALCITE-6093 Project: Calcite Issue Type: Improvement Components: druid-adapter Reporter: Jiajun Xie FAILURE 38.6sec, 235 completed, 1 failed, 6 skipped, org.apache.calcite.test.DruidAdapterIT {code:java} FAILURE 0.2sec, org.apache.calcite.test.DruidAdapter2IT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() 285java.lang.AssertionError: 286Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3" 287 but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076" {code} FAILURE 39.6sec, 453 completed, 2 failed, 12 skipped, Gradle Test Run :druid:test {code:java} FAILURE 28.7sec, 210 completed, 1 failed, 6 skipped, org.apache.calcite.test.DruidAdapter2IT 308org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker 331FAILURE 0.3sec, org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() 332 java.lang.AssertionError: 333 Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3" 334 but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076" {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6091) Char that in array is truncated in CASE WHEN statement
Jiajun Xie created CALCITE-6091: --- Summary: Char that in array is truncated in CASE WHEN statement Key: CALCITE-6091 URL: https://issues.apache.org/jira/browse/CALCITE-6091 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.35.0, 1.34.0 Reporter: Jiajun Xie Assignee: Jiajun Xie -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5944) RelMdRowCount can return more accurate rowCount
Jiajun Xie created CALCITE-5944: --- Summary: RelMdRowCount can return more accurate rowCount Key: CALCITE-5944 URL: https://issues.apache.org/jira/browse/CALCITE-5944 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5943) RelMdSelectivity can return more precise selectivity for Sample
Jiajun Xie created CALCITE-5943: --- Summary: RelMdSelectivity can return more precise selectivity for Sample Key: CALCITE-5943 URL: https://issues.apache.org/jira/browse/CALCITE-5943 Project: Calcite Issue Type: Bug Reporter: Jiajun Xie Assignee: Jiajun Xie -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5805) SelectNamespace throws NullPointerException while validating MERGE statement
Jiajun Xie created CALCITE-5805: --- Summary: SelectNamespace throws NullPointerException while validating MERGE statement Key: CALCITE-5805 URL: https://issues.apache.org/jira/browse/CALCITE-5805 Project: Calcite Issue Type: Bug Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie {code:java} final String sql = "merge into empnullables e " + "using (select * from emp where deptno is null) t " + "on e.empno = t.empno " + "when matched then update " + "set ename = t.ename, deptno = t.deptno, sal = t.sal * .1 " + "when not matched then insert (empno, ename, deptno, sal) " + "values(t.empno, t.ename, 10, t.sal * .15)"; sql(sql).ok(); // Expected it is ok, but failed{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5796) DELETE statement is not effective
Jiajun Xie created CALCITE-5796: --- Summary: DELETE statement is not effective Key: CALCITE-5796 URL: https://issues.apache.org/jira/browse/CALCITE-5796 Project: Calcite Issue Type: Bug Components: core, linq4j Environment: I try to use `DELETE FROM t`, but the table was not affected. {code:java} static void erase(SqlIdentifier name, CalcitePrepare.Context context) { // Generate, prepare and execute an "DELETE FROM table" statement. // (It's a bit inefficient that we convert from SqlNode to SQL and back // again.) final FrameworkConfig config = Frameworks.newConfigBuilder() .defaultSchema(context.getRootSchema().plus()) .build(); final Planner planner = Frameworks.getPlanner(config); try { final StringBuilder buf = new StringBuilder(); final SqlWriterConfig writerConfig = SqlPrettyWriter.config().withAlwaysUseParentheses(false); final SqlPrettyWriter w = new SqlPrettyWriter(writerConfig, buf); buf.append("DELETE FROM "); name.unparse(w, 0, 0); final String sql = buf.toString(); final SqlNode query1 = planner.parse(sql); final SqlNode query2 = planner.validate(query1); final RelRoot r = planner.rel(query2); final PreparedStatement prepare = context.getRelRunner().prepareStatement(r.rel); int rowCount = prepare.executeUpdate(); Util.discard(rowCount); prepare.close(); } catch (SqlParseException | ValidationException | RelConversionException | SQLException e) { throw Util.throwAsRuntime(e); } } {code} Because we use `EnumerableDefaults#remove()` to delete target rows. The rows is object array, they can't match. !image-2023-06-24-18-27-57-722.png! Reporter: Jiajun Xie Attachments: image-2023-06-24-18-27-57-722.png -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5630) Window with rows equivalence error in volcano planner
Jiajun Xie created CALCITE-5630: --- Summary: Window with rows equivalence error in volcano planner Key: CALCITE-5630 URL: https://issues.apache.org/jira/browse/CALCITE-5630 Project: Calcite Issue Type: Bug Reporter: Jiajun Xie Here is UT {code:java} @Test void testUnionWindow() { String selectSqlBase = "SELECT 'col{?}' as col, sum(\"salary\" ) OVER(" + "PARTITION BY \"deptno\" " + "ORDER BY \"empid\" " + "ROWS BETWEEN CURRENT ROW AND {?} FOLLOWING) " + "FROM \"hr\".\"emps\""; String selectSql1 = selectSqlBase.replace("{?}", "1"); // window1 String selectSql2 = selectSqlBase.replace("{?}", "2"); // window2 String sql = selectSql1 + "\nunion all\n" + selectSql2; CalciteAssert.hr().query(sql).runs() .returns("COL=col1; EXPR$1=8000.0\n" + "COL=col1; EXPR$1=21500.0\n" + "COL=col1; EXPR$1=18500.0\n" + "COL=col1; EXPR$1=7000.0\n" + "COL=col2; EXPR$1=8000.0\n" + "COL=col2; EXPR$1=21500.0\n" + "COL=col2; EXPR$1=18500.0\n" + "COL=col2; EXPR$1=7000.0\n"); // sum that for col1 is same as sum that for col2, this is error } {code} Because RelDigest is same between window1 and window2, volcano planner use window1 replace window2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5597) Column that be order by should not be aggregate
Jiajun Xie created CALCITE-5597: --- Summary: Column that be order by should not be aggregate Key: CALCITE-5597 URL: https://issues.apache.org/jira/browse/CALCITE-5597 Project: Calcite Issue Type: Bug Components: core Reporter: Jiajun Xie Here is unit test {code:java} @Test void testDistinctOrderByRand() { final String sql = "select distinct deptno from emp order by rand()"; sql(sql).ok(); } {code} We will get error logical plan {code:java} LogicalProject(DEPTNO=[$0]) LogicalSort(sort0=[$1], dir0=[ASC]) LogicalAggregate(group=[{0, 1}]) //rand() in group, result will be error LogicalProject(DEPTNO=[$7], EXPR$1=[RAND()]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5590) NullPointerException when converting 'in' expression that is used inside select list and group by
Jiajun Xie created CALCITE-5590: --- Summary: NullPointerException when converting 'in' expression that is used inside select list and group by Key: CALCITE-5590 URL: https://issues.apache.org/jira/browse/CALCITE-5590 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.33.0 Reporter: Jiajun Xie In CALCITE-5209, we changed sub-query handle way. It can solve some problems, but it also introduces some new problems. Here is a simple case to reproduce, it can work before 1.33.0 {code:java} @Test void testGroupByCaseWhen() { final String sql = "select case when deptno in (1) then 1 end as col\n" + "from emp\n" + "group by deptno, case when deptno in (1) then 1 else 0 end"; sql(sql).ok(); } {code} {code:java} while converting CASE WHEN `EMP`.`DEPTNO` IN (1) THEN 1 ELSE NULL END java.lang.RuntimeException: while converting CASE WHEN `EMP`.`DEPTNO` IN (1) THEN 1 ELSE NULL END at org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:86) at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5591) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4875) at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:5454) at org.apache.calcite.sql2rel.StandardConvertletTable.lambda$new$9(StandardConvertletTable.java:205) at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5591) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4875) at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:5454) at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:3316) at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:3158) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:784) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:682) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3680) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:597) at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:639) at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.assertConvertsTo(SqlToRelTestBase.java:758) at org.apache.calcite.test.SqlToRelConverterTest$Sql.convertsTo(SqlToRelConverterTest.java:4502) at org.apache.calcite.test.SqlToRelConverterTest$Sql.ok(SqlToRelConverterTest.java:4494) at org.apache.calcite.test.SqlToRelConverterTest.testGroupByCaseWhen(SqlToRelConverterTest.java:4028){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5525) GROUPING_ID() should be rewritten to GROUPING() in some dialects
Jiajun Xie created CALCITE-5525: --- Summary: GROUPING_ID() should be rewritten to GROUPING() in some dialects Key: CALCITE-5525 URL: https://issues.apache.org/jira/browse/CALCITE-5525 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie Some databases not exists `grouping_id()`, `grouping()` is used as `grouping_id()`. - Postgresql {code:java} ERROR: function grouping_id(integer, character varying) does not exist LINE 1: select grouping(id), grouping(name), grouping_id(id, name), ... ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts {code} - Presto {code:java} Function grouping_id not registered {code} Here is a simple query, `grouping_id()` is a legal function in Spark. {code:java} select a, b, count(*), grouping(a) ga, grouping(b) gb, grouping_id(a, b) from ( select 1 as a, 2 as b ) group by grouping sets((a, b), (b), ()) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5507) HAVING alias failed when aggregate function in condition
Jiajun Xie created CALCITE-5507: --- Summary: HAVING alias failed when aggregate function in condition Key: CALCITE-5507 URL: https://issues.apache.org/jira/browse/CALCITE-5507 Project: Calcite Issue Type: Improvement Reporter: Jiajun Xie We know that calcite can support HAVING alias by setting SqlConformanceEnum.LENIENT {code:java} sql("select count(empno) as e from emp having e > 10") .withConformance(lenient).ok(); {code} but when I add one aggregate function in HAVING clause, it will fail. {code:java} sql("select count(empno) as e from emp having ^e^ > 10 and count(empno) > 10 ") .withConformance(lenient).fails("Column 'E' not found in any table"); // I think it should be ok{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5506) RelToSqlConverter get error result because RelFieldTrimmer lost aggregate function
Jiajun Xie created CALCITE-5506: --- Summary: RelToSqlConverter get error result because RelFieldTrimmer lost aggregate function Key: CALCITE-5506 URL: https://issues.apache.org/jira/browse/CALCITE-5506 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Here is a sql {code:java} select l.v as l_cost from ( select 0 as v, 1 as k ) l join ( select sum("cost") as v, 1 as k from ( select 1 as "cost" union select 2 as "cost" ) ) r on l.k = r.k {code} Before trimming, the RelNode is {code:java} LogicalProject(L_COST=[$0]) LogicalJoin(condition=[=($1, $3)], joinType=[inner]) LogicalValues(tuples=[[{ 0, 1 }]]) LogicalProject(V=[$0], K=[1]) LogicalAggregate(group=[{}], V=[SUM($0)]) LogicalUnion(all=[false]) LogicalValues(tuples=[[{ 1 }]]) LogicalValues(tuples=[[{ 2 }]]) {code} After trimming, the RelNode is {code:java} LogicalProject(L_COST=[$0]) LogicalJoin(condition=[=($1, $2)], joinType=[inner]) LogicalValues(tuples=[[{ 0, 1 }]]) LogicalProject(K=[1]) LogicalAggregate(group=[{}], DUMMY=[COUNT()]) LogicalUnion(all=[false]) LogicalValues(tuples=[[{ 1 }]]) LogicalValues(tuples=[[{ 2 }]]){code} If we convert trimmed RelNode to sql, the sql will be {code:java} SELECT * FROM (VALUES (0, 1)) AS "t" ("V", "K") INNER JOIN (SELECT 1 AS "K" -- Missing SUM() FROM (SELECT * FROM (VALUES (1)) AS "t" ("cost") UNION SELECT * FROM (VALUES (2)) AS "t" ("cost")) AS "t2" ) AS "t4" ON "t"."K" = "t4"."K" {code} The origin sql only has one row result, but the new sql that be trimmed has two row result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5486) Subquery not support having alias in condition.
Jiajun Xie created CALCITE-5486: --- Summary: Subquery not support having alias in condition. Key: CALCITE-5486 URL: https://issues.apache.org/jira/browse/CALCITE-5486 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie {code:java} sql("select * from emp where sal >\n" + " (select avg(sal) as s" + "from emp having ^s^ > 0" + " )") .withConformance(SqlConformanceEnum.LENIENT) .fails("Column 'S' not found in any table");{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5392) Support Snapshot in RelMdExpressionLineage
Jiajun Xie created CALCITE-5392: --- Summary: Support Snapshot in RelMdExpressionLineage Key: CALCITE-5392 URL: https://issues.apache.org/jira/browse/CALCITE-5392 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie For example: {code:java} select productid from products_temporal for system_time as of TIMESTAMP '2022-11-19 00:00:00' {code} The `productid` expression lineage is null because `RelMdExpressionLineage` does not implement the corresponding method. We can implement method that for SNAPSHOT like other RelNode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5205) Supports hint option as string and numeric literal
Jiajun Xie created CALCITE-5205: --- Summary: Supports hint option as string and numeric literal Key: CALCITE-5205 URL: https://issues.apache.org/jira/browse/CALCITE-5205 Project: Calcite Issue Type: Improvement Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie Spark support hint that options contain string and number: {code:java} SELECT /*+ REPARTITION(3, c) */ * FROM t {code} But calcite can't parse it: {code:java} Error while parsing SQL: select /*+ repartition(3, empno) */ empno, ename, deptno from emps java.lang.RuntimeException: Error while parsing SQL: select /*+ repartition(3, empno) */ empno, ename, deptno from emps {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5164) Planner#parser can't parse TIMESTAMP() function
Jiajun Xie created CALCITE-5164: --- Summary: Planner#parser can't parse TIMESTAMP() function Key: CALCITE-5164 URL: https://issues.apache.org/jira/browse/CALCITE-5164 Project: Calcite Issue Type: Bug Components: babel Affects Versions: 1.30.0 Reporter: Jiajun Xie Assignee: Jiajun Xie Both core and babel will parse fail. {code:java} FrameworkConfig coreConfig = Frameworks.newConfigBuilder().build(); Planner corePlanner = Frameworks.getPlanner(coreConfig); corePlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00'"); // Caused by: org.apache.calcite.sql.parser.babel.ParseException: Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8. FrameworkConfig babelConfig = Frameworks.newConfigBuilder() .parserConfig(SqlParser.Config.DEFAULT.withParserFactory( SqlBabelParserImpl.FACTORY)) .build(); Planner babelPlanner = Frameworks.getPlanner(babelConfig); babelPlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00'"); // Caused by: org.apache.calcite.sql.parser.babel.ParseException: Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8 {code} Here are some databases that support TIMESTAMP function. - MySQL: [https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_timestamp] {code:java} select timestamp('2022-05-21 08:00:00') // result timestamp('2022-05-21 08:00:00') 2022-05-21 08:00:00 {code} - Derby: [https://docs.oracle.com/javadb/10.6.2.1/ref/rreftimestampfunc.html] Also, here are some databases that not support TIMESTAMP function: - Oracle: [https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm] {code:java} select timestamp('2022-05-21 08:00:00') //ORA-00923: FROM keyword not found where expected {code} - SQL Server: [https://docs.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?view=sql-server-ver15] {code:java} select timestamp('2022-05-21 08:00:00') // Msg 195 Level 15 State 10 Line 1 // 'timestamp' is not a recognized built-in function name.{code} Is it necessary for us to support it in babel module? -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (CALCITE-5141) Implicit type conversion incomplete for insert values
Jiajun Xie created CALCITE-5141: --- Summary: Implicit type conversion incomplete for insert values Key: CALCITE-5141 URL: https://issues.apache.org/jira/browse/CALCITE-5141 Project: Calcite Issue Type: Bug Components: core Reporter: Jiajun Xie Assignee: Jiajun Xie - Target table schema {code:java} // Register "T1" table. final MockTable t1 = MockTable.create(this, tSchema, "T1", false, 7.0, null); t1.addColumn("t1_varchar20", f.varchar20Type, true); t1.addColumn("t1_smallint", f.smallintType); t1.addColumn("t1_int", f.intType); t1.addColumn("t1_bigint", f.bigintType); t1.addColumn("t1_float", f.floatType); t1.addColumn("t1_double", f.doubleType); t1.addColumn("t1_decimal", f.decimalType); t1.addColumn("t1_timestamp", f.timestampType); t1.addColumn("t1_date", f.dateType); t1.addColumn("t1_binary", f.binaryType); t1.addColumn("t1_boolean", f.booleanType); registerTable(t1); {code} - Insert query {code:java} insert into t1 values ('a', 1, 1.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', x'0A', false), ('b', 2, 2, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', x'0A', false), ('c', CAST(3 AS SMALLINT), 3.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', x'0A', false), ('d', 4, 4.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', x'0A', false), ('e', 5, 5.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', x'0A', false) {code} - Incorrect converted plan: the data type is not matched with table schema {code:java} LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], flattened=[false]) LogicalValues(tuples=[[ { 'a', 1, 1, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'b', 2, 2, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'c', 3, 3.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'd', 4, 4.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'e', 5, 5.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }]]) {code} - Correct converted plan {code:java} LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], flattened=[false]) LogicalValues(tuples=[[ { 'a', 1, 1, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'b', 2, 2, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'c', 3, 3, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'd', 4, 4, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, { 'e', 5, 5, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }]]) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (CALCITE-5091) RelMdRowCount can return more accurate rowCount when fetch is deterministic and offset is dynamic
Jiajun Xie created CALCITE-5091: --- Summary: RelMdRowCount can return more accurate rowCount when fetch is deterministic and offset is dynamic Key: CALCITE-5091 URL: https://issues.apache.org/jira/browse/CALCITE-5091 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.30.0 Reporter: Jiajun Xie Assignee: Jiajun Xie The sql from https://issues.apache.org/jira/browse/CALCITE-5048 {code:java} select * from emp order by ename limit 1 offset ? {code} *RelMdRowCount#getRowCount* will return 14d, because it always return rowCount when offset is dynamic. {code:java} if (rel.offset instanceof RexDynamicParam) { return rowCount; } {code} Obviously, rowCount will not exceed 1 because* LIMIT 1*. We can improve it. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (CALCITE-5013) Parentheses should be reserved when the operand of SetOp has limit or offset
Jiajun Xie created CALCITE-5013: --- Summary: Parentheses should be reserved when the operand of SetOp has limit or offset Key: CALCITE-5013 URL: https://issues.apache.org/jira/browse/CALCITE-5013 Project: Calcite Issue Type: Bug Components: core Reporter: Jiajun Xie - In standard SQL, the operand of union should not have limit or order by.So parse will fail in SqlParserTetest#testLimitUnion and SqlParserTetest#OrderUnion. - When users use parentheses, most engines allow it. See the discussion:[CALCITE-1892|https://issues.apache.org/jira/browse/CALCITE-1892]. -- This message was sent by Atlassian Jira (v8.20.1#820001)