[jira] [Created] (CALCITE-6569) RelToSqlConverter support IGNORE NULLS for window functions

2024-09-10 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6569:
---

 Summary: RelToSqlConverter support IGNORE NULLS for window 
functions
 Key: CALCITE-6569
 URL: https://issues.apache.org/jira/browse/CALCITE-6569
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.20.0
Reporter: Jiajun Xie
Assignee: Jiajun Xie


CALCITE-883(https://issues.apache.org/jira/browse/CALCITE-883) Support  IGNORE 
NULLS option for LEAD, LAG, FIRST_VALUE, LAST_VALUE and NTH_VALUE window 
functions.

But RelToSqlConverter not support them, IGNORE NULLS will be lost.

 

For example

 
{code:java}
SELECT LAG("employee_id", 2) IGNORE NULLS OVER (ORDER BY "hire_date") 
FROM "employee"{code}
 

It lost `IGNORE NULLS `
{code:java}
SELECT LAG("employee_id", 2) OVER (ORDER BY "hire_date") 
FROM "employee"{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6318) Add some rules that WeTune discovers and verifies

2024-03-10 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6318:
---

 Summary: Add some rules that WeTune discovers and verifies
 Key: CALCITE-6318
 URL: https://issues.apache.org/jira/browse/CALCITE-6318
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie


[WeTune|https://ipads.se.sjtu.edu.cn/_media/publications/wtune_extend.pdf] is a 
rule generator that can automatically discover new query rewrite rules for SQL 
query optimization.

 

The paper was published in 2022, and some optimization rules have recently been 
implemented in Calcite.

e.g. CALCITE-6214 is rule No.2 in Table.7 of the paper.

 

But there are still many rules that have not been implemented,
which are worth discussing.

Here are all the discovered rewrite 
[rules|https://ipads.se.sjtu.edu.cn:1312/opensource/wetune/-/blob/main/wtune_data/rules/rules.test.txt?ref_type=heads].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6240) Removed ROW_NUMBER if OVER PARTITION BY unique key

2024-02-03 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6240:
---

 Summary: Removed ROW_NUMBER if OVER PARTITION BY unique key
 Key: CALCITE-6240
 URL: https://issues.apache.org/jira/browse/CALCITE-6240
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


The SQL Only have ROW_NUMBER function.
{code:java}
SELECT empno, ename FROM ( 
   SELECT ROW_NUMBER() OVER (PARTITION BY empno) rn, empno, ename
   FROM emp
) WHERE rn <= 5{code}
to 
{code:java}
SELECT empno, ename FROM emp{code}
 

If we know every partition count is less than filter count, we can optimize
{code:java}
SELECT deptno, ename FROM ( 
   SELECT ROW_NUMBER() OVER (PARTITION BY deptno) rn, deptno, ename
   FROM emp
) WHERE rn <= N 
-- Statistic let we know every department count is less than N{code}
to
{code:java}
SELECT empno, ename FROM emp {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6214) Remove `DISTINCT` in `COUNT` if field is unique

2024-01-21 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6214:
---

 Summary: Remove `DISTINCT` in `COUNT` if field is unique
 Key: CALCITE-6214
 URL: https://issues.apache.org/jira/browse/CALCITE-6214
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


For the sql
{code:java}
select count(distinct x) cnt
from(
   select distinct sal x from emp
) t  {code}
The distinct that in count can be removed.
{code:java}
LogicalAggregate(group=[{}], CNT=[COUNT($0)])
  LogicalAggregate(group=[{0}])
LogicalProject(X=[$5])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
But `CoreRules#AGGREGATE_REMOVE` not support it, so there are two DISTINCT.
{code:java}
LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $0)])
  LogicalAggregate(group=[{0}])
LogicalProject(X=[$5])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6201) Merge Aggregate node if count distinct column has been deduplicated

2024-01-13 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6201:
---

 Summary: Merge Aggregate node if count distinct column has been 
deduplicated
 Key: CALCITE-6201
 URL: https://issues.apache.org/jira/browse/CALCITE-6201
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


distinct column is redundant because count has distinct. 
{code:java}
select count(distinct x) cnt
from(
   select distinct sal x from emp
) t {code}
PlanBefore is
{noformat}
LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $0)])
  LogicalAggregate(group=[{5}])
LogicalTableScan(table=[[CATALOG, SALES, EMP]]){noformat}
PlanAfter should be 
{noformat}
LogicalAggregate(group=[{}], CNT=[COUNT(DISTINCT $5)])
LogicalTableScan(table=[[CATALOG, SALES, EMP]]){noformat}
But `CoreRules.AGGREGATE_MERGE` not support it, we can improve it.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6200) RelJson throw UnsupportedOperationException for RexDynamicParam

2024-01-13 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6200:
---

 Summary: RelJson throw UnsupportedOperationException for 
RexDynamicParam
 Key: CALCITE-6200
 URL: https://issues.apache.org/jira/browse/CALCITE-6200
 Project: Calcite
  Issue Type: Bug
Reporter: Jiajun Xie
Assignee: Jiajun Xie


Dump plan for the sql `select * from emp limit ?`.

RelJson will throw the exception:
{code:java}
unknown rex ?0
java.lang.UnsupportedOperationException: unknown rex ?0
    at org.apache.calcite.rel.externalize.RelJson.toJson(RelJson.java:649)
    at org.apache.calcite.rel.externalize.RelJson.toJson(RelJson.java:455)
    at 
org.apache.calcite.rel.externalize.RelJsonWriter.put(RelJsonWriter.java:102)
    at 
org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:85)
    at 
org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:135)
    at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:253)
    at org.apache.calcite.plan.RelOptUtil.dumpPlan(RelOptUtil.java:2109) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6199) Trim unused fields for SNAPSHOT and SAMPLE if table has VIRTUAL column

2024-01-13 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6199:
---

 Summary: Trim unused fields for SNAPSHOT and  SAMPLE if table has 
VIRTUAL column
 Key: CALCITE-6199
 URL: https://issues.apache.org/jira/browse/CALCITE-6199
 Project: Calcite
  Issue Type: Improvement
Reporter: Jiajun Xie
Assignee: Jiajun Xie


RelTri not support SNAPSHOT( So do Sample)
 * Here is a SQL example
{code:java}
select D, E from VIRTUALCOLUMNS.VC_T1 
for system_time as of TIMESTAMP '2011-01-02 00:00:00' {code}

 * Before Trimming
{code:java}
LogicalProject(D=[$3], E=[$4])
  LogicalSnapshot(period=[2011-01-02 00:00:00])
    LogicalProject(A=[$0], B=[$1], C=[$2], D=[$3], $f4=[+($0, 1)])
      LogicalTableScan(table=[[CATALOG, VIRTUALCOLUMNS, VC_T1]]){code}

 * After Trimming
{code:java}
LogicalSnapshot(period=[2011-01-02 00:00:00])
  LogicalProject(D=[$3], $f4=[+($0, 1)])
    LogicalTableScan(table=[[CATALOG, VIRTUALCOLUMNS, VC_T1]]){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6191) Remove join if input is one row literal

2024-01-07 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6191:
---

 Summary: Remove join if input is one row literal
 Key: CALCITE-6191
 URL: https://issues.apache.org/jira/browse/CALCITE-6191
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


Here are some Join Rules
{noformat}
at CoreRules.java | grep ' #JOIN_'
   * @see #JOIN_REDUCE_EXPRESSIONS
   * @see #JOIN_SUB_QUERY_TO_CORRELATE */
   * @see #JOIN_SUB_QUERY_TO_CORRELATE */
   * @see #JOIN_TO_SEMI_JOIN */
  /** As {@link #JOIN_COMMUTE} but swaps outer joins as well as inner joins. */
  /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but only the left input is
  /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but only the right input is
  /** As {@link #JOIN_PROJECT_BOTH_TRANSPOSE} but match outer as well as
  /** As {@link #JOIN_PROJECT_LEFT_TRANSPOSE} but match outer as well as
  /** As {@link #JOIN_PROJECT_RIGHT_TRANSPOSE} but match outer as well as
   * @see #JOIN_TO_MULTI_JOIN
{noformat}
I cannot optimize the SQL by using them.
{noformat}
SELECT EMPNO FROM emp, (SELECT 1 as c) temp WHERE EMPNO = temp.c{noformat}
Here is optimized RelNode that I want get.
{code:java}
LogicalProject(EMPNO=[$0])   
  LogicalFilter(condition=[=($0, 1)])     
     LogicalTableScan(table=[[CATALOG, SALES, EMP]]){code}
But now it is
{noformat}
LogicalProject(EMPNO=[$0])
  LogicalFilter(condition=[=($0, $9)])
    LogicalJoin(condition=[true], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalValues(tuples=[[{ 1 }]]){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6093) Unstable unit tes in Druid tests

2023-11-03 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6093:
---

 Summary: Unstable unit tes in Druid tests
 Key: CALCITE-6093
 URL: https://issues.apache.org/jira/browse/CALCITE-6093
 Project: Calcite
  Issue Type: Improvement
  Components: druid-adapter
Reporter: Jiajun Xie


FAILURE  38.6sec,  235 completed,   1 failed,   6 skipped, 
org.apache.calcite.test.DruidAdapterIT
{code:java}
FAILURE   0.2sec, org.apache.calcite.test.DruidAdapter2IT > 
testInterleaveBetweenAggregateAndGroupOrderByOnMetrics()
285java.lang.AssertionError: 
286Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; 
brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; 
A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; 
brand_name=Toretti; A=36.3"
287 but: was "store_state=OR; brand_name=ADJ; 
A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; 
brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; 
A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076" {code}

FAILURE  39.6sec,  453 completed,   2 failed,  12 skipped, Gradle Test Run 
:druid:test
{code:java}
FAILURE 28.7sec, 210 completed, 1 failed, 6 skipped, 
org.apache.calcite.test.DruidAdapter2IT 
308org.apache.calcite.test.DruidAdapterIT > 
testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker 
331FAILURE 0.3sec, org.apache.calcite.test.DruidAdapterIT > 
testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() 332 
java.lang.AssertionError: 333 Expected: "store_state=CA; brand_name=King; 
A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; 
brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; 
A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3" 334 but: was 
"store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; 
A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; 
brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076"
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6091) Char that in array is truncated in CASE WHEN statement

2023-11-03 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-6091:
---

 Summary: Char that in array is truncated in CASE WHEN statement
 Key: CALCITE-6091
 URL: https://issues.apache.org/jira/browse/CALCITE-6091
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.35.0, 1.34.0
Reporter: Jiajun Xie
Assignee: Jiajun Xie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5944) RelMdRowCount can return more accurate rowCount

2023-08-20 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5944:
---

 Summary: RelMdRowCount can return more accurate rowCount
 Key: CALCITE-5944
 URL: https://issues.apache.org/jira/browse/CALCITE-5944
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5943) RelMdSelectivity can return more precise selectivity for Sample

2023-08-20 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5943:
---

 Summary: RelMdSelectivity can return more precise selectivity for 
Sample
 Key: CALCITE-5943
 URL: https://issues.apache.org/jira/browse/CALCITE-5943
 Project: Calcite
  Issue Type: Bug
Reporter: Jiajun Xie
Assignee: Jiajun Xie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5805) SelectNamespace throws NullPointerException while validating MERGE statement

2023-06-28 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5805:
---

 Summary: SelectNamespace throws NullPointerException while 
validating MERGE statement
 Key: CALCITE-5805
 URL: https://issues.apache.org/jira/browse/CALCITE-5805
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


{code:java}
final String sql = "merge into empnullables e "
+ "using (select * from emp where deptno is null) t "
+ "on e.empno = t.empno "
+ "when matched then update "
+ "set ename = t.ename, deptno = t.deptno, sal = t.sal * .1 "
+ "when not matched then insert (empno, ename, deptno, sal) "
+ "values(t.empno, t.ename, 10, t.sal * .15)";
sql(sql).ok(); // Expected it is ok, but failed{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5796) DELETE statement is not effective

2023-06-24 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5796:
---

 Summary: DELETE statement is not effective 
 Key: CALCITE-5796
 URL: https://issues.apache.org/jira/browse/CALCITE-5796
 Project: Calcite
  Issue Type: Bug
  Components: core, linq4j
 Environment: I try to use `DELETE FROM t`, but the table was not 
affected.

 
{code:java}
static void erase(SqlIdentifier name, CalcitePrepare.Context context) {
  // Generate, prepare and execute an "DELETE FROM table" statement.
  // (It's a bit inefficient that we convert from SqlNode to SQL and back
  // again.)
  final FrameworkConfig config = Frameworks.newConfigBuilder()
  .defaultSchema(context.getRootSchema().plus())
  .build();
  final Planner planner = Frameworks.getPlanner(config);
  try {
final StringBuilder buf = new StringBuilder();
final SqlWriterConfig writerConfig =
SqlPrettyWriter.config().withAlwaysUseParentheses(false);
final SqlPrettyWriter w = new SqlPrettyWriter(writerConfig, buf);
buf.append("DELETE FROM ");
name.unparse(w, 0, 0);
final String sql = buf.toString();
final SqlNode query1 = planner.parse(sql);
final SqlNode query2 = planner.validate(query1);
final RelRoot r = planner.rel(query2);
final PreparedStatement prepare =
context.getRelRunner().prepareStatement(r.rel);
int rowCount = prepare.executeUpdate();
Util.discard(rowCount);
prepare.close();
  } catch (SqlParseException | ValidationException
   | RelConversionException | SQLException e) {
throw Util.throwAsRuntime(e);
  }
} {code}
 

Because we use `EnumerableDefaults#remove()` to delete target rows. 

The rows is object array, they can't match.

!image-2023-06-24-18-27-57-722.png!
Reporter: Jiajun Xie
 Attachments: image-2023-06-24-18-27-57-722.png





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5630) Window with rows equivalence error in volcano planner

2023-04-03 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5630:
---

 Summary: Window with rows equivalence error in volcano planner
 Key: CALCITE-5630
 URL: https://issues.apache.org/jira/browse/CALCITE-5630
 Project: Calcite
  Issue Type: Bug
Reporter: Jiajun Xie


Here is UT
{code:java}
@Test void testUnionWindow() {
  String selectSqlBase =
  "SELECT 'col{?}' as col, sum(\"salary\" ) OVER("
  + "PARTITION BY \"deptno\" "
  + "ORDER BY \"empid\" "
  + "ROWS BETWEEN CURRENT ROW AND {?} FOLLOWING) "
  + "FROM  \"hr\".\"emps\"";
  String selectSql1 = selectSqlBase.replace("{?}", "1"); // window1
  String selectSql2 = selectSqlBase.replace("{?}", "2"); // window2
  String sql = selectSql1 + "\nunion all\n" + selectSql2;
  CalciteAssert.hr().query(sql).runs()
  .returns("COL=col1; EXPR$1=8000.0\n"
  + "COL=col1; EXPR$1=21500.0\n"
  + "COL=col1; EXPR$1=18500.0\n"
  + "COL=col1; EXPR$1=7000.0\n"
  + "COL=col2; EXPR$1=8000.0\n"
  + "COL=col2; EXPR$1=21500.0\n"
  + "COL=col2; EXPR$1=18500.0\n"
  + "COL=col2; EXPR$1=7000.0\n"); 
 // sum that for col1 is same as sum that for col2, this is error
}
 {code}
Because RelDigest is same between window1 and window2,

volcano planner use window1 replace window2.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5597) Column that be order by should not be aggregate

2023-03-20 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5597:
---

 Summary: Column that be order by should not be aggregate
 Key: CALCITE-5597
 URL: https://issues.apache.org/jira/browse/CALCITE-5597
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Jiajun Xie


Here is unit test
{code:java}
@Test void testDistinctOrderByRand() {
  final String sql = "select distinct deptno from emp order by rand()";
  sql(sql).ok();
} {code}
We will get error logical plan
{code:java}
LogicalProject(DEPTNO=[$0])
  LogicalSort(sort0=[$1], dir0=[ASC])
LogicalAggregate(group=[{0, 1}]) //rand() in group, result will be error
  LogicalProject(DEPTNO=[$7], EXPR$1=[RAND()])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5590) NullPointerException when converting 'in' expression that is used inside select list and group by

2023-03-16 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5590:
---

 Summary: NullPointerException when converting  'in' expression 
that is used inside select list and group by
 Key: CALCITE-5590
 URL: https://issues.apache.org/jira/browse/CALCITE-5590
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.33.0
Reporter: Jiajun Xie


In CALCITE-5209, we changed sub-query handle way.

It can solve some problems, but it also introduces some new problems.

Here is a simple case to reproduce, it can work before 1.33.0
{code:java}
@Test void testGroupByCaseWhen() {
final String sql = "select case when deptno in (1) then 1 end as col\n"
+ "from emp\n"
+ "group by deptno, case when deptno in (1) then 1 else 0 end";
sql(sql).ok();
} {code}
{code:java}
while converting CASE WHEN `EMP`.`DEPTNO` IN (1) THEN 1 ELSE NULL END
java.lang.RuntimeException: while converting CASE WHEN `EMP`.`DEPTNO` IN (1) 
THEN 1 ELSE NULL END
at 
org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:86)
at 
org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5591)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4875)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:5454)
at 
org.apache.calcite.sql2rel.StandardConvertletTable.lambda$new$9(StandardConvertletTable.java:205)
at 
org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5591)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4875)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:5454)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:3316)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:3158)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:784)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:682)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3680)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:597)
at 
org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:639)
at 
org.apache.calcite.test.SqlToRelTestBase$TesterImpl.assertConvertsTo(SqlToRelTestBase.java:758)
at 
org.apache.calcite.test.SqlToRelConverterTest$Sql.convertsTo(SqlToRelConverterTest.java:4502)
at 
org.apache.calcite.test.SqlToRelConverterTest$Sql.ok(SqlToRelConverterTest.java:4494)
at 
org.apache.calcite.test.SqlToRelConverterTest.testGroupByCaseWhen(SqlToRelConverterTest.java:4028){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5525) GROUPING_ID() should be rewritten to GROUPING() in some dialects

2023-02-10 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5525:
---

 Summary: GROUPING_ID() should be rewritten to GROUPING() in some 
dialects
 Key: CALCITE-5525
 URL: https://issues.apache.org/jira/browse/CALCITE-5525
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


Some databases not exists `grouping_id()`, `grouping()` is used as 
`grouping_id()`.
 - Postgresql
{code:java}
ERROR:  function grouping_id(integer, character varying) does not exist
LINE 1: select grouping(id), grouping(name), grouping_id(id, name), ...
 ^
HINT:  No function matches the given name and argument types. You might need to 
add explicit type casts
{code}

 - Presto
{code:java}
Function grouping_id not registered
{code}

 

Here is a simple query, `grouping_id()` is a legal function in Spark.
{code:java}
select  a,
        b,
        count(*),
        grouping(a) ga,
        grouping(b) gb,
        grouping_id(a, b)
from    (
            select  1 as a, 2 as b
        )
group by grouping sets((a, b), (b), ())
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5507) HAVING alias failed when aggregate function in condition

2023-01-28 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5507:
---

 Summary: HAVING alias failed when aggregate function in condition
 Key: CALCITE-5507
 URL: https://issues.apache.org/jira/browse/CALCITE-5507
 Project: Calcite
  Issue Type: Improvement
Reporter: Jiajun Xie


We know that calcite can support HAVING alias by setting 
SqlConformanceEnum.LENIENT
{code:java}
sql("select count(empno) as e from emp having e > 10")
.withConformance(lenient).ok(); {code}
but when I add one aggregate function in HAVING clause, it will fail.
{code:java}
sql("select count(empno) as e from emp having ^e^ > 10 and count(empno) > 10 ")
    .withConformance(lenient).fails("Column 'E' not found in any table"); 
// I think it should be ok{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5506) RelToSqlConverter get error result because RelFieldTrimmer lost aggregate function

2023-01-28 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5506:
---

 Summary: RelToSqlConverter get error result because 
RelFieldTrimmer lost aggregate function
 Key: CALCITE-5506
 URL: https://issues.apache.org/jira/browse/CALCITE-5506
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie


Here is a sql

 
{code:java}
select  l.v as l_cost
from    (
            select  0 as v,
                    1 as k
        ) l
join    (
    select  sum("cost") as v,
                    1 as k
    from    (
        select 1 as "cost"
        union
        select 2 as "cost"
        )
) r
on l.k = r.k {code}
Before trimming, the RelNode is 

 

 
{code:java}
LogicalProject(L_COST=[$0])
  LogicalJoin(condition=[=($1, $3)], joinType=[inner])
    LogicalValues(tuples=[[{ 0, 1 }]])
    LogicalProject(V=[$0], K=[1])
      LogicalAggregate(group=[{}], V=[SUM($0)]) 
         LogicalUnion(all=[false])
          LogicalValues(tuples=[[{ 1 }]])
          LogicalValues(tuples=[[{ 2 }]]) {code}
After trimming, the RelNode is

 

 
{code:java}
LogicalProject(L_COST=[$0])
  LogicalJoin(condition=[=($1, $2)], joinType=[inner])
    LogicalValues(tuples=[[{ 0, 1 }]])
    LogicalProject(K=[1]) 
      LogicalAggregate(group=[{}], DUMMY=[COUNT()]) 
        LogicalUnion(all=[false])
          LogicalValues(tuples=[[{ 1 }]])
          LogicalValues(tuples=[[{ 2 }]]){code}
 

If we convert trimmed RelNode to sql, the sql will be
{code:java}
SELECT *
FROM 
  (VALUES (0, 1)) AS "t" ("V", "K")
INNER JOIN 
  (SELECT 1 AS "K" -- Missing SUM()
   FROM (SELECT *
 FROM (VALUES (1)) AS "t" ("cost")
 UNION
 SELECT *
 FROM (VALUES (2)) AS "t" ("cost")) AS "t2"
  ) AS "t4" 
ON "t"."K" = "t4"."K" {code}
The origin sql only has one row result, but the new sql that be trimmed has two 
row result.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5486) Subquery not support having alias in condition.

2023-01-19 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5486:
---

 Summary: Subquery not support having alias in condition.
 Key: CALCITE-5486
 URL: https://issues.apache.org/jira/browse/CALCITE-5486
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie


{code:java}
sql("select * from emp where sal >\n"
+ " (select avg(sal) as s"
+ "from emp having ^s^ > 0"
+ "  )")
.withConformance(SqlConformanceEnum.LENIENT)
.fails("Column 'S' not found in any table");{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5392) Support Snapshot in RelMdExpressionLineage

2022-11-19 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5392:
---

 Summary: Support Snapshot in RelMdExpressionLineage
 Key: CALCITE-5392
 URL: https://issues.apache.org/jira/browse/CALCITE-5392
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


 For example:
{code:java}
select productid from products_temporal
for system_time as of TIMESTAMP '2022-11-19 00:00:00' 
{code}
The `productid` expression lineage is null because `RelMdExpressionLineage` 
does not implement the corresponding method.

 

We can implement method that for SNAPSHOT like other RelNode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5205) Supports hint option as string and numeric literal

2022-07-09 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5205:
---

 Summary: Supports hint option as string and numeric literal
 Key: CALCITE-5205
 URL: https://issues.apache.org/jira/browse/CALCITE-5205
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


 Spark support hint that options contain string and number:
{code:java}
SELECT /*+ REPARTITION(3, c) */ * FROM t {code}
But calcite can't parse it:
{code:java}
Error while parsing SQL: select /*+ repartition(3, empno) */ empno, ename, 
deptno from emps
java.lang.RuntimeException: Error while parsing SQL: select /*+ repartition(3, 
empno) */ empno, ename, deptno from emps {code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5164) Planner#parser can't parse TIMESTAMP() function

2022-05-20 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5164:
---

 Summary: Planner#parser can't parse TIMESTAMP() function
 Key: CALCITE-5164
 URL: https://issues.apache.org/jira/browse/CALCITE-5164
 Project: Calcite
  Issue Type: Bug
  Components: babel
Affects Versions: 1.30.0
Reporter: Jiajun Xie
Assignee: Jiajun Xie


Both core and babel will parse fail.
{code:java}
  FrameworkConfig coreConfig = Frameworks.newConfigBuilder().build();
  Planner corePlanner = Frameworks.getPlanner(coreConfig);
  corePlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00'");
  // Caused by: org.apache.calcite.sql.parser.babel.ParseException: 
Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8.

  FrameworkConfig babelConfig = Frameworks.newConfigBuilder()
  .parserConfig(SqlParser.Config.DEFAULT.withParserFactory(
  SqlBabelParserImpl.FACTORY))
  .build();
  Planner babelPlanner = Frameworks.getPlanner(babelConfig);
  babelPlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00'");
  // Caused by: org.apache.calcite.sql.parser.babel.ParseException: 
Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8
{code}
Here are some databases that support TIMESTAMP function.
 - MySQL: 
[https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_timestamp]

{code:java}
select timestamp('2022-05-21 08:00:00')
// result
timestamp('2022-05-21 08:00:00') 
2022-05-21 08:00:00
{code}
 - Derby: [https://docs.oracle.com/javadb/10.6.2.1/ref/rreftimestampfunc.html]

Also, here are some databases that not support TIMESTAMP function:
 - Oracle: 
[https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm]

{code:java}
select timestamp('2022-05-21 08:00:00')
//ORA-00923: FROM keyword not found where expected
{code}
 - SQL Server: 
[https://docs.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?view=sql-server-ver15]

{code:java}
select timestamp('2022-05-21 08:00:00')
// Msg 195 Level 15 State 10 Line 1
// 'timestamp' is not a recognized built-in function name.{code}
Is it necessary for us to support it in babel module?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (CALCITE-5141) Implicit type conversion incomplete for insert values

2022-05-08 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5141:
---

 Summary: Implicit type conversion incomplete for insert values
 Key: CALCITE-5141
 URL: https://issues.apache.org/jira/browse/CALCITE-5141
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Jiajun Xie
Assignee: Jiajun Xie


- Target table schema
{code:java}
// Register "T1" table.
final MockTable t1 =
MockTable.create(this, tSchema, "T1", false, 7.0, null);
t1.addColumn("t1_varchar20", f.varchar20Type, true);
t1.addColumn("t1_smallint", f.smallintType);
t1.addColumn("t1_int", f.intType);
t1.addColumn("t1_bigint", f.bigintType);
t1.addColumn("t1_float", f.floatType);
t1.addColumn("t1_double", f.doubleType);
t1.addColumn("t1_decimal", f.decimalType);
t1.addColumn("t1_timestamp", f.timestampType);
t1.addColumn("t1_date", f.dateType);
t1.addColumn("t1_binary", f.binaryType);
t1.addColumn("t1_boolean", f.booleanType);
registerTable(t1);
{code}

 - Insert query
{code:java}
insert into t1 values 
('a', 1, 1.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', 
x'0A', false), 
('b', 2,  2, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', 
x'0A', false), 
('c', CAST(3 AS SMALLINT),  3.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', 
date '2021-11-28', x'0A', false), 
('d', 4, 4.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', 
x'0A', false), 
('e', 5, 5.0, 0, 0, 0, 0, TIMESTAMP '2021-11-28 00:00:00', date '2021-11-28', 
x'0A', false)
{code}

 - Incorrect converted plan: the data type is not matched with table schema
{code:java}
LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
flattened=[false])
  LogicalValues(tuples=[[
  { 'a', 1, 1, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
  { 'b', 2, 2, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
  { 'c', 3, 3.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
  { 'd', 4, 4.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
  { 'e', 5, 5.0, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }]])
{code}

 - Correct converted plan

{code:java}
LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
flattened=[false])
  LogicalValues(tuples=[[
{ 'a', 1, 1, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
 { 'b', 2, 2, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
 { 'c', 3, 3, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false },
 { 'd', 4, 4, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }, 
{ 'e', 5, 5, 0, 0, 0, 0, 2021-11-28 00:00:00, 2021-11-28, X'0a', false }]])
{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (CALCITE-5091) RelMdRowCount can return more accurate rowCount when fetch is deterministic and offset is dynamic

2022-04-11 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5091:
---

 Summary: RelMdRowCount can return more accurate rowCount when 
fetch is deterministic and offset is dynamic
 Key: CALCITE-5091
 URL: https://issues.apache.org/jira/browse/CALCITE-5091
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.30.0
Reporter: Jiajun Xie
Assignee: Jiajun Xie


The sql from https://issues.apache.org/jira/browse/CALCITE-5048
{code:java}
select * from emp order by ename limit 1 offset ?
{code}

*RelMdRowCount#getRowCount* will return 14d, because it always return rowCount 
when offset is dynamic.
{code:java}
if (rel.offset instanceof RexDynamicParam) {
  return rowCount;
}
{code}

Obviously, rowCount will not exceed 1 because* LIMIT 1*. We can improve it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (CALCITE-5013) Parentheses should be reserved when the operand of SetOp has limit or offset

2022-02-19 Thread Jiajun Xie (Jira)
Jiajun Xie created CALCITE-5013:
---

 Summary: Parentheses should be reserved when the operand of SetOp 
has limit or offset
 Key: CALCITE-5013
 URL: https://issues.apache.org/jira/browse/CALCITE-5013
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Jiajun Xie


- In standard SQL, the operand of union should not have limit  or order by.So 
parse will fail in SqlParserTetest#testLimitUnion and 
SqlParserTetest#OrderUnion.

 - When users use parentheses, most engines allow it. See the 
discussion:[CALCITE-1892|https://issues.apache.org/jira/browse/CALCITE-1892].   



--
This message was sent by Atlassian Jira
(v8.20.1#820001)