[jira] [Created] (CALCITE-5529) Improve dialect tests
Julian Hyde created CALCITE-5529: Summary: Improve dialect tests Key: CALCITE-5529 URL: https://issues.apache.org/jira/browse/CALCITE-5529 Project: Calcite Issue Type: Bug Reporter: Julian Hyde The current dialect tests are quick and robust but not very thorough. In {{RelToSqlConverterTest}} there are about 400 tests (374 SQL statements and 30 relational expressions) that test translation to 538 SQL strings in target dialects (an average of 1.34 dialects per test). None of those SQL strings are executed, so we don't know whether they return the correct results or are even valid. Improvements: * Define a list of dialects that we wish to test (the "target dialects") and provide a means to connect to a database for each of these (typically a driver on the class path, a URL, user name and password); * In each test, for each target dialect, generate a SQL string. The generation should not throw. * In each test, the query is first executed against Calcite to create a sample. (For any test, this can be overridden to use a different reference dialect. Calcite will use a recording, if present.) * In each test, for each target dialect that has an *enabled* connection, execute the generated SQL against that connection. If there is a reference result for the test, check that the result equals the reference result. * In each test, for each target dialect that has a *recording* connection, execute the generated SQL against that connection. Record the outcome (error, column names and types, result) in the file. Proceed as for an enabled connection. (Check result against reference result, etc.) * In each test, for each target dialect that has a *replaying* connection, look up the outcome of the generated SQL. If the outcome is present, proceed as for an enabled connection. (Check result against reference result, etc.) * In each test, for each target dialect that has a *replaying+recording* connection, look up the outcome of the generated SQL. If the outcome is present, proceed as for an enabled connection. (Check result against reference result, etc.) If the outcome is not present, proceed as for a recording connection. (Execute the SQL, record the outcome, check against reference result, etc.) With these improvements, the test suite: * Remains fast and robust (because recordings are used most of the time); * Can be run offline (using recordings); * Can be run by contributors who do not necessarily have access to all of the supported databases; * Has full coverage (runs all tests against all supported dialects); * Can re-generate the 'golden' references. (Must be online, have the required driver and connection, and takes more time.) Design: * A new {{class DialectTestConfig}} describes the dialects to be tested, their names, connections, the connection state (enabled, recording, replaying, replaying+recording), and recording file name. * A mock JDBC driver mediates recording/replay. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5528) RelOptRulesTest testPushAggregateThroughJoin7 optimized query not equivalent to the original one
Pinhan Zhao created CALCITE-5528: Summary: RelOptRulesTest testPushAggregateThroughJoin7 optimized query not equivalent to the original one Key: CALCITE-5528 URL: https://issues.apache.org/jira/browse/CALCITE-5528 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.32.0 Reporter: Pinhan Zhao [https://github.com/apache/calcite/blob/fb340ece8c5e60ecfd6371950f8dcb665c85a712/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L5046] After converting the testPushAggregateThroughJoin7's original plan and optimized plan to PostgreSQL using {code:java} converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n', ''); {code} , we have queries: {code:sql} SELECT ANY_VALUE(t0.SAL) FROM EMP INNER JOIN (SELECT SAL FROM EMP GROUP BY SAL) AS t0 ON EMP.SAL = t0.SAL{code} and {code:sql} SELECT ANY_VALUE(t4.SAL) FROM (SELECT SAL FROM EMP GROUP BY SAL) AS t3 INNER JOIN (SELECT SAL FROM EMP GROUP BY SAL) AS t4 ON t3.SAL = t4.SAL{code} However, they are semantically different with the following counterexample being able to distinguish them: {code:sql} DEPT-- CREATE TABLE DEPT ( DEPTNO INTEGER PRIMARY KEY, NAME VARCHAR(20) ); INSERT INTO DEPT VALUES (0,'0'); INSERT INTO DEPT VALUES (-1,'0'); EMP-- CREATE TABLE EMP ( EMPNO INTEGER PRIMARY KEY, DEPTNO INTEGER REFERENCES DEPT(DEPTNO), ENAME VARCHAR(20), JOB VARCHAR(20), MGR INTEGER, HIREDATE DATE, SAL INTEGER, COMM INTEGER, SLACKER BOOLEAN ); INSERT INTO EMP VALUES (0 , -1 , NULL , NULL , NULL , NULL , 0 , 0 , True); INSERT INTO EMP VALUES (-1 , 0 , NULL , '0' , NULL , NULL , 0 , 0 , False);{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5527) RelOptRulesTest testAnyInProjectNullable optimized query not equivalent to the original one
Pinhan Zhao created CALCITE-5527: Summary: RelOptRulesTest testAnyInProjectNullable optimized query not equivalent to the original one Key: CALCITE-5527 URL: https://issues.apache.org/jira/browse/CALCITE-5527 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.32.0 Reporter: Pinhan Zhao https://github.com/apache/calcite/blob/fb340ece8c5e60ecfd6371950f8dcb665c85a712/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L6114 After converting the testAnyInProjectNullable's original plan and optimized plan to PostgreSQL using {code:java} converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n', ''); {code} , we have queries: {code:sql} SELECT DEPTNO, NAME IN (SELECT MGR FROM EMP) FROM DEPT{code} and {code:sql} SELECT DEPT0.DEPTNO, t5.i IS NOT NULL AND t2.c <> 0 OR t2.ck < t2.c AND NULL AND t2.c <> 0 AND t5.i IS NULL FROM DEPT AS DEPT0 CROSS JOIN (SELECT COUNT(*) AS c, COUNT(MGR) AS ck FROM EMP) AS t2 LEFT JOIN (SELECT EMP1.MGR, t3.i FROM EMP AS EMP1, (VALUES (TRUE)) AS t3 (i) GROUP BY EMP1.MGR, t3.i) AS t5 ON DEPT0.NAME = t5.MGR{code} However, they are semantically different with the following counterexample being able to distinguish them: {code:sql} DEPT-- CREATE TABLE DEPT ( DEPTNO INTEGER PRIMARY KEY, NAME VARCHAR(20) ); INSERT INTO DEPT VALUES (0,NULL); INSERT INTO DEPT VALUES (-1,'2'); EMP-- CREATE TABLE EMP ( EMPNO INTEGER PRIMARY KEY, DEPTNO INTEGER REFERENCES DEPT(DEPTNO), ENAME VARCHAR(20), JOB VARCHAR(20), MGR VARCHAR(20), HIREDATE DATE, SAL INTEGER, COMM INTEGER, SLACKER BOOLEAN ); INSERT INTO EMP VALUES (0 , -1 , '0' , '0' , 1 , '2000-01-01' , 0 , 0 , True); INSERT INTO EMP VALUES (1 , -1 , '0' , '0' , 0 , NULL , 0 , NULL , False);{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: RexExecutorImpl and RexSubQuery
I took a deeper look today following your recommendation and ran into a problem when I got into the implementation of the program builder. I think the problem might be higher up since the error appears to be that it's attempting to reduce a subquery as a constant. I found this section of code: https://github.com/apache/calcite/blob/79879f92abc4f2279496a0e6ca9066d9d24a1457/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java#L1088-L1091 . It appears to check the arguments to the subquery to see if any of them are reducible to determine if the subquery itself is reducible. Since the scalar query doesn't have any operands, it determines that the call is a reducible constant. But, I'd probably guess that the subquery itself is never reducible and, if it was desirable to reduce it, then that should be done through a separate planner rule that expands the subquery into other nodes. I changed this code to use "analyzeCall(subQuery, Constancy.IRREDUCIBLE_CONSTANT);" and this seems to prevent the reducer from attempting to reduce the subquery. Does this sound problematic and would it be the correct fix for this? If so, I can prepare a PR and submit it upstream. --Jonathan Sternberg On Mon, Feb 13, 2023 at 6:52 PM Julian Hyde wrote: > We need to decide whether a RexSubQuery can appear inside a RexProgram. > > If yes, then RegisterInputShuttle should be extended to handle it, > register it, and then it will become a RexLocalRef. > > If no, you will need to apply a rule to expand all RexSubQuery instances > before the phase where you create a RexProgram. > > I am inclined to say yes. Or at least give it a try, and see whether you > can get it to work without too much pain. > > Julian > > > > On Feb 13, 2023, at 2:40 PM, Jonathan Sternberg > wrote: > > > > Hi, > > > > I'm attempting to update some code that previously used > `withExpand(true)` > > to use `withExpand(false)`. This has resulted in some plans that > previously > > used left joins to start using `SCALAR_QUERY`. > > > > The code also uses a HepPlanner > > < > https://calcite.apache.org/javadocAggregate/org/apache/calcite/plan/hep/HepPlanner.html > > > > with > > the FilterReduceExpressionsRule > > < > https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/rules/ReduceExpressionsRule.FilterReduceExpressionsRule.html#%3Cinit%3E(java.lang.Class,boolean,org.apache.calcite.tools.RelBuilderFactory) > >. > > We use the default configuration and add it to the HepPlanner and then > call > > "findBestExp". > > > > When the planner runs, I get a ClassCastException here: > > > https://github.com/apache/calcite/blob/50d124615e0b07f7fbe6107b7c440d9737a00836/core/src/main/java/org/apache/calcite/rex/RexProgramBuilder.java#L303 > > < > https://github.com/apache/calcite/blob/50d124615e0b07f7fbe6107b7c440d9737a00836/core/src/main/java/org/apache/calcite/rex/RexExecutorImpl.java#L76-L77 > > > > > > It seems when the RegisterInputShuttle does its thing, it returns a > > RexSubQuery instead of a RexLocalRef. Is this a bug or am I doing > something > > wrong with this planner rule? I see this section of code for handling a > > RexCall and I assume that registerInternal returns a RexLocalRef, but > > visitSubQuery has its own implementation in the RexShuttle which calls > > visitList on the subquery. Is this a bug or is there another way I should > > be doing this? > > > > Thanks. > > > > --Jonathan Sternberg > >