[jira] [Created] (CALCITE-5529) Improve dialect tests

2023-02-14 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-5529:


 Summary: Improve dialect tests
 Key: CALCITE-5529
 URL: https://issues.apache.org/jira/browse/CALCITE-5529
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde


The current dialect tests are quick and robust but not very thorough.  In 
{{RelToSqlConverterTest}} there are about 400 tests (374 SQL statements and 30 
relational expressions) that test translation to 538 SQL strings in target 
dialects (an average of 1.34 dialects per test). None of those SQL strings are 
executed, so we don't know whether they return the correct results or are even 
valid.

Improvements:
 * Define a list of dialects that we wish to test (the "target dialects") and 
provide a means to connect to a database for each of these (typically a driver 
on the class path, a URL, user name and password);
 * In each test, for each target dialect, generate a SQL string. The generation 
should not throw.
 * In each test, the query is first executed against Calcite to create a 
sample. (For any test, this can be overridden to use a  different reference 
dialect. Calcite will use a recording, if present.)
 * In each test, for each target dialect that has an *enabled* connection, 
execute the generated SQL against that connection. If there is a reference 
result for the test,  check that the result equals the reference result.
 * In each test, for each target dialect that has a *recording* connection, 
execute the generated SQL against that connection. Record the outcome (error, 
column names and types, result) in the file. Proceed as for an enabled 
connection. (Check result against reference result, etc.)
 * In each test, for each target dialect that has a *replaying* connection, 
look up the outcome of the generated SQL. If the outcome is present, proceed as 
for an enabled connection. (Check result against reference result, etc.)
 * In each test, for each target dialect that has a *replaying+recording* 
connection, look up the outcome of the generated SQL. If the outcome is 
present, proceed as for an enabled connection. (Check result against reference 
result, etc.) If the outcome is not present, proceed as for a recording 
connection. (Execute the SQL, record the outcome, check against reference 
result, etc.)

With these improvements, the test suite:
 * Remains fast and robust (because recordings are used most of the time);
 * Can be run offline (using recordings);
 * Can be run by contributors who do not necessarily have access to all of the 
supported databases;
 * Has full coverage (runs all tests against all supported dialects);
 * Can re-generate the 'golden' references. (Must be online, have the required 
driver and connection, and takes more time.)

Design:
 * A new {{class DialectTestConfig}} describes the dialects to be tested, their 
names, connections, the connection state (enabled, recording, replaying, 
replaying+recording), and recording file name.
 * A mock JDBC driver mediates recording/replay.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5528) RelOptRulesTest testPushAggregateThroughJoin7 optimized query not equivalent to the original one

2023-02-14 Thread Pinhan Zhao (Jira)
Pinhan Zhao created CALCITE-5528:


 Summary: RelOptRulesTest testPushAggregateThroughJoin7 optimized 
query not equivalent to the original one
 Key: CALCITE-5528
 URL: https://issues.apache.org/jira/browse/CALCITE-5528
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.32.0
Reporter: Pinhan Zhao


[https://github.com/apache/calcite/blob/fb340ece8c5e60ecfd6371950f8dcb665c85a712/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L5046]

After converting the testPushAggregateThroughJoin7's original plan and 
optimized plan to PostgreSQL using
{code:java}
converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n',
 '');
{code}
, we have queries:
{code:sql}
SELECT ANY_VALUE(t0.SAL) FROM EMP INNER JOIN (SELECT SAL FROM EMP GROUP BY SAL) 
AS t0 ON EMP.SAL = t0.SAL{code}
and
{code:sql}
SELECT ANY_VALUE(t4.SAL) FROM (SELECT SAL FROM EMP GROUP BY SAL) AS t3 INNER 
JOIN (SELECT SAL FROM EMP GROUP BY SAL) AS t4 ON t3.SAL = t4.SAL{code}
 

However, they are semantically different with the following counterexample 
being able to distinguish them:
{code:sql}
DEPT--
CREATE TABLE DEPT (
DEPTNO INTEGER PRIMARY KEY,
NAME VARCHAR(20)
);
INSERT INTO DEPT VALUES (0,'0');
INSERT INTO DEPT VALUES (-1,'0');
EMP--
CREATE TABLE EMP (
EMPNO INTEGER PRIMARY KEY,
DEPTNO INTEGER REFERENCES DEPT(DEPTNO),
ENAME VARCHAR(20),
JOB VARCHAR(20),
MGR INTEGER,
HIREDATE DATE,
SAL INTEGER,
COMM INTEGER,
SLACKER BOOLEAN
);
INSERT INTO EMP VALUES (0 , -1 , NULL , NULL , NULL , NULL , 0 , 0 , True);
INSERT INTO EMP VALUES (-1 , 0 , NULL , '0' , NULL , NULL , 0 , 0 , 
False);{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5527) RelOptRulesTest testAnyInProjectNullable optimized query not equivalent to the original one

2023-02-14 Thread Pinhan Zhao (Jira)
Pinhan Zhao created CALCITE-5527:


 Summary: RelOptRulesTest testAnyInProjectNullable optimized query 
not equivalent to the original one
 Key: CALCITE-5527
 URL: https://issues.apache.org/jira/browse/CALCITE-5527
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.32.0
Reporter: Pinhan Zhao


https://github.com/apache/calcite/blob/fb340ece8c5e60ecfd6371950f8dcb665c85a712/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L6114

After converting the testAnyInProjectNullable's original plan and optimized 
plan to PostgreSQL using
{code:java}
converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n',
 '');
{code}
, we have queries:
 
{code:sql}
SELECT DEPTNO, NAME IN (SELECT MGR FROM EMP) FROM DEPT{code}

and

{code:sql}
SELECT DEPT0.DEPTNO, t5.i IS NOT NULL AND t2.c <> 0 OR t2.ck < t2.c AND NULL 
AND t2.c <> 0 AND t5.i IS NULL FROM DEPT AS DEPT0 CROSS JOIN (SELECT COUNT(*) 
AS c, COUNT(MGR) AS ck FROM EMP) AS t2 LEFT JOIN (SELECT EMP1.MGR, t3.i FROM 
EMP AS EMP1, (VALUES (TRUE)) AS t3 (i) GROUP BY EMP1.MGR, t3.i) AS t5 ON 
DEPT0.NAME = t5.MGR{code}

However, they are semantically different with the following counterexample 
being able to distinguish them:
{code:sql}
DEPT--
CREATE TABLE DEPT (
DEPTNO INTEGER PRIMARY KEY,
NAME VARCHAR(20)
);
INSERT INTO DEPT VALUES (0,NULL);
INSERT INTO DEPT VALUES (-1,'2');
EMP--
CREATE TABLE EMP (
EMPNO INTEGER PRIMARY KEY,
DEPTNO INTEGER REFERENCES DEPT(DEPTNO),
ENAME VARCHAR(20),
JOB VARCHAR(20),
MGR VARCHAR(20),
HIREDATE DATE,
SAL INTEGER,
COMM INTEGER,
SLACKER BOOLEAN
);
INSERT INTO EMP VALUES (0 , -1 , '0' , '0' , 1 , '2000-01-01' , 0 , 0 , True);
INSERT INTO EMP VALUES (1 , -1 , '0' , '0' , 0 , NULL , 0 , NULL , False);{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: RexExecutorImpl and RexSubQuery

2023-02-14 Thread Jonathan Sternberg
I took a deeper look today following your recommendation and ran into a
problem when I got into the implementation of the program builder. I think
the problem might be higher up since the error appears to be that it's
attempting to reduce a subquery as a constant. I found this section of
code:
https://github.com/apache/calcite/blob/79879f92abc4f2279496a0e6ca9066d9d24a1457/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java#L1088-L1091
.

It appears to check the arguments to the subquery to see if any of them are
reducible to determine if the subquery itself is reducible. Since the
scalar query doesn't have any operands, it determines that the call is a
reducible constant. But, I'd probably guess that the subquery itself is
never reducible and, if it was desirable to reduce it, then that should be
done through a separate planner rule that expands the subquery into other
nodes. I changed this code to use "analyzeCall(subQuery,
Constancy.IRREDUCIBLE_CONSTANT);" and this seems to prevent the reducer
from attempting to reduce the subquery. Does this sound problematic and
would it be the correct fix for this? If so, I can prepare a PR and submit
it upstream.

--Jonathan Sternberg

On Mon, Feb 13, 2023 at 6:52 PM Julian Hyde  wrote:

> We need to decide whether a RexSubQuery can appear inside a RexProgram.
>
> If yes, then RegisterInputShuttle should be extended to handle it,
> register it, and then it will become a RexLocalRef.
>
> If no, you will need to apply a rule to expand all RexSubQuery instances
> before the phase where you create a RexProgram.
>
> I am inclined to say yes. Or at least give it a try, and see whether you
> can get it to work without too much pain.
>
> Julian
>
>
> > On Feb 13, 2023, at 2:40 PM, Jonathan Sternberg 
> wrote:
> >
> > Hi,
> >
> > I'm attempting to update some code that previously used
> `withExpand(true)`
> > to use `withExpand(false)`. This has resulted in some plans that
> previously
> > used left joins to start using `SCALAR_QUERY`.
> >
> > The code also uses a HepPlanner
> > <
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/plan/hep/HepPlanner.html
> >
> > with
> > the FilterReduceExpressionsRule
> > <
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/rules/ReduceExpressionsRule.FilterReduceExpressionsRule.html#%3Cinit%3E(java.lang.Class,boolean,org.apache.calcite.tools.RelBuilderFactory)
> >.
> > We use the default configuration and add it to the HepPlanner and then
> call
> > "findBestExp".
> >
> > When the planner runs, I get a ClassCastException here:
> >
> https://github.com/apache/calcite/blob/50d124615e0b07f7fbe6107b7c440d9737a00836/core/src/main/java/org/apache/calcite/rex/RexProgramBuilder.java#L303
> > <
> https://github.com/apache/calcite/blob/50d124615e0b07f7fbe6107b7c440d9737a00836/core/src/main/java/org/apache/calcite/rex/RexExecutorImpl.java#L76-L77
> >
> >
> > It seems when the RegisterInputShuttle does its thing, it returns a
> > RexSubQuery instead of a RexLocalRef. Is this a bug or am I doing
> something
> > wrong with this planner rule? I see this section of code for handling a
> > RexCall and I assume that registerInternal returns a RexLocalRef, but
> > visitSubQuery has its own implementation in the RexShuttle which calls
> > visitList on the subquery. Is this a bug or is there another way I should
> > be doing this?
> >
> > Thanks.
> >
> > --Jonathan Sternberg
>
>