Mate Szalay-Beko created PHOENIX-6232: -----------------------------------------
Summary: Same query fails with IllegalArgumentException if part of a join Key: PHOENIX-6232 URL: https://issues.apache.org/jira/browse/PHOENIX-6232 Project: Phoenix Issue Type: Bug Reporter: Mate Szalay-Beko We were facing an interesting problem when a more complex query (with inner selects in the WHERE clause) succeeds alone, while the same query fails, if it is part of a join. I created a test table / query to reproduce the problem: {code:sql} DROP TABLE IF EXISTS test; CREATE TABLE test ( id INTEGER NOT NULL, test_id INTEGER, lastchanged TIMESTAMP, CONSTRAINT my_pk PRIMARY KEY (id)); UPSERT INTO test VALUES(0, 100, '2000-01-01 00:00:00.0'); UPSERT INTO test VALUES(1, 101, '2000-01-01 00:00:00.0'); UPSERT INTO test VALUES(2, 100, '2011-11-11 11:11:11.0'); {code} *Query 1:* Example query, running fine in itself: {code:sql} SELECT id, test_id, lastchanged FROM test T WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id ) Returns: +----+---------+-----------------------+ | ID | TEST_ID | LASTCHANGED | +----+---------+-----------------------+ | 1 | 101 | 2000-01-01 01:00:00.0 | | 2 | 100 | 2011-11-11 12:11:11.0 | +----+---------+-----------------------+ {code} *Query 2:* Same query fails on the current master branch, when it is part of a larger (implicit) join: {code:sql} SELECT AAA.* FROM ( SELECT id, test_id, lastchanged FROM test T WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id ) ) as AAA, ( SELECT id FROM test ) as BBB WHERE AAA.id = BBB.id; java.lang.IllegalArgumentException at org.apache.phoenix.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:128) at org.apache.phoenix.compile.TupleProjectionCompiler.createProjectedTable(TupleProjectionCompiler.java:66) at org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:663) at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:404) at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:302) at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:249) at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:176) at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:504) at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:467) at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309) at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:298) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:297) at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:290) at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1933) at sqlline.Commands.executeSingleQuery(Commands.java:1054) at sqlline.Commands.execute(Commands.java:1003) at sqlline.Commands.sql(Commands.java:967) at sqlline.SqlLine.dispatch(SqlLine.java:734) at sqlline.SqlLine.begin(SqlLine.java:541) at sqlline.SqlLine.start(SqlLine.java:267) at sqlline.SqlLine.main(SqlLine.java:206) {code} I am not sure what the problem is exactly. My guess is that Phoenix tries to optimize (flatten) an inner-query, which it shouldn't, if we are inside a join (according to the check in the code which throws the exception). The best workaround I found was to define an explicit join in the original query (Query 1), basically change the inner select into a join. This modified query return the same as the original one: *Query 3:* {code:sql} SELECT T.id, T.test_id, T.lastchanged FROM test T LEFT JOIN ( SELECT max(lastchanged) AS max_timestamp, test_id AS max_timestamp_test_id FROM test GROUP BY test_id ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id WHERE T.lastchanged = JOIN_TABLE.max_timestamp Returns: +------+-----------+-----------------------+ | T.ID | T.TEST_ID | T.LASTCHANGED | +------+-----------+-----------------------+ | 1 | 101 | 2000-01-01 01:00:00.0 | | 2 | 100 | 2011-11-11 12:11:11.0 | +------+-----------+-----------------------+ {code} *Query 4:* And the same modified query (query 3) now works inside a join: {code:sql} SELECT AAA.* FROM ( SELECT T.id, T.test_id, T.lastchanged FROM test T LEFT JOIN ( SELECT max(lastchanged) AS max_timestamp, test_id AS max_timestamp_test_id FROM test GROUP BY test_id ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id WHERE T.lastchanged = JOIN_TABLE.max_timestamp ) as AAA, ( SELECT id FROM test ) as BBB WHERE AAA.id = BBB.id; Returns: +------+-----------+-----------------------+ | T.ID | T.TEST_ID | T.LASTCHANGED | +------+-----------+-----------------------+ | 1 | 101 | 2000-01-01 01:00:00.0 | | 2 | 100 | 2011-11-11 12:11:11.0 | +------+-----------+-----------------------+ {code} I think Query 4 worked, as it is forcing Phoenix to drop the idea of optimizing it's inner-query (Query 3). Although, I can be wrong about the root cause... Anyway, I think the bug should be fixed and Query 2 should run without exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)