[ https://issues.apache.org/jira/browse/DRILL-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533690#comment-14533690 ]
Victoria Markman commented on DRILL-2988: ----------------------------------------- There is one more case that is similar to this one, with the difference that we throw and error in this case: Notice, that columns in join condition are not qualified: {code} 0: jdbc:drill:schema=dfs> select * from t1 where a1 > (select avg(a2)*100 from t2 where c1 = c2); Query failed: SYSTEM ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 [991f5544-d44f-4ba2-aa70-783a88a5b16a on atsqa4-133.qa.lab:31010] Error: exception while executing query: Failure while executing query. (state=,code=0) Correlated scalar aggregation without expression: 0: jdbc:drill:schema=dfs> select * from t1 where a1 > (select avg(a2) from t2 where c1 = c2); Query failed: SYSTEM ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join [f238c0c0-efea-4684-8ec3-899fbcc2a075 on atsqa4-133.qa.lab:31010] Error: exception while executing query: Failure while executing query. (state=,code=0) {code} > Correlated exists subquery returns wrong result if join columns in subquery > are not fully qualified > --------------------------------------------------------------------------------------------------- > > Key: DRILL-2988 > URL: https://issues.apache.org/jira/browse/DRILL-2988 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization > Affects Versions: 1.0.0 > Reporter: Victoria Markman > Assignee: Jinfeng Ni > Labels: document_if_not_fixed > > Two parquet tables: > {code} > 0: jdbc:drill:schema=dfs> select * from t1; > +------------+------------+------------+ > | a1 | b1 | c1 | > +------------+------------+------------+ > | 1 | aaaaa | 2015-01-01 | > | 2 | bbbbb | 2015-01-02 | > | 3 | ccccc | 2015-01-03 | > | 4 | null | 2015-01-04 | > | 5 | eeeee | 2015-01-05 | > | 6 | fffff | 2015-01-06 | > | 7 | ggggg | 2015-01-07 | > | null | hhhhh | 2015-01-08 | > | 9 | iiiii | null | > | 10 | jjjjj | 2015-01-10 | > +------------+------------+------------+ > 10 rows selected (0.109 seconds) > 0: jdbc:drill:schema=dfs> select * from t3; > +------------+------------+------------+ > | a3 | b3 | c3 | > +------------+------------+------------+ > | 1 | aaaaa | 2015-01-01 | > | 2 | bbbbb | 2015-01-02 | > | 3 | ccccc | 2015-01-03 | > +------------+------------+------------+ > 3 rows selected (0.106 seconds) > {code} > When column name is not qualified with the table name in the join condition, > we get wrong result: > {code} > 0: jdbc:drill:schema=dfs> select * from t1 where exists (select * from t3 > where a1 = a3); > +------------+------------+------------+ > | a1 | b1 | c1 | > +------------+------------+------------+ > | 1 | aaaaa | 2015-01-01 | > | 2 | bbbbb | 2015-01-02 | > | 3 | ccccc | 2015-01-03 | > | 4 | null | 2015-01-04 | > | 5 | eeeee | 2015-01-05 | > | 6 | fffff | 2015-01-06 | > | 7 | ggggg | 2015-01-07 | > | null | hhhhh | 2015-01-08 | > | 9 | iiiii | null | > | 10 | jjjjj | 2015-01-10 | > +------------+------------+------------+ > 10 rows selected (0.177 seconds) > {code} > Works correctly when column names are fully qualified ... > {code} > 0: jdbc:drill:schema=dfs> select * from t1 where exists (select * from t3 > where t1.a1 = t3.a3); > +------------+------------+------------+ > | a1 | b1 | c1 | > +------------+------------+------------+ > | 1 | aaaaa | 2015-01-01 | > | 2 | bbbbb | 2015-01-02 | > | 3 | ccccc | 2015-01-03 | > +------------+------------+------------+ > 3 rows selected (0.353 seconds) > {code} > Plan for the query with the wrong result: > {code} > 0: jdbc:drill:schema=dfs> explain plan for select * from t1 where exists > (select * from t3 where a1 = a3); > +------------+------------+ > | text | json | > +------------+------------+ > | 00-00 Screen > 00-01 Project(*=[$0]) > 00-02 Project(T124¦¦*=[$0]) > 00-03 NestedLoopJoin(condition=[true], joinType=[inner]) > 00-05 Project(T124¦¦*=[$0]) > 00-07 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], > selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`*`]]]) > 00-04 SelectionVectorRemover > 00-06 Filter(condition=[IS NOT NULL($0)]) > 00-08 StreamAgg(group=[{}], agg#0=[MIN($0)]) > 00-09 Project($f0=[true]) > 00-10 SelectionVectorRemover > 00-11 Filter(condition=[=($0, $1)]) > 00-12 Project(a1=[$1], a3=[$0]) > 00-13 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t3]], > selectionRoot=/drill/testdata/aggregation/t3, numFiles=1, columns=[`a1`, > `a3`]]]) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)