[ https://issues.apache.org/jira/browse/DRILL-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Hsuan-Yi Chu resolved DRILL-2967.
--------------------------------------
    Resolution: Cannot Reproduce

Incompatible types error reported in a "not in" query with compatible data types
--------------------------------------------------------------------------------

                Key: DRILL-2967
                URL: https://issues.apache.org/jira/browse/DRILL-2967
            Project: Apache Drill
         Issue Type: Bug
         Components: Query Planning & Optimization
   Affects Versions: 0.9.0
           Reporter: Victoria Markman
           Assignee: Sean Hsuan-Yi Chu
            Fix For: 1.2.0

        Attachments: j2_j6_tables.tar, t1.parquet, t2.parquet

Two tables, parquet files (attached in the bug):
{code}
0: jdbc:drill:schema=dfs> select * from t1;
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
| 1          | aaaaa      | 2015-01-01 |
| 2          | bbbbb      | 2015-01-02 |
| 3          | ccccc      | 2015-01-03 |
| 4          | null       | 2015-01-04 |
| 5          | eeeee      | 2015-01-05 |
| 6          | fffff      | 2015-01-06 |
| 7          | ggggg      | 2015-01-07 |
| null       | hhhhh      | 2015-01-08 |
| 9          | iiiii      | null       |
| 10         | jjjjj      | 2015-01-10 |
+------------+------------+------------+
10 rows selected (0.119 seconds)
0: jdbc:drill:schema=dfs> select * from t2;
+------------+------------+------------+
|     a2     |     b2     |     c2     |
+------------+------------+------------+
| 0          | zzz        | 2014-12-31 |
| 1          | aaaaa      | 2015-01-01 |
| 2          | bbbbb      | 2015-01-02 |
| 2          | bbbbb      | 2015-01-02 |
| 2          | bbbbb      | 2015-01-02 |
| 3          | ccccc      | 2015-01-03 |
| 4          | ddddd      | 2015-01-04 |
| 5          | eeeee      | 2015-01-05 |
| 6          | fffff      | 2015-01-06 |
| 7          | ggggg      | 2015-01-07 |
| 7          | ggggg      | 2015-01-07 |
| 8          | hhhhh      | 2015-01-08 |
| 9          | iiiii      | 2015-01-09 |
+------------+------------+------------+
13 rows selected (0.116 seconds)
{code}
Disable hash join and set slice_target = 1:
{code}
alter session set `planner.enable_hashjoin` = false;
alter session set `planner.slice_target` = 1;
{code}
Correct result:
{code}
0: jdbc:drill:schema=dfs>
select * from t1 where b1 not in (select b2 from t2);
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
| 10         | jjjjj      | 2015-01-10 |
+------------+------------+------------+
1 row selected (0.625 seconds)
{code}
Swap the tables and you get an error:
{code}
0: jdbc:drill:schema=dfs> select * from t2 where b2 not in (select b1 from t1);
+------------+------------+------------+
|     a1     |     b1     |     c1     |
+------------+------------+------------+
Query failed: SYSTEM ERROR: Join only supports implicit casts between
1. Numeric data
2. Varchar, Varbinary data
Left type: INT, Right type: VARCHAR. Add explicit casts to avoid this error
Fragment 1:0
[1a83aa50-39aa-452c-91dd-970bf4a8f03d on atsqa4-133.qa.lab:31010]
java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
	at sqlline.SqlLine.print(SqlLine.java:1809)
	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
	at sqlline.SqlLine.dispatch(SqlLine.java:889)
	at sqlline.SqlLine.begin(SqlLine.java:763)
	at sqlline.SqlLine.start(SqlLine.java:498)
	at sqlline.SqlLine.main(SqlLine.java:460)
{code}
Explain plan for the query with an error:
{code}
0: jdbc:drill:schema=dfs> explain plan for select * from t2 where b2 not in (select b1 from t1);
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00 Screen
00-01   Project(*=[$0])
00-02     UnionExchange
01-01       Project(T27¦¦*=[$0])
01-02         SelectionVectorRemover
01-03           Filter(condition=[NOT(CASE(=($2, 0), false, IS NOT NULL($6), true, IS NULL($4), null, <($3, $2), null, false))])
01-04             MergeJoin(condition=[=($4, $5)], joinType=[left])
01-06               SelectionVectorRemover
01-08                 Sort(sort0=[$4], dir0=[ASC])
01-10                   Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$4])
01-12                     HashToRandomExchange(dist0=[[$4]])
02-01                       UnorderedMuxExchange
04-01                         Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$4], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($4))])
04-02                           Project(T27¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$1])
04-03                             NestedLoopJoin(condition=[true], joinType=[inner])
04-05                               Project(T27¦¦*=[$0], b2=[$1])
04-06                                 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t2]], selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`*`]]])
04-04                               BroadcastExchange
06-01                                 StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
06-02                                   UnionExchange
07-01                                     StreamAgg(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
07-02                                       Project(b1=[$0], $f1=[true])
07-03                                         Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
01-05               Project(b1=[$0], $f10=[$1])
01-07                 SelectionVectorRemover
01-09                   Sort(sort0=[$0], dir0=[ASC])
01-11                     HashAgg(group=[{0}], agg#0=[MIN($1)])
01-13                       Project(b1=[$0], $f1=[$1])
01-14                         HashToRandomExchange(dist0=[[$0]])
03-01                           UnorderedMuxExchange
05-01                             Project(b1=[$0], $f1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
05-02                               HashAgg(group=[{0}], agg#0=[MIN($1)])
05-03                                 Project(b1=[$0], $f1=[true])
05-04                                   Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
{code}
Correct result and correct plan with the distributed hash join plan (planner.slice_target = 1):
{code}
alter session set `planner.enable_hashjoin` = true;
alter session set `planner.slice_target` = 1;
{code}
{code}
0: jdbc:drill:schema=dfs> select * from t2 where b2 not in (select b1 from t1);
+------------+------------+------------+
|     a2     |     b2     |     c2     |
+------------+------------+------------+
+------------+------------+------------+
No rows selected (0.458 seconds)
0: jdbc:drill:schema=dfs> explain plan for select * from t2 where b2 not in (select b1 from t1);
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00 Screen
00-01   Project(*=[$0])
00-02     Project(T25¦¦*=[$0])
00-03       SelectionVectorRemover
00-04         Filter(condition=[NOT(CASE(=($2, 0), false, IS NOT NULL($6), true, IS NULL($4), null, <($3, $2), null, false))])
00-05           HashJoin(condition=[=($4, $5)], joinType=[left])
00-07             Project(T25¦¦*=[$0], b2=[$1], $f0=[$2], $f1=[$3], b20=[$1])
00-09               NestedLoopJoin(condition=[true], joinType=[inner])
00-11                 Project(T25¦¦*=[$0], b2=[$1])
00-12                   Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t2]], selectionRoot=/drill/testdata/aggregation/t2, numFiles=1, columns=[`*`]]])
00-10                 BroadcastExchange
01-01                   StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
01-02                     UnionExchange
03-01                       StreamAgg(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
03-02                         Project(b1=[$0], $f1=[true])
03-03                           Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
00-06             Project(b1=[$0], $f10=[$1])
00-08               BroadcastExchange
02-01                 HashAgg(group=[{0}], agg#0=[MIN($1)])
02-02                   Project(b1=[$0], $f1=[$1])
02-03                     HashToRandomExchange(dist0=[[$0]])
04-01                       UnorderedMuxExchange
05-01                         Project(b1=[$0], $f1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
05-02                           HashAgg(group=[{0}], agg#0=[MIN($1)])
05-03                             Project(b1=[$0], $f1=[true])
05-04                               Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`b1`]]])
{code}
The same error occurs with columns of date, time, and timestamp types.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
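Note that the "No rows selected" result for `select * from t2 where b2 not in (select b1 from t1)` is the correct answer, not a symptom of the bug: `t1.b1` contains a null, and standard three-valued NOT IN semantics make the predicate unknown for every row when the subquery yields a null. A minimal sketch using sqlite3 as a stand-in engine (not Drill; the tables are abbreviated versions of the attached data) illustrates why the two directions of the query behave so differently:

```python
# NOT IN semantics with NULLs, demonstrated with sqlite3 (stdlib) as a
# stand-in for Drill -- the same three-valued logic applies in both.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (b1 TEXT)")
conn.execute("CREATE TABLE t2 (b2 TEXT)")
# Abbreviated data: t1.b1 contains a NULL, t2.b2 does not.
conn.executemany("INSERT INTO t1 VALUES (?)",
                 [("aaaaa",), ("hhhhh",), (None,), ("jjjjj",)])
conn.executemany("INSERT INTO t2 VALUES (?)",
                 [("zzz",), ("aaaaa",), ("hhhhh",)])

# NOT IN over a subquery that yields a NULL is never true,
# so this legitimately returns zero rows.
empty = conn.execute(
    "SELECT b2 FROM t2 WHERE b2 NOT IN (SELECT b1 FROM t1)").fetchall()

# Swapped direction: t2.b2 has no NULLs, so values absent from t2 survive.
rows = conn.execute(
    "SELECT b1 FROM t1 WHERE b1 NOT IN (SELECT b2 FROM t2)").fetchall()
print(empty, rows)  # [] [('jjjjj',)]
```

This matches the report: one row (`jjjjj`) in one direction, zero rows in the other; the bug is only that the merge-join plan fails with a type error instead of producing the zero-row result.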