[ https://issues.apache.org/jira/browse/DRILL-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacques Nadeau updated DRILL-886:
---------------------------------
Assignee: Suresh Ollala (was: Aman Sinha)
> Wrong results for a query with Right Outer Join on the second (and
> subsequent) executions
> -----------------------------------------------------------------------------------------
>
> Key: DRILL-886
> URL: https://issues.apache.org/jira/browse/DRILL-886
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Aman Sinha
> Assignee: Suresh Ollala
> Priority: Critical
> Fix For: 0.5.0
>
>
> The following query with a right outer join produces correct results on the
> first execution in a session but wrong results on the second and subsequent
> executions: every r_regionkey value comes back as null. A potential cause can
> be seen from the two EXPLAIN plans: in the bad run, the scan of the nation
> table shows columns=null instead of projecting `n_regionkey`, and the
> HashJoin condition references $2 instead of $3.
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from
> cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on
> n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0 | 0 |
> | 0 | 0 |
> | 0 | 0 |
> | 0 | 0 |
> | 0 | 0 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 2 | 2 |
> | 2 | 2 |
> | 2 | 2 |
> | 2 | 2 |
> | 2 | 2 |
> | 3 | 3 |
> | 3 | 3 |
> | 3 | 3 |
> | 3 | 3 |
> | 3 | 3 |
> | 4 | 4 |
> | 4 | 4 |
> | 4 | 4 |
> | 4 | 4 |
> | 4 | 4 |
> +-------------+-------------+
> 25 rows selected (2.207 seconds)
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from
> cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on
> n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0 | null |
> | 1 | null |
> | 1 | null |
> | 1 | null |
> | 4 | null |
> | 0 | null |
> | 3 | null |
> | 3 | null |
> | 2 | null |
> | 2 | null |
> | 4 | null |
> | 4 | null |
> | 2 | null |
> | 4 | null |
> | 0 | null |
> | 0 | null |
> | 0 | null |
> | 1 | null |
> | 2 | null |
> | 3 | null |
> | 4 | null |
> | 2 | null |
> | 3 | null |
> | 3 | null |
> | 1 | null |
> +-------------+-------------+
> 25 rows selected (0.514 seconds)
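As a sanity check on what the query should return, here is a minimal sketch of right outer join semantics in plain Python. It is not Drill code. The key lists are assumptions reconstructed from the first result set above (five region keys 0 through 4, and 25 nation rows with five rows per region key), and the function name `right_outer_join` is hypothetical.

```python
from collections import defaultdict


def right_outer_join(left_keys, right_keys):
    """Join on equal keys, keeping every right-side row.

    Returns (right_key, left_key) pairs; left_key is None when the
    right-side row has no match on the left side.
    """
    # Build a lookup of left-side rows by join key.
    index = defaultdict(list)
    for key in left_keys:
        index[key].append(key)

    joined = []
    for right_key in right_keys:
        matches = index.get(right_key)
        if matches:
            joined.extend((right_key, left_key) for left_key in matches)
        else:
            # Only an unmatched right-side row gets a null left side.
            joined.append((right_key, None))
    return joined


# Assumed inputs, reconstructed from the first (correct) result set.
region_keys = [0, 1, 2, 3, 4]                                # r_regionkey
nation_regionkeys = [k for k in range(5) for _ in range(5)]  # n_regionkey

rows = right_outer_join(region_keys, nation_regionkeys)
unmatched = [row for row in rows if row[1] is None]
```

Under these assumptions every n_regionkey has a matching r_regionkey, so `unmatched` is empty. A result with null in every r_regionkey row, as in the second execution, is therefore not a valid answer for this data.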
> EXPLAIN plan for the good run:
> | 00-00 Screen
> 00-01 Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02 Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03 HashJoin(condition=[=($3, $1)], joinType=[right])
> 00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04 Project(*0=[$0], n_regionkey=[$1])
> 00-06 BroadcastExchange
> 01-01 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])
> EXPLAIN plan for the bad run:
> | 00-00 Screen
> 00-01 Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02 Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03 HashJoin(condition=[=($2, $1)], joinType=[right])
> 00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04 Project(*0=[$0], n_regionkey=[$1])
> 00-06 BroadcastExchange
> 01-01 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=null]])
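The two plans are easier to compare operator by operator than by eye. The following is a rough sketch in Python that keys each plan line by its operator ID and reports the IDs whose text differs. The plan text is copied from the two plans above, with the leading sqlline table border omitted; the helper names `parse_plan` and `diff_plans` are hypothetical and not part of Drill.

```python
import re

GOOD_PLAN = """
00-00 Screen
00-01 Project(n_regionkey=[$0], r_regionkey=[$1])
00-02 Project(n_regionkey=[$3], r_regionkey=[$1])
00-03 HashJoin(condition=[=($3, $1)], joinType=[right])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet,
columns=[SchemaPath [`r_regionkey`]]]])
00-04 Project(*0=[$0], n_regionkey=[$1])
00-06 BroadcastExchange
01-01 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=/tpch/nation.parquet]],
selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])
"""

BAD_PLAN = """
00-00 Screen
00-01 Project(n_regionkey=[$0], r_regionkey=[$1])
00-02 Project(n_regionkey=[$3], r_regionkey=[$1])
00-03 HashJoin(condition=[=($2, $1)], joinType=[right])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet,
columns=[SchemaPath [`r_regionkey`]]]])
00-04 Project(*0=[$0], n_regionkey=[$1])
00-06 BroadcastExchange
01-01 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=/tpch/nation.parquet]],
selectionRoot=/tpch/nation.parquet, columns=null]])
"""


def parse_plan(text):
    """Map operator ID (e.g. '00-03') to its text, joining wrapped lines."""
    # Splitting on a captured ID yields [prefix, id, body, id, body, ...].
    parts = re.split(r"^(\d{2}-\d{2})\s+", text, flags=re.MULTILINE)
    return {
        parts[i]: " ".join(parts[i + 1].split())
        for i in range(1, len(parts), 2)
    }


def diff_plans(good_text, bad_text):
    """Return {operator_id: (good, bad)} for operators whose text differs."""
    good, bad = parse_plan(good_text), parse_plan(bad_text)
    return {
        op: (good.get(op), bad.get(op))
        for op in sorted(set(good) | set(bad))
        if good.get(op) != bad.get(op)
    }


differences = diff_plans(GOOD_PLAN, BAD_PLAN)
for op, (good, bad) in differences.items():
    print(op, "good:", good)
    print(op, "bad: ", bad)
```

On these two plans the sketch flags operators 00-03 (the HashJoin condition, `$3` versus `$2`) and 01-01 (the nation scan, `n_regionkey` versus `columns=null`); all other operators are identical.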