[ https://issues.apache.org/jira/browse/DRILL-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jinfeng Ni updated DRILL-2454: ------------------------------ Assignee: Aman Sinha (was: Jinfeng Ni) > Support for local join filter in outer joins > -------------------------------------------- > > Key: DRILL-2454 > URL: https://issues.apache.org/jira/browse/DRILL-2454 > Project: Apache Drill > Issue Type: Improvement > Components: Query Planning & Optimization > Reporter: Victoria Markman > Assignee: Aman Sinha > Fix For: 1.2.0 > > > Currently in the case of outer join, we can't have local join filter on the > preserving side of the outer join. "Can not plan" exception will be thrown. > Below you will find examples of queries that are going to fail: > -- Join filter on the left side of the left outer join > {code:sql} > select > count(*) > from > cp.`tpch/orders.parquet` o > left outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and o.o_orderkey = 1); > {code} > -- Join filter on the right side of the right outer join > {code:sql} > select > count(*) > from > cp.`tpch/lineitem.parquet` l > right outer join > cp.`tpch/orders.parquet` o > on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1) > where > o.o_orderkey = 1; > {code} > Full outer join with the join filter on the left side: > {code:sql} > select > count(*) > from > cp.`tpch/orders.parquet` o > full outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1) > ; > {code} > Full outer join with the join filter on the right side: > {code:sql} > select > count(*) > from > cp.`tpch/orders.parquet` o > full outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1) > ; > {code} > The only case that will currently work is an inner join: > {code:sql} > select > count(*) > from > cp.`tpch/orders.parquet` o > inner join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1) > ; > {code} > Or outer join with the local join filter on "non-preserving side" of the > outer join: > {code:sql} > select > count(*) > from > cp.`tpch/orders.parquet` o > left outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1) > ; > {code} > As a reminder, queries with filter in the ON clause of the join condition are > semantically different from those with the same condition in the WHERE clause. > This query will return an order and its lines, if any, for order number =100: > {code:sql} > select * > from > cp.`tpch/orders.parquet` o > left outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey) > where o.o_orderkey = 100 > ; > {code} > This query will return ALL orders, but only order 100 will have any lines > associated with it: > {code:sql} > select * > from > cp.`tpch/orders.parquet` o > left outer join > cp.`tpch/lineitem.parquet` l > on (o.o_orderkey = l.l_orderkey and o.o_orderkey = 100) > ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)