[ https://issues.apache.org/jira/browse/DRILL-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Khatua updated DRILL-6949:
--------------------------------
    Fix Version/s: 1.16.0

> Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition 
> the inner data any further" when Semi join is enabled
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6949
>                 URL: https://issues.apache.org/jira/browse/DRILL-6949
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.15.0
>            Reporter: Abhishek Ravi
>            Priority: Major
>             Fix For: 1.16.0
>
>         Attachments: 23cc1240-74ff-a0c0-8cd5-938fc136e4e2.sys.drill, 
> 23cc1369-0812-63ce-1861-872636571437.sys.drill
>
>
> The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: 
> Hash-Join can not partition the inner data any further (probably due to too 
> many join-key duplicates)* on TPC-H SF100 data.
> {code:sql}
> set `exec.hashjoin.enable.runtime_filter` = true;
> set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
> set `planner.enable_broadcast_join` = false;
> select
>   count(*)
> from
>   lineitem l1
> where
>   l1.l_discount IN (
>     select
>       distinct(cast(l2.l_discount as double))
>     from
>       lineitem l2);
> reset `exec.hashjoin.enable.runtime_filter`;
> reset `exec.hashjoin.runtime_filter.max.waiting.time`;
> reset `planner.enable_broadcast_join`;
> {code}
> The subquery contains the *distinct* keyword, so its result should not 
> contain duplicate values. 
> I suspect that the failure is caused by semijoin because the query succeeds 
> when semijoin is disabled explicitly.
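> A minimal sketch of that comparison, assuming the session option that 
> controls semi-join planning is named `planner.enable_semijoin` (the option 
> name is not stated in this report, so verify it against `sys.options` 
> before relying on it):
> {code:sql}
> -- Assumption: `planner.enable_semijoin` is the option that toggles semi join.
> set `planner.enable_semijoin` = false;
> -- Re-run the failing query shown above here, with the same runtime-filter
> -- and broadcast-join settings, and compare the outcome.
> reset `planner.enable_semijoin`;
> {code}
> Resetting the option afterwards restores the session default, in the same 
> way as the reset statements in the reproduction script.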



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
