[ https://issues.apache.org/jira/browse/DRILL-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736524#comment-16736524 ]
Abhishek Ravi commented on DRILL-6949:
--------------------------------------

[^23cc1369-0812-63ce-1861-872636571437.sys.drill] is the profile for the failed query.

> Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6949
>                 URL: https://issues.apache.org/jira/browse/DRILL-6949
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>   Affects Versions: 1.15.0
>            Reporter: Abhishek Ravi
>            Priority: Major
>         Attachments: 23cc1369-0812-63ce-1861-872636571437.sys.drill
>
>
> The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates)* on TPC-H SF100 data.
> {code:sql}
> set `exec.hashjoin.enable.runtime_filter` = true;
> set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
> set `planner.enable_broadcast_join` = false;
> select
>   count(*)
> from
>   lineitem l1
> where
>   l1.l_discount IN (
>     select
>       distinct(cast(l2.l_discount as double))
>     from
>       lineitem l2);
> reset `exec.hashjoin.enable.runtime_filter`;
> reset `exec.hashjoin.runtime_filter.max.waiting.time`;
> reset `planner.enable_broadcast_join`;
> {code}
> The subquery contains the *distinct* keyword, so it should not produce duplicate values.
> I suspect that the failure is caused by semijoin, because the query succeeds when semijoin is disabled explicitly.
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)