[ 
https://issues.apache.org/jira/browse/DRILL-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469941#comment-17469941
 ] 

Vova Vysotskyi commented on DRILL-6193:
---------------------------------------

[~dzamo], it was a bug, but I'm not sure whether it is still reproducible. 
Nested loop join (NLJ) is used when the join condition cannot be handled by a 
hash or merge join. One such condition is the {{true}} literal. But, as in 
this case, Drill can hit issues where the planner picks a nested loop join 
instead of a highly performant hash join, and users then observe bad 
performance because of that.
So disabling it by default helps to discover such issues, or warns users that 
the query they are attempting to submit will use NLJ, with all the consequences.
But please note that NLJ is prohibited only when neither join input is known 
to have a single record (to avoid multiplying results), or when the planner 
does not have enough info to detect that.
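
As a rough sketch of how to check this for a given query: assuming the option 
meant above is {{planner.enable_nljoin_for_scalar_only}} (the comment does not 
name it), one could allow NLJ for non-scalar inputs in the current session and 
look at the plan for the query from the description:

{code:sql}
-- Assumption: this is the option that limits NLJ to joins where one input
-- is known to be a single record (scalar). false = allow NLJ for any inputs.
ALTER SESSION SET `planner.enable_nljoin_for_scalar_only` = false;

-- Inspect the plan instead of running the query. A nested loop join in the
-- output would suggest the equality condition was simplified away.
EXPLAIN PLAN FOR
SELECT L.L_QUANTITY, L.L_DISCOUNT, L.L_EXTENDEDPRICE, L.L_TAX
FROM cp.`tpch/lineitem.parquet` L, cp.`tpch/orders.parquet` O
WHERE cast(L.L_ORDERKEY as int) = cast(O.O_ORDERKEY as int)
  AND cast(L.L_LINENUMBER as int) = 7
  AND cast(L.L_ORDERKEY as int) = 10208
  AND cast(O.O_ORDERKEY as int) = 10208;

-- Restore the restrictive setting afterwards.
ALTER SESSION SET `planner.enable_nljoin_for_scalar_only` = true;
{code}

Whether the plan actually contains an NLJ depends on the Drill and Calcite 
versions in use, so treat this as a way to investigate, not as a fix.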

> Latest Calcite optimized out join condition and cause "This query cannot be 
> planned possibly due to either a cartesian join or an inequality join"
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6193
>                 URL: https://issues.apache.org/jira/browse/DRILL-6193
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.13.0
>            Reporter: Chunhui Shi
>            Assignee: Hanumath Rao Maduri
>            Priority: Critical
>
> I got the same error on apache master's MapR profile on the tip (before the 
> Hive upgrade) and on changeset 9e944c97ee6f6c0d1705f09d531af35deed2e310, the 
> last commit of the Calcite upgrade, with the failed query reported in the 
> functional test, but now it is on a parquet file:
>  
> {quote}SELECT L.L_QUANTITY, L.L_DISCOUNT, L.L_EXTENDEDPRICE, L.L_TAX
> FROM cp.`tpch/lineitem.parquet` L, cp.`tpch/orders.parquet` O
> WHERE cast(L.L_ORDERKEY as int) = cast(O.O_ORDERKEY as int)
>   AND cast(L.L_LINENUMBER as int) = 7
>   AND cast(L.L_ORDERKEY as int) = 10208
>   AND cast(O.O_ORDERKEY as int) = 10208;
> {quote}
> However, with Drill built on commit ef0fafea214e866556fa39c902685d48a56001e1, 
> the commit right before the Calcite upgrade commits, the same query worked.
> This is caused by the latest Calcite simplifying the predicates: during this 
> process, "cast(L.L_ORDERKEY as int) = cast(O.O_ORDERKEY as int)" was 
> considered redundant and removed, so the logical plan of this query gets an 
> always-true condition for the join:
> {quote}DrillJoinRel(condition=[true], joinType=[inner])
> {quote}
> While in the previous version we had
> {quote}DrillJoinRel(condition=[=($5, $0)], joinType=[inner])
> {quote}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)