[ 
https://issues.apache.org/jira/browse/DRILL-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946913#comment-15946913
 ] 

ASF GitHub Bot commented on DRILL-5375:
---------------------------------------

Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/794
  
    
    Thanks for bringing up this point. I have done some investigation and found 
out that implicit casts for nested loop join are already included during 
materialization.
    Join condition is transformed into FunctionCall [1] which is later on 
materialized using ExpressionTreeMaterializer [2]. 
    `ExpressionTreeMaterializer.visitFunctionCall` includes section which 
implicit casts [3].
    Actually these casts are more enhanced that during hash and merge joins.
    For example, during hash and merge joins only casts between numeric types, 
date and timestamp, varchar and varbinary are supported, i.e.
    join by int and varchar columns won't be performed. The following error 
will be returned: `Join only supports implicit casts between 1. Numeric data
     2. Varchar, Varbinary data 3. Date, Timestamp data Left type: INT, Right 
type: VARCHAR. Add explicit casts to avoid this error`. 
    In our case nested loop join will be able to perform join by int and 
varchar columns without adding explicit casts.
    
    [1] 
https://github.com/arina-ielchiieva/drill/blob/71628e70a525d9bd27b4f5f56259dce84c75154d/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java#L95
    [2] 
https://github.com/arina-ielchiieva/drill/blob/71628e70a525d9bd27b4f5f56259dce84c75154d/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java#L265
    [3] 
https://github.com/apache/drill/blob/9411b26ece34ed8b2f498deea5e41f1901eb1013/exec/java-exec/src/main/java/org/apache/drill/exec/expr/ExpressionTreeMaterializer.java#L362


> Nested loop join: return correct result for left join
> -----------------------------------------------------
>
>                 Key: DRILL-5375
>                 URL: https://issues.apache.org/jira/browse/DRILL-5375
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>              Labels: doc-impacting
>
> Mini repro:
> 1. Create 2 Hive tables with data
> {code}
> CREATE TABLE t1 (
>   FYQ varchar(999),
>   dts varchar(999),
>   dte varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> 2016-Q1,2016-06-01,2016-09-30
> 2016-Q2,2016-09-01,2016-12-31
> 2016-Q3,2017-01-01,2017-03-31
> 2016-Q4,2017-04-01,2017-06-30
> CREATE TABLE t2 (
>   who varchar(999),
>   event varchar(999),
>   dt varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> aperson,did somthing,2017-01-06
> aperson,did somthing else,2017-01-12
> aperson,had chrsitmas,2016-12-26
> aperson,went wild,2016-01-01
> {code}
> 2. Impala Query shows correct result
> {code}
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +------------+---------+---------+-------------------+
> | dt         | fyq     | who     | event             |
> +------------+---------+---------+-------------------+
> | 2016-01-01 | NULL    | aperson | went wild         |
> | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas     |
> | 2017-01-06 | 2016-Q3 | aperson | did somthing      |
> | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
> +------------+---------+---------+-------------------+
> {code}
> 3. Drill query shows wrong results:
> {code}
> alter session set planner.enable_nljoin_for_scalar_only=false;
> use hive;
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +-------------+----------+----------+--------------------+
> |     dt      |   fyq    |   who    |       event        |
> +-------------+----------+----------+--------------------+
> | 2016-12-26  | 2016-Q2  | aperson  | had chrsitmas      |
> | 2017-01-06  | 2016-Q3  | aperson  | did somthing       |
> | 2017-01-12  | 2016-Q3  | aperson  | did somthing else  |
> +-------------+----------+----------+--------------------+
> 3 rows selected (2.523 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to