[ https://issues.apache.org/jira/browse/DRILL-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946908#comment-15946908 ]
ASF GitHub Bot commented on DRILL-5375:
---------------------------------------

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/794#discussion_r108640631

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java ---
    @@ -70,27 +70,65 @@
       private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(DrillOptiq.class);

       /**
    -   * Converts a tree of {@link RexNode} operators into a scalar expression in Drill syntax.
    +   * Converts a tree of {@link RexNode} operators into a scalar expression in Drill syntax using one input.
    +   *
    +   * @param context parse context which contains planner settings
    +   * @param input data input
    +   * @param expr expression to be converted
    +   * @return converted expression
        */
       public static LogicalExpression toDrill(DrillParseContext context, RelNode input, RexNode expr) {
    -    final RexToDrill visitor = new RexToDrill(context, input);
    +    return toDrill(context, Lists.newArrayList(input), expr);
    +  }
    +
    +  /**
    +   * Converts a tree of {@link RexNode} operators into a scalar expression in Drill syntax using multiple inputs.
    +   *
    +   * @param context parse context which contains planner settings
    +   * @param inputs multiple data inputs
    +   * @param expr expression to be converted
    +   * @return converted expression
    +   */
    +  public static LogicalExpression toDrill(DrillParseContext context, List<RelNode> inputs, RexNode expr) {
    +    final RexToDrill visitor = new RexToDrill(context, inputs);
         return expr.accept(visitor);
       }

       private static class RexToDrill extends RexVisitorImpl<LogicalExpression> {
    -    private final RelNode input;
    +    private final List<RelNode> inputs;
         private final DrillParseContext context;
    +    private final List<RelDataTypeField> fieldList;

    -    RexToDrill(DrillParseContext context, RelNode input) {
    +    RexToDrill(DrillParseContext context, List<RelNode> inputs) {
           super(true);
           this.context = context;
    -      this.input = input;
    +      this.inputs = inputs;
    +      this.fieldList = Lists.newArrayList();
    +      /*
    +         Fields are enumerated by their presence order in input. Details {@link org.apache.calcite.rex.RexInputRef}.
    +         Thus we can merge field list from several inputs by adding them into the list in order of appearance.
    +         Each field index in the list will match field index in the RexInputRef instance which will allow us
    +         to retrieve field from filed list by index in {@link #visitInputRef(RexInputRef)} method. Example:
    +
    +         Query: select t1.c1, t2.c1. t2.c2 from t1 inner join t2 on t1.c1 between t2.c1 and t2.c2
    +
    +         Input 1: $0
    +         Input 2: $1, $2
    +
    +         Result: $0, $1, $2
    +       */
    +      for (RelNode input : inputs) {
    --- End diff --

    Yes, in `public LogicalExpression visitInputRef(RexInputRef inputRef)` we determine which input a field belongs to. Previously we had only one input, so a simple get operation was enough: `input.getRowType().getFieldList().get(index)`. Now we have two inputs, so we would have to do the get operation on one input and, if the field is not found there, try the second.
    I could either iterate over the two inputs, do the get operation on each, and break the loop once the field is found, or merge the field lists into one and do a simple get operation, `fieldList.get(index)`. For performance reasons, I decided to merge the field lists in the constructor and use the merged list in `public LogicalExpression visitInputRef(RexInputRef inputRef)` rather than iterating over the inputs for each field.


> Nested loop join: return correct result for left join
> -----------------------------------------------------
>
>                 Key: DRILL-5375
>                 URL: https://issues.apache.org/jira/browse/DRILL-5375
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>              Labels: doc-impacting
>
> Mini repro:
> 1. Create 2 Hive tables with data
> {code}
> CREATE TABLE t1 (
>   FYQ varchar(999),
>   dts varchar(999),
>   dte varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> 2016-Q1,2016-06-01,2016-09-30
> 2016-Q2,2016-09-01,2016-12-31
> 2016-Q3,2017-01-01,2017-03-31
> 2016-Q4,2017-04-01,2017-06-30
> CREATE TABLE t2 (
>   who varchar(999),
>   event varchar(999),
>   dt varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> aperson,did somthing,2017-01-06
> aperson,did somthing else,2017-01-12
> aperson,had chrsitmas,2016-12-26
> aperson,went wild,2016-01-01
> {code}
> 2. Impala Query shows correct result
> {code}
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +------------+---------+---------+-------------------+
> | dt         | fyq     | who     | event             |
> +------------+---------+---------+-------------------+
> | 2016-01-01 | NULL    | aperson | went wild         |
> | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas     |
> | 2017-01-06 | 2016-Q3 | aperson | did somthing      |
> | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
> +------------+---------+---------+-------------------+
> {code}
> 3. Drill query shows wrong results:
> {code}
> alter session set planner.enable_nljoin_for_scalar_only=false;
> use hive;
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +-------------+----------+----------+--------------------+
> | dt          | fyq      | who      | event              |
> +-------------+----------+----------+--------------------+
> | 2016-12-26  | 2016-Q2  | aperson  | had chrsitmas      |
> | 2017-01-06  | 2016-Q3  | aperson  | did somthing       |
> | 2017-01-12  | 2016-Q3  | aperson  | did somthing else  |
> +-------------+----------+----------+--------------------+
> 3 rows selected (2.523 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
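
Illustration of the two lookup strategies weighed in the review comment above. This is only a minimal sketch, not the code from the pull request: plain strings stand in for Calcite's `RelDataTypeField`, and the class and method names (`MergedFieldLookup`, `lookupByIteration`, `mergeFieldLists`) are hypothetical. It assumes, as the diff's code comment states, that field indexes run consecutively across the inputs in order of appearance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergedFieldLookup {

  // Option 1 (assumed shape): walk the inputs on every lookup, subtracting
  // each input's field count until the index falls inside one of them.
  public static String lookupByIteration(List<List<String>> inputs, int index) {
    int offset = index;
    for (List<String> fields : inputs) {
      if (offset < fields.size()) {
        return fields.get(offset);
      }
      offset -= fields.size();
    }
    throw new IndexOutOfBoundsException("No field for index " + index);
  }

  // Option 2: merge the field lists once, in order of appearance, so that
  // every later lookup is a single get(index) on the merged list.
  public static List<String> mergeFieldLists(List<List<String>> inputs) {
    List<String> merged = new ArrayList<>();
    for (List<String> fields : inputs) {
      merged.addAll(fields);
    }
    return merged;
  }

  public static void main(String[] args) {
    // Mirrors the example in the diff: input 1 holds $0, input 2 holds $1 and $2.
    List<List<String>> inputs = Arrays.asList(
        Arrays.asList("t1.c1"),
        Arrays.asList("t2.c1", "t2.c2"));

    List<String> merged = mergeFieldLists(inputs);
    for (int i = 0; i < merged.size(); i++) {
      // Both strategies resolve the same field for each index.
      System.out.println("$" + i + " -> " + merged.get(i)
          + " (by iteration: " + lookupByIteration(inputs, i) + ")");
    }
  }
}
```

Both strategies resolve the same field for a given index. Merging does the per-input walk once, at construction time, instead of on every input reference visited.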