-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/
-----------------------------------------------------------
Review request for hive and Brock Noland.
Bugs: HIVE-7815
https://issues.apache.org/jira/browse/HIVE-7815
Repository: hive-git
Description
-------
This is the first part of the reduce-side join work. See HIVE-7384 for the
overall design doc.
This patch inserts a UnionTran after the two join inputs, so the join reuses
the existing union-all code path to run the Spark RDD. I also made the
following changes:
1. Some API cleanup of GraphTran. Connect now adds the child automatically,
so multiple calls are no longer needed.
2. Fix a bug in HiveBaseReduceFunction. HIVE-7652 made the iterator return
false after close if there are more rows, so Spark calls hasNext again and
close thus gets called twice. CommonJoinOperator throws an exception if close
is called more than once, so this patch adds a check there.
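For readers following item 2, here is a minimal sketch of one way a
close-once guard could look. It is not the code in this patch: the classes
StrictCloseable and CloseOnceIterator are hypothetical stand-ins, with
StrictCloseable imitating an operator that, like CommonJoinOperator, rejects
a second close.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an operator pipeline that rejects a second
// close() call, as the description says CommonJoinOperator does.
class StrictCloseable {
    private int closeCalls = 0;

    void close() {
        if (closeCalls > 0) {
            throw new IllegalStateException("close() called more than once");
        }
        closeCalls++;
    }

    int closeCalls() {
        return closeCalls;
    }
}

// Hypothetical iterator wrapper: closes the resource once when the input is
// exhausted and skips the close on any later hasNext() call.
class CloseOnceIterator<T> implements Iterator<T> {
    private final Iterator<T> input;
    private final StrictCloseable resource;
    private boolean closed = false;

    CloseOnceIterator(Iterator<T> input, StrictCloseable resource) {
        this.input = input;
        this.resource = resource;
    }

    @Override
    public boolean hasNext() {
        if (input.hasNext()) {
            return true;
        }
        if (!closed) {
            // Set the flag first so a repeated hasNext() never re-enters close().
            closed = true;
            resource.close();
        }
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return input.next();
    }
}
```

In this sketch the caller may invoke hasNext any number of times after the
input is exhausted; only the first such call reaches close, so the strict
resource never sees a second close.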
Diffs
-----
itests/src/test/resources/testconfiguration.properties 63af01d
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 6568a76
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java d16f1be
ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION
ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION
ql/src/test/results/clientpositive/spark/join_casesensitive.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/24919/diff/
Testing
-------
Added three join tests to the TestSparkCliDriver suite.
Thanks,
Szehon Ho