-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/
-----------------------------------------------------------
(Updated Dec. 13, 2014, 12:24 a.m.)
Review request for hive, Szehon Ho and Xuefu Zhang.
Changes
-------
Made MapJoin#convertMapJoin non-static.
Bugs: HIVE-8911
https://issues.apache.org/jira/browse/HIVE-8911
Repository: hive-git
Description
-------
The basic idea is to reuse as much of the MR code as possible.
The issue is that, in MR's MapJoinProcessor, after a join op is converted to a
mapjoin op, all of its parent ReduceSinkOperators are removed. However, for our
Spark branch, we need to preserve them, because they serve as boundaries
between BaseWorks, and SparkReduceSinkMapJoinProc is triggered on them.
Initially I tried to move this part of the logic to SparkMapJoinOptimizer, which
runs at a later stage. Although this works, I'm worried it may have too much
effect on the smb join with hints, because we would then have to move that part
of the logic to SparkMapJoinOptimizer too. In general, I want to minimize the
effect on the existing code path.
This patch makes changes to MapJoinProcessor. I created a separate method,
convertMapJoinForSpark, which doesn't remove the ReduceSinkOperators for small
tables. Then the transform method decides which method to call based on the
execution engine.
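To illustrate the difference between the two conversion paths, here is a
minimal, self-contained sketch. It is not the code in the patch: the Op class,
the operator names, the sampleJoin helper, and the "mr"/"spark" engine strings
are all hypothetical stand-ins, and it keeps every parent on the Spark path
for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not Hive's operator classes.
class MapJoinDispatchSketch {

    /** Minimal operator node: a name plus its parent operators. */
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();

        Op(String name, Op... parents) {
            this.name = name;
            for (Op p : parents) {
                this.parents.add(p);
            }
        }
    }

    /** MR-style conversion: ReduceSink ("RS") parents are dropped and bypassed. */
    static Op convertMapJoin(Op join) {
        Op mapJoin = new Op("MAPJOIN");
        for (Op parent : join.parents) {
            if (parent.name.equals("RS")) {
                // Connect the map join directly to whatever fed the RS.
                mapJoin.parents.addAll(parent.parents);
            } else {
                mapJoin.parents.add(parent);
            }
        }
        return mapJoin;
    }

    /** Spark-style conversion: parents, including RS, are kept as boundaries. */
    static Op convertMapJoinForSpark(Op join) {
        Op mapJoin = new Op("MAPJOIN");
        mapJoin.parents.addAll(join.parents);
        return mapJoin;
    }

    /** Picks the conversion based on the (assumed) execution engine name. */
    static Op transform(Op join, String executionEngine) {
        return "spark".equals(executionEngine)
                ? convertMapJoinForSpark(join)
                : convertMapJoin(join);
    }

    /** Builds JOIN <- (RS <- TS_big, RS <- TS_small). */
    static Op sampleJoin() {
        Op bigTable = new Op("RS", new Op("TS_big"));
        Op smallTable = new Op("RS", new Op("TS_small"));
        return new Op("JOIN", bigTable, smallTable);
    }

    public static void main(String[] args) {
        // MR path: the RS is gone, so the first parent is the table scan.
        System.out.println(transform(sampleJoin(), "mr").parents.get(0).name);    // TS_big
        // Spark path: the RS is still the first parent.
        System.out.println(transform(sampleJoin(), "spark").parents.get(0).name); // RS
    }
}
```

Keeping the choice inside transform means the MR path is untouched, which is
the point of adding a separate method rather than changing convertMapJoin.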
I also had to disable several tests related to smb join with hints. They can be
re-enabled once HIVE-8640 is resolved.
Diffs (updated)
-----
data/conf/spark/hive-site.xml 44eac86
itests/src/test/resources/testconfiguration.properties 2348e06
ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java c9e8086
ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 417d53f
ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java a8a3d86
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java PRE-CREATION
ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out f24ae73
ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 33e9e8b
ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out aaa0151
ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 9954b77
ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out ad8f0a5
ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out aa3e2b6
ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 44233f6
ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out c4702ef
ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 7c31e05
ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out a8e892e
ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 041ba12
ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 54c4be3
ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out da9fe1c
ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 5a5e3f6
ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 5ac3f4c
ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out e4ff965
ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out fce5566
ql/src/test/results/clientpositive/spark/join25.q.out 284c97d
ql/src/test/results/clientpositive/spark/join26.q.out e271184
ql/src/test/results/clientpositive/spark/join27.q.out d31f29e
ql/src/test/results/clientpositive/spark/join30.q.out 7fbbcfa
ql/src/test/results/clientpositive/spark/join36.q.out f1317ea
ql/src/test/results/clientpositive/spark/join37.q.out 448e983
ql/src/test/results/clientpositive/spark/join38.q.out 735d7ea
ql/src/test/results/clientpositive/spark/join39.q.out 0734d4b
ql/src/test/results/clientpositive/spark/join40.q.out 60ef13d
ql/src/test/results/clientpositive/spark/join_map_ppr.q.out 59fdb99
ql/src/test/results/clientpositive/spark/mapjoin1.q.out 80e38b9
ql/src/test/results/clientpositive/spark/mapjoin_distinct.q.out dc7241c
ql/src/test/results/clientpositive/spark/mapjoin_filter_on_outerjoin.q.out 3b80437
ql/src/test/results/clientpositive/spark/mapjoin_test_outer.q.out fdf8f24
ql/src/test/results/clientpositive/spark/semijoin.q.out 2b8e04b
ql/src/test/results/clientpositive/spark/skewjoin.q.out 37b5ee5
Diff: https://reviews.apache.org/r/28889/diff/
Testing
-------
bucket_map_join_1.q
bucket_map_join_2.q
bucketmapjoin1.q
bucketmapjoin10.q
bucketmapjoin11.q
bucketmapjoin12.q
bucketmapjoin13.q
bucketmapjoin2.q
bucketmapjoin3.q
bucketmapjoin4.q
bucketmapjoin5.q
bucketmapjoin7.q
bucketmapjoin8.q
bucketmapjoin9.q
bucketmapjoin_negative.q
bucketmapjoin_negative2.q
column_access_stats.q
join25.q
join26.q
join27.q
join30.q
join36.q
join37.q
join38.q
join39.q
join40.q
join_empty.q
join_filters_overlap.q
join_map_ppr.q
mapjoin1.q
mapjoin_distinct.q
mapjoin_filter_on_outerjoin.q
mapjoin_hook.q
mapjoin_test_outer.q
semijoin.q
skewjoin.q
table_access_keys_stats.q
Thanks,
Chao Sun