Re: Review Request 28889: HIVE-8911 - Enable mapjoin hints [Spark Branch]

2014-12-12 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/#review64959
---


Patch looks good. One suggestion: we should be able to make the static 
methods non-static, which would further simplify the code.


ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java
https://reviews.apache.org/r/28889/#comment107797

nit: grandParentOps.get(0) is repeated in the next line. nice to have a var 
for it.
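
To illustrate the nit, here is a minimal sketch. The code at the commented line is not shown in this thread, so `OpNode`, `GrandParentNitSketch`, and both method names are hypothetical stand-ins rather than Hive classes; only the expression `grandParentOps.get(0)` comes from the review comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator node; not Hive's Operator class.
class OpNode {
    final String name;
    final List<OpNode> children = new ArrayList<>();

    OpNode(String name) {
        this.name = name;
    }
}

class GrandParentNitSketch {
    // Before: the same list lookup appears on two consecutive lines.
    static void replaceChildBefore(List<OpNode> grandParentOps, OpNode oldChild, OpNode newChild) {
        grandParentOps.get(0).children.remove(oldChild);
        grandParentOps.get(0).children.add(newChild);
    }

    // After: the lookup is bound once to a named local variable.
    static void replaceChildAfter(List<OpNode> grandParentOps, OpNode oldChild, OpNode newChild) {
        OpNode grandParent = grandParentOps.get(0);
        grandParent.children.remove(oldChild);
        grandParent.children.add(newChild);
    }
}
```

Both methods behave the same; the second one simply names the value that is used twice.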


- Xuefu Zhang


On Dec. 11, 2014, 10:36 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/28889/
 ---
 
 (Updated Dec. 11, 2014, 10:36 p.m.)
 
 
 Review request for hive, Szehon Ho and Xuefu Zhang.
 
 
 Bugs: HIVE-8911
 https://issues.apache.org/jira/browse/HIVE-8911
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Basically, the idea is to reuse as much code as possible from MR.
 
 The issue is that, in MR's MapJoinProcessor, after the join op is converted 
 to a mapjoin op, all the parent ReduceSinkOperators are removed. However, for 
 our Spark branch, we need to preserve those, because they serve as boundaries 
 between BaseWorks, and SparkReduceSinkMapJoinProc triggers upon them.
 
 Initially I tried to move this part of the logic to SparkMapJoinOptimizer, 
 which happens at a later stage. However, although this works, I'm worried it 
 may have too much effect on the smb join w/ hint, because we would then have 
 to move that part of the logic to SparkMapJoinOptimizer too. In general, I 
 want to minimize the effect on the code path.
 
 This patch makes changes to MapJoinProcessor. I created a separate method, 
 convertMapJoinForSpark, which doesn't remove the ReduceSinkOperators for 
 small tables. Then the transform method decides which method to call based 
 on the execution engine.
 
 I also have to disable several tests related to smb join w/ hints. They can 
 be activated once HIVE-8640 is resolved.
 
 
 Diffs
 -
 
   data/conf/spark/hive-site.xml 44eac86 
   itests/src/test/resources/testconfiguration.properties 2348e06 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
 773c827 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java a8a3d86 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out f24ae73 
   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 33e9e8b 
   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out aaa0151 
   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 9954b77 
   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out ad8f0a5 
   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out aa3e2b6 
   ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 44233f6 
   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out c4702ef 
   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 7c31e05 
   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out a8e892e 
   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 041ba12 
   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 54c4be3 
   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out da9fe1c 
   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 5a5e3f6 
   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 
 5ac3f4c 
   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 
 e4ff965 
   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out 
 fce5566 
   ql/src/test/results/clientpositive/spark/join25.q.out 284c97d 
   ql/src/test/results/clientpositive/spark/join26.q.out e271184 
   ql/src/test/results/clientpositive/spark/join27.q.out d31f29e 
   ql/src/test/results/clientpositive/spark/join30.q.out 7fbbcfa 
   ql/src/test/results/clientpositive/spark/join36.q.out f1317ea 
   ql/src/test/results/clientpositive/spark/join37.q.out 448e983 
   ql/src/test/results/clientpositive/spark/join38.q.out 735d7ea 
   ql/src/test/results/clientpositive/spark/join39.q.out 0734d4b 
   ql/src/test/results/clientpositive/spark/join40.q.out 60ef13d 
   ql/src/test/results/clientpositive/spark/join_map_ppr.q.out 59fdb99 
   ql/src/test/results/clientpositive/spark/mapjoin1.q.out 80e38b9 
   ql/src/test/results/clientpositive/spark/mapjoin_distinct.q.out dc7241c 
   ql/src/test/results/clientpositive/spark/mapjoin_filter_on_outerjoin.q.out 
 3b80437 
   ql/src/test/results/clientpositive/spark/mapjoin_test_outer.q.out fdf8f24 
   ql/src/test/results/clientpositive/spark/semijoin.q.out 2b8e04b 
   ql/src/test/results/clientpositive/spark/skewjoin.q.out 56b78be 
 
 Diff: 

Re: Review Request 28889: HIVE-8911 - Enable mapjoin hints [Spark Branch]

2014-12-12 Thread Chao Sun


 On Dec. 12, 2014, 7:45 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java, 
  line 78
  https://reviews.apache.org/r/28889/diff/2/?file=789801#file789801line78
 
  nit: grandParentOps.get(0) is repeated in the next line. nice to have a 
  var for it.

Sure. Will fix.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/#review64959
---



Re: Review Request 28889: HIVE-8911 - Enable mapjoin hints [Spark Branch]

2014-12-12 Thread Chao Sun


 On Dec. 12, 2014, 7:45 p.m., Xuefu Zhang wrote:
  Patch looks good. One suggestion: we should be able to make the static 
  methods non-static, which would further simplify the code.

I agree. Let me change it.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/#review64959
---



Re: Review Request 28889: HIVE-8911 - Enable mapjoin hints [Spark Branch]

2014-12-12 Thread Chao Sun

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/
---

(Updated Dec. 13, 2014, 12:24 a.m.)


Review request for hive, Szehon Ho and Xuefu Zhang.


Changes
---

Made MapJoin#convertMapJoin non-static.


Bugs: HIVE-8911
https://issues.apache.org/jira/browse/HIVE-8911


Repository: hive-git


Description
---

Basically, the idea is to reuse as much code as possible from MR.

The issue is that, in MR's MapJoinProcessor, after the join op is converted 
to a mapjoin op, all the parent ReduceSinkOperators are removed. However, for 
our Spark branch, we need to preserve those, because they serve as boundaries 
between BaseWorks, and SparkReduceSinkMapJoinProc triggers upon them.

Initially I tried to move this part of the logic to SparkMapJoinOptimizer, 
which happens at a later stage. However, although this works, I'm worried it 
may have too much effect on the smb join w/ hint, because we would then have 
to move that part of the logic to SparkMapJoinOptimizer too. In general, I 
want to minimize the effect on the code path.

This patch makes changes to MapJoinProcessor. I created a separate method, 
convertMapJoinForSpark, which doesn't remove the ReduceSinkOperators for 
small tables. Then the transform method decides which method to call based 
on the execution engine.

I also have to disable several tests related to smb join w/ hints. They can be 
activated once HIVE-8640 is resolved.


Diffs (updated)
-

  data/conf/spark/hive-site.xml 44eac86 
  itests/src/test/resources/testconfiguration.properties 2348e06 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java 
c9e8086 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 417d53f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java a8a3d86 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out f24ae73 
  ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 33e9e8b 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out aaa0151 
  ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 9954b77 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out ad8f0a5 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out aa3e2b6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 44233f6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out c4702ef 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 7c31e05 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out a8e892e 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 041ba12 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 54c4be3 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out da9fe1c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 5a5e3f6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 5ac3f4c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 
e4ff965 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out 
fce5566 
  ql/src/test/results/clientpositive/spark/join25.q.out 284c97d 
  ql/src/test/results/clientpositive/spark/join26.q.out e271184 
  ql/src/test/results/clientpositive/spark/join27.q.out d31f29e 
  ql/src/test/results/clientpositive/spark/join30.q.out 7fbbcfa 
  ql/src/test/results/clientpositive/spark/join36.q.out f1317ea 
  ql/src/test/results/clientpositive/spark/join37.q.out 448e983 
  ql/src/test/results/clientpositive/spark/join38.q.out 735d7ea 
  ql/src/test/results/clientpositive/spark/join39.q.out 0734d4b 
  ql/src/test/results/clientpositive/spark/join40.q.out 60ef13d 
  ql/src/test/results/clientpositive/spark/join_map_ppr.q.out 59fdb99 
  ql/src/test/results/clientpositive/spark/mapjoin1.q.out 80e38b9 
  ql/src/test/results/clientpositive/spark/mapjoin_distinct.q.out dc7241c 
  ql/src/test/results/clientpositive/spark/mapjoin_filter_on_outerjoin.q.out 
3b80437 
  ql/src/test/results/clientpositive/spark/mapjoin_test_outer.q.out fdf8f24 
  ql/src/test/results/clientpositive/spark/semijoin.q.out 2b8e04b 
  ql/src/test/results/clientpositive/spark/skewjoin.q.out 37b5ee5 

Diff: https://reviews.apache.org/r/28889/diff/


Testing
---

bucket_map_join_1.q
bucket_map_join_2.q
bucketmapjoin1.q
bucketmapjoin10.q
bucketmapjoin11.q
bucketmapjoin12.q
bucketmapjoin13.q
bucketmapjoin2.q
bucketmapjoin3.q
bucketmapjoin4.q
bucketmapjoin5.q
bucketmapjoin7.q
bucketmapjoin8.q
bucketmapjoin9.q
bucketmapjoin_negative.q
bucketmapjoin_negative2.q
column_access_stats.q
join25.q
join26.q
join27.q
join30.q
join36.q
join37.q
join38.q
join39.q
join40.q
join_empty.q
join_filters_overlap.q
join_map_ppr.q
mapjoin1.q
mapjoin_distinct.q
mapjoin_filter_on_outerjoin.q

Re: Review Request 28889: HIVE-8911 - Enable mapjoin hints [Spark Branch]

2014-12-11 Thread Chao Sun

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/
---

(Updated Dec. 11, 2014, 10:36 p.m.)


Review request for hive, Szehon Ho and Xuefu Zhang.


Changes
---

Updated the patch by adding a SparkMapJoinProcessor, which overrides some of 
the functionality in MapJoinProcessor.

It's a shame that we couldn't override convertMapJoin (it is static), so 
SparkMapJoinProcessor#generateMapJoinOperator ends up being a duplicate of 
MapJoinProcessor#generateMapJoinOperator.
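
As a general Java illustration of why a static helper forces this kind of duplication, here is a minimal sketch. `BaseProcessor`, `SparkProcessor`, and all method names below are hypothetical stand-ins, not the real MapJoinProcessor or SparkMapJoinProcessor code. A static method in a subclass only hides the parent's version, so a shared driver in the base class keeps calling the base helper; an instance method is dispatched on the runtime type.

```java
// Hypothetical stand-ins; these are not Hive's MapJoinProcessor classes.
class BaseProcessor {
    // Static methods are hidden, not overridden: the call target is fixed at compile time.
    static String convertStatic() {
        return "base";
    }

    // Instance methods dispatch on the runtime type of the receiver.
    String convert() {
        return "base";
    }

    // Shared driver that calls the static helper: always runs the base version.
    String generateWithStatic() {
        return "generated:" + convertStatic();
    }

    // Shared driver that calls the instance helper: picks up a subclass override.
    String generate() {
        return "generated:" + convert();
    }
}

class SparkProcessor extends BaseProcessor {
    // This hides BaseProcessor.convertStatic() but does not override it.
    static String convertStatic() {
        return "spark";
    }

    @Override
    String convert() {
        return "spark";
    }
}

class StaticVsOverrideDemo {
    public static void main(String[] args) {
        BaseProcessor p = new SparkProcessor();
        System.out.println(p.generateWithStatic()); // base helper is used
        System.out.println(p.generate());           // subclass helper is used
    }
}
```

With the static helper, the subclass would have to copy the driver method to get its own behavior; with the instance helper, overriding the one method is enough.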


Bugs: HIVE-8911
https://issues.apache.org/jira/browse/HIVE-8911


Repository: hive-git


Description
---

Basically, the idea is to reuse as much code as possible from MR.

The issue is that, in MR's MapJoinProcessor, after the join op is converted 
to a mapjoin op, all the parent ReduceSinkOperators are removed. However, for 
our Spark branch, we need to preserve those, because they serve as boundaries 
between BaseWorks, and SparkReduceSinkMapJoinProc triggers upon them.

Initially I tried to move this part of the logic to SparkMapJoinOptimizer, 
which happens at a later stage. However, although this works, I'm worried it 
may have too much effect on the smb join w/ hint, because we would then have 
to move that part of the logic to SparkMapJoinOptimizer too. In general, I 
want to minimize the effect on the code path.

This patch makes changes to MapJoinProcessor. I created a separate method, 
convertMapJoinForSpark, which doesn't remove the ReduceSinkOperators for 
small tables. Then the transform method decides which method to call based 
on the execution engine.
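
The engine-based dispatch described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patch itself: `ParentOp`, `EngineDispatchSketch`, the string engine names, and the method signatures are all hypothetical, and the assumption that the Spark path drops only the big-table ReduceSink is one reading of "doesn't remove the ReduceSinkOperators for small tables".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a join's parent operators; not Hive's classes.
class ParentOp {
    final String name;
    final boolean reduceSink; // true if this parent is a ReduceSink
    final boolean bigTable;   // true if this parent feeds the big-table side

    ParentOp(String name, boolean reduceSink, boolean bigTable) {
        this.name = name;
        this.reduceSink = reduceSink;
        this.bigTable = bigTable;
    }
}

class EngineDispatchSketch {
    // MR-style conversion: every ReduceSink parent is dropped.
    static List<String> convertMapJoin(List<ParentOp> parents) {
        List<String> kept = new ArrayList<>();
        for (ParentOp p : parents) {
            if (!p.reduceSink) {
                kept.add(p.name);
            }
        }
        return kept;
    }

    // Spark-style conversion (assumption): small-table ReduceSinks are preserved,
    // because they mark the boundaries between works.
    static List<String> convertMapJoinForSpark(List<ParentOp> parents) {
        List<String> kept = new ArrayList<>();
        for (ParentOp p : parents) {
            if (!p.reduceSink || !p.bigTable) {
                kept.add(p.name);
            }
        }
        return kept;
    }

    // transform picks the conversion based on the execution engine name.
    static List<String> transform(List<ParentOp> parents, String engine) {
        return "spark".equals(engine)
                ? convertMapJoinForSpark(parents)
                : convertMapJoin(parents);
    }

    public static void main(String[] args) {
        List<ParentOp> parents = new ArrayList<>();
        parents.add(new ParentOp("RS_big", true, true));
        parents.add(new ParentOp("RS_small", true, false));
        System.out.println(transform(parents, "mr"));
        System.out.println(transform(parents, "spark"));
    }
}
```

In this model the MR path leaves no ReduceSink parents, while the Spark path keeps the small-table one so that a later rule can still trigger on it.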

I also have to disable several tests related to smb join w/ hints. They can be 
activated once HIVE-8640 is resolved.


Diffs (updated)
-

  data/conf/spark/hive-site.xml 44eac86 
  itests/src/test/resources/testconfiguration.properties 2348e06 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 773c827 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java a8a3d86 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out f24ae73 
  ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 33e9e8b 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out aaa0151 
  ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 9954b77 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out ad8f0a5 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out aa3e2b6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 44233f6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out c4702ef 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 7c31e05 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out a8e892e 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 041ba12 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 54c4be3 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out da9fe1c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 5a5e3f6 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 5ac3f4c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 
e4ff965 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out 
fce5566 
  ql/src/test/results/clientpositive/spark/join25.q.out 284c97d 
  ql/src/test/results/clientpositive/spark/join26.q.out e271184 
  ql/src/test/results/clientpositive/spark/join27.q.out d31f29e 
  ql/src/test/results/clientpositive/spark/join30.q.out 7fbbcfa 
  ql/src/test/results/clientpositive/spark/join36.q.out f1317ea 
  ql/src/test/results/clientpositive/spark/join37.q.out 448e983 
  ql/src/test/results/clientpositive/spark/join38.q.out 735d7ea 
  ql/src/test/results/clientpositive/spark/join39.q.out 0734d4b 
  ql/src/test/results/clientpositive/spark/join40.q.out 60ef13d 
  ql/src/test/results/clientpositive/spark/join_map_ppr.q.out 59fdb99 
  ql/src/test/results/clientpositive/spark/mapjoin1.q.out 80e38b9 
  ql/src/test/results/clientpositive/spark/mapjoin_distinct.q.out dc7241c 
  ql/src/test/results/clientpositive/spark/mapjoin_filter_on_outerjoin.q.out 
3b80437 
  ql/src/test/results/clientpositive/spark/mapjoin_test_outer.q.out fdf8f24 
  ql/src/test/results/clientpositive/spark/semijoin.q.out 2b8e04b 
  ql/src/test/results/clientpositive/spark/skewjoin.q.out 56b78be 

Diff: https://reviews.apache.org/r/28889/diff/


Testing
---

bucket_map_join_1.q
bucket_map_join_2.q
bucketmapjoin1.q
bucketmapjoin10.q
bucketmapjoin11.q
bucketmapjoin12.q
bucketmapjoin13.q
bucketmapjoin2.q
bucketmapjoin3.q
bucketmapjoin4.q
bucketmapjoin5.q
bucketmapjoin7.q
bucketmapjoin8.q
bucketmapjoin9.q
bucketmapjoin_negative.q
bucketmapjoin_negative2.q
column_access_stats.q
join25.q
join26.q
join27.q
join30.q