> On Dec. 12, 2014, 7:45 p.m., Xuefu Zhang wrote:
> > Patch looks good. One suggestion: we should be able to change the static 
> > methods to non-static, which would further simplify the code.

I agree. Let me change it.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28889/#review64959
-----------------------------------------------------------


On Dec. 11, 2014, 10:36 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28889/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2014, 10:36 p.m.)
> 
> 
> Review request for hive, Szehon Ho and Xuefu Zhang.
> 
> 
> Bugs: HIVE-8911
>     https://issues.apache.org/jira/browse/HIVE-8911
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Basically, the idea is to reuse as much code as possible from MR.
> 
> The issue is that, in MR's MapJoinProcessor, after a join op is converted to 
> a mapjoin op, all the parent ReduceSinkOperators are removed. However, for our 
> Spark branch, we need to preserve those, because they serve as boundaries 
> between BaseWorks, and SparkReduceSinkMapJoinProc triggers upon them.
> 
> Initially I tried to move this part of the logic to SparkMapJoinOptimizer, 
> which happens at a later stage. However, although this works, I'm worried it 
> may have too much effect on the smb join w/ hint, because we would then have 
> to move that part of the logic to SparkMapJoinOptimizer too. In general, I 
> want to minimize the effect on the existing code path.
> 
> This patch makes changes to MapJoinProcessor. I created a separate method, 
> convertMapJoinForSpark, which doesn't remove the ReduceSinkOperators for 
> small tables. The transform method then decides which method to call based 
> on the execution engine.
> 
> I also have to disable several tests related to smb join w/ hints. They can 
> be activated once HIVE-8640 is resolved.
> 
> 
> Diffs
> -----
> 
>   data/conf/spark/hive-site.xml 44eac86 
>   itests/src/test/resources/testconfiguration.properties 2348e06 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 773c827 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java a8a3d86 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkMapJoinProcessor.java PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out f24ae73 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 33e9e8b 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out aaa0151 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 9954b77 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out ad8f0a5 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out aa3e2b6 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin13.q.out 44233f6 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out c4702ef 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 7c31e05 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out a8e892e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out 041ba12 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 54c4be3 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out da9fe1c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 5a5e3f6 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 5ac3f4c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out e4ff965 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out fce5566 
>   ql/src/test/results/clientpositive/spark/join25.q.out 284c97d 
>   ql/src/test/results/clientpositive/spark/join26.q.out e271184 
>   ql/src/test/results/clientpositive/spark/join27.q.out d31f29e 
>   ql/src/test/results/clientpositive/spark/join30.q.out 7fbbcfa 
>   ql/src/test/results/clientpositive/spark/join36.q.out f1317ea 
>   ql/src/test/results/clientpositive/spark/join37.q.out 448e983 
>   ql/src/test/results/clientpositive/spark/join38.q.out 735d7ea 
>   ql/src/test/results/clientpositive/spark/join39.q.out 0734d4b 
>   ql/src/test/results/clientpositive/spark/join40.q.out 60ef13d 
>   ql/src/test/results/clientpositive/spark/join_map_ppr.q.out 59fdb99 
>   ql/src/test/results/clientpositive/spark/mapjoin1.q.out 80e38b9 
>   ql/src/test/results/clientpositive/spark/mapjoin_distinct.q.out dc7241c 
>   ql/src/test/results/clientpositive/spark/mapjoin_filter_on_outerjoin.q.out 3b80437 
>   ql/src/test/results/clientpositive/spark/mapjoin_test_outer.q.out fdf8f24 
>   ql/src/test/results/clientpositive/spark/semijoin.q.out 2b8e04b 
>   ql/src/test/results/clientpositive/spark/skewjoin.q.out 56b78be 
> 
> Diff: https://reviews.apache.org/r/28889/diff/
> 
> 
> Testing
> -------
> 
> bucket_map_join_1.q
> bucket_map_join_2.q
> bucketmapjoin1.q
> bucketmapjoin10.q
> bucketmapjoin11.q
> bucketmapjoin12.q
> bucketmapjoin13.q
> bucketmapjoin2.q
> bucketmapjoin3.q
> bucketmapjoin4.q
> bucketmapjoin5.q
> bucketmapjoin7.q
> bucketmapjoin8.q
> bucketmapjoin9.q
> bucketmapjoin_negative.q
> bucketmapjoin_negative2.q
> column_access_stats.q
> join25.q
> join26.q
> join27.q
> join30.q
> join36.q
> join37.q
> join38.q
> join39.q
> join40.q
> join_empty.q
> join_filters_overlap.q
> join_map_ppr.q
> mapjoin1.q
> mapjoin_distinct.q
> mapjoin_filter_on_outerjoin.q
> mapjoin_hook.q
> mapjoin_test_outer.q
> semijoin.q
> skewjoin.q
> table_access_keys_stats.q
> 
> 
> Thanks,
> 
> Chao Sun
> 
>
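
For anyone following the thread, the engine-based dispatch described in the quoted description could be sketched roughly as below. This is a toy model, not the actual patch: `Op`, `sampleJoin`, `parentNames`, and the `bigTablePos` and `engine` parameters are hypothetical stand-ins rather than Hive's real classes or signatures. It also assumes that the big table's ReduceSink is still bypassed on Spark, which the description does not state explicitly. Only the names `convertMapJoinForSpark` and `transform` come from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapJoinSketch {
    /** Toy operator node; a hypothetical stand-in, not Hive's Operator class. */
    static class Op {
        final String name;
        final boolean isReduceSink;
        final List<Op> parents = new ArrayList<>();

        Op(String name, boolean isReduceSink, Op... parents) {
            this.name = name;
            this.isReduceSink = isReduceSink;
            this.parents.addAll(Arrays.asList(parents));
        }
    }

    /** MR-style conversion: every parent ReduceSink is bypassed. */
    static Op convertMapJoin(Op join) {
        Op mapJoin = new Op("MAPJOIN", false);
        for (Op parent : join.parents) {
            if (parent.isReduceSink) {
                mapJoin.parents.addAll(parent.parents); // drop the ReduceSink
            } else {
                mapJoin.parents.add(parent);
            }
        }
        return mapJoin;
    }

    /**
     * Spark-style conversion: small-table ReduceSinks are kept so they can
     * still mark the boundaries between works. Bypassing the big table's
     * ReduceSink is an assumption of this sketch.
     */
    static Op convertMapJoinForSpark(Op join, int bigTablePos) {
        Op mapJoin = new Op("MAPJOIN", false);
        for (int i = 0; i < join.parents.size(); i++) {
            Op parent = join.parents.get(i);
            if (i == bigTablePos && parent.isReduceSink) {
                mapJoin.parents.addAll(parent.parents);
            } else {
                mapJoin.parents.add(parent); // small-table ReduceSink preserved
            }
        }
        return mapJoin;
    }

    /** Picks the conversion based on the (hypothetical) engine name. */
    static Op transform(Op join, int bigTablePos, String engine) {
        return "spark".equals(engine)
                ? convertMapJoinForSpark(join, bigTablePos)
                : convertMapJoin(join);
    }

    /** Builds a two-way join whose inputs each pass through a ReduceSink. */
    static Op sampleJoin() {
        Op bigRs = new Op("RS-big", true, new Op("TS-big", false));
        Op smallRs = new Op("RS-small", true, new Op("TS-small", false));
        return new Op("JOIN", false, bigRs, smallRs);
    }

    static List<String> parentNames(Op op) {
        List<String> names = new ArrayList<>();
        for (Op parent : op.parents) {
            names.add(parent.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Op join = sampleJoin();
        System.out.println("mr:    " + parentNames(transform(join, 0, "mr")));
        System.out.println("spark: " + parentNames(transform(join, 0, "spark")));
    }
}
```

In this model the MR path leaves the map join reading directly from both table scans, while the Spark path keeps `RS-small` in place for a later rule to act on.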
