-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26706/
-----------------------------------------------------------
(Updated Oct. 20, 2014, 10:04 p.m.)
Review request for hive and Xuefu Zhang.
Bugs: HIVE-8436
https://issues.apache.org/jira/browse/HIVE-8436
Repository: hive-git
Description
-------
Based on the design doc, we need to split the operator tree of a work in
SparkWork if the work is connected to multiple child works. The splitting is
performed by cloning the original work and removing the unwanted branches from
the operator tree. Please refer to the design doc for details.
This process should be done right before we generate the SparkPlan. We should
have a utility method that takes the original SparkWork and returns a modified
SparkWork.
This process should also keep the information about the original work and its
clones. Such information will be needed during SparkPlan generation (HIVE-8437).
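To make the clone-and-prune idea concrete, here is a minimal sketch on a toy model. It is not the code in this patch: Op, Work, cloneKeeping, and split are hypothetical names invented for illustration, and they stand in for Hive's real operator and work classes only loosely. The sketch assumes each child work is fed by exactly one leaf operator of the parent work, identified here by name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model only: these are NOT Hive's Operator, BaseWork, or SparkWork classes. */
final class SplitSketch {

  /** A node in a simplified operator tree. */
  static final class Op {
    final String name;
    final List<Op> children = new ArrayList<>();

    Op(String name, Op... kids) {
      this.name = name;
      for (Op kid : kids) {
        children.add(kid);
      }
    }
  }

  /** A simplified work: a name plus the root of its operator tree. */
  static final class Work {
    final String name;
    final Op root;

    Work(String name, Op root) {
      this.name = name;
      this.root = root;
    }
  }

  /**
   * Returns a deep copy of the tree under op that keeps only the paths
   * ending at the leaf named keepLeaf, or null if no such leaf exists.
   * The original tree is never modified.
   */
  static Op cloneKeeping(Op op, String keepLeaf) {
    if (op.children.isEmpty()) {
      return op.name.equals(keepLeaf) ? new Op(op.name) : null;
    }
    Op copy = new Op(op.name);
    for (Op child : op.children) {
      Op kept = cloneKeeping(child, keepLeaf);
      if (kept != null) {
        copy.children.add(kept);
      }
    }
    return copy.children.isEmpty() ? null : copy;
  }

  /**
   * Splits a work into one pruned clone per leaf that feeds a child work.
   * A work with at most one such leaf is returned unchanged. Each clone is
   * recorded in cloneToOriginal so later stages can trace it back.
   */
  static List<Work> split(Work work, List<String> leavesFeedingChildren,
                          Map<Work, Work> cloneToOriginal) {
    List<Work> result = new ArrayList<>();
    if (leavesFeedingChildren.size() <= 1) {
      result.add(work);
      return result;
    }
    for (String leaf : leavesFeedingChildren) {
      Op pruned = cloneKeeping(work.root, leaf);
      if (pruned == null) {
        throw new IllegalArgumentException("No leaf named " + leaf);
      }
      Work clone = new Work(work.name + " (clone for " + leaf + ")", pruned);
      cloneToOriginal.put(clone, work);
      result.add(clone);
    }
    return result;
  }
}
```

For a toy tree TS -> SEL -> {RS1, RS2}, calling split with the leaves RS1 and RS2 yields two clones, each holding a single TS -> SEL -> RS chain, while the original tree keeps both branches. The cloneToOriginal map plays the role of the original-to-clone bookkeeping mentioned above.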
Diffs (updated)
-----
itests/src/test/resources/testconfiguration.properties 558dd02
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7d9feac
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
c956101
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java
5153885
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapInput.java 3fd37a0
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java
126cb9f
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 3773dcb
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java
d7744e9
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 280edde
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java ac94ea0
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 644c681
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java
1d01040
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java
93940bc
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
20eb344
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
a62643a
ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 05be1f1
ql/src/test/queries/clientpositive/multi_insert_mixed.q PRE-CREATION
ql/src/test/results/clientpositive/multi_insert_mixed.q.out PRE-CREATION
ql/src/test/results/clientpositive/spark/groupby7_map.q.out 310f2fe
ql/src/test/results/clientpositive/spark/groupby7_map_skew.q.out e6054c9
ql/src/test/results/clientpositive/spark/groupby7_noskew.q.out d0f3e76
ql/src/test/results/clientpositive/spark/groupby_cube1.q.out d40c7bb
ql/src/test/results/clientpositive/spark/groupby_multi_single_reducer.q.out
b4ded62
ql/src/test/results/clientpositive/spark/groupby_position.q.out d2529bb
ql/src/test/results/clientpositive/spark/groupby_rollup1.q.out 7fa6130
ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 4a4070b
ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 62c179e
ql/src/test/results/clientpositive/spark/input12.q.out a4b7a3c
ql/src/test/results/clientpositive/spark/input13.q.out 5c799dc
ql/src/test/results/clientpositive/spark/input1_limit.q.out 1105ed8
ql/src/test/results/clientpositive/spark/input_part2.q.out 514f54a
ql/src/test/results/clientpositive/spark/insert1.q.out 1b88026
ql/src/test/results/clientpositive/spark/insert_into3.q.out 5b2aa78
ql/src/test/results/clientpositive/spark/load_dyn_part1.q.out cbf7204
ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out 3905d84
ql/src/test/results/clientpositive/spark/multi_insert.q.out 0404119
ql/src/test/results/clientpositive/spark/multi_insert_gby3.q.out 903e966
ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out
730fb4f
ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out
PRE-CREATION
ql/src/test/results/clientpositive/spark/multi_insert_move_tasks_share_dependencies.q.out
1f31f56
ql/src/test/results/clientpositive/spark/multigroupby_singlemr.q.out 4ded9d2
ql/src/test/results/clientpositive/spark/ppd_multi_insert.q.out 2b63321
ql/src/test/results/clientpositive/spark/ppd_transform.q.out 16bfac1
ql/src/test/results/clientpositive/spark/subquery_multiinsert.q.out 05d719a
ql/src/test/results/clientpositive/spark/union18.q.out ce3e20c
ql/src/test/results/clientpositive/spark/union19.q.out ac28e36
ql/src/test/results/clientpositive/spark/union_remove_6.q.out 1836150
ql/src/test/results/clientpositive/spark/vectorized_ptf.q.out 179edd1
Diff: https://reviews.apache.org/r/26706/diff/
Testing
-------
All multi-insertion-related results were regenerated and manually checked
against the old results.
I also created a new test, "spark_multi_insert_spill_work.q", to check that
splitting won't generate duplicate FileSinkOperators (FSs).
Thanks,
Chao Sun