-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26706/
-----------------------------------------------------------
(Updated Oct. 20, 2014, 9:10 p.m.)


Review request for hive and Xuefu Zhang.


Changes
-------

Changed the test file name and comments; also rebased onto the latest update.


Bugs: HIVE-8436
    https://issues.apache.org/jira/browse/HIVE-8436


Repository: hive-git


Description
-------

Based on the design doc, we need to split the operator tree of a work in SparkWork if the work is connected to multiple child works. The splitting is performed by cloning the original work and removing unwanted branches from the operator tree. Please refer to the design doc for details.

This process should be done right before we generate SparkPlan. We should have a utility method that takes the original SparkWork and returns a modified SparkWork.

This process should also keep the information about the original work and its clones. Such information will be needed during SparkPlan generation (HIVE-8437).


Diffs (updated)
-----

  itests/src/test/resources/testconfiguration.properties 558dd02 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7d9feac 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java c956101 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 5153885 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapInput.java 3fd37a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 126cb9f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 3773dcb 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java d7744e9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 280edde 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java ac94ea0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 644c681 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java 1d01040 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java 93940bc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 20eb344 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java a62643a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 05be1f1 
  ql/src/test/queries/clientpositive/multi_insert_mixed.q PRE-CREATION 
  ql/src/test/results/clientpositive/multi_insert_mixed.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/groupby7_map.q.out 310f2fe 
  ql/src/test/results/clientpositive/spark/groupby7_map_skew.q.out e6054c9 
  ql/src/test/results/clientpositive/spark/groupby7_noskew.q.out d0f3e76 
  ql/src/test/results/clientpositive/spark/groupby_cube1.q.out d40c7bb 
  ql/src/test/results/clientpositive/spark/groupby_multi_single_reducer.q.out b4ded62 
  ql/src/test/results/clientpositive/spark/groupby_position.q.out d2529bb 
  ql/src/test/results/clientpositive/spark/groupby_rollup1.q.out 7fa6130 
  ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 4a4070b 
  ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 62c179e 
  ql/src/test/results/clientpositive/spark/input12.q.out a4b7a3c 
  ql/src/test/results/clientpositive/spark/input13.q.out 5c799dc 
  ql/src/test/results/clientpositive/spark/input1_limit.q.out 1105ed8 
  ql/src/test/results/clientpositive/spark/input_part2.q.out 514f54a 
  ql/src/test/results/clientpositive/spark/insert1.q.out 1b88026 
  ql/src/test/results/clientpositive/spark/insert_into3.q.out 5b2aa78 
  ql/src/test/results/clientpositive/spark/load_dyn_part1.q.out cbf7204 
  ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out 3905d84 
  ql/src/test/results/clientpositive/spark/multi_insert.q.out 0404119 
  ql/src/test/results/clientpositive/spark/multi_insert_gby3.q.out 903e966 
  ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 730fb4f 
  ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/multi_insert_move_tasks_share_dependencies.q.out 1f31f56 
  ql/src/test/results/clientpositive/spark/multigroupby_singlemr.q.out 4ded9d2 
  ql/src/test/results/clientpositive/spark/ppd_multi_insert.q.out 2b63321 
  ql/src/test/results/clientpositive/spark/ppd_transform.q.out 16bfac1 
  ql/src/test/results/clientpositive/spark/subquery_multiinsert.q.out 05d719a 
  ql/src/test/results/clientpositive/spark/union18.q.out ce3e20c 
  ql/src/test/results/clientpositive/spark/union19.q.out ac28e36 
  ql/src/test/results/clientpositive/spark/union_remove_6.q.out 1836150 
  ql/src/test/results/clientpositive/spark/vectorized_ptf.q.out 179edd1 

Diff: https://reviews.apache.org/r/26706/diff/


Testing
-------

All multi-insertion related results were regenerated and manually checked against the old results.

I also created a new test, "spark_multi_insert_spill_work.q", to check that splitting won't generate duplicate FSs.


Thanks,

Chao Sun
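
For illustration only, the clone-and-prune splitting outlined in the Description can be sketched in Java roughly as follows. This is not the code in the patch: `WorkSplitSketch`, `Op`, `split`, and `cloneKeeping` are hypothetical names, and the toy tree stands in for Hive's real operator classes. The sketch assumes one clone is produced per leaf of the operator tree, with each clone keeping only the branch that leads to that leaf.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkSplitSketch {

    /** Hypothetical stand-in for one operator in a work's operator tree. */
    static final class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name, Op... kids) {
            this.name = name;
            for (Op k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            return isLeaf() ? name : name + children;
        }
    }

    /** Collects the leaves of the tree, left to right. */
    static void collectLeaves(Op op, List<Op> out) {
        if (op.isLeaf()) {
            out.add(op);
            return;
        }
        for (Op c : op.children) {
            collectLeaves(c, out);
        }
    }

    /**
     * Deep-copies the tree rooted at op, keeping only the branch that
     * leads to the leaf `keep`. Returns null if no such branch exists.
     */
    static Op cloneKeeping(Op op, Op keep) {
        if (op.isLeaf()) {
            return op == keep ? new Op(op.name) : null;
        }
        Op copy = new Op(op.name);
        for (Op c : op.children) {
            Op kept = cloneKeeping(c, keep);
            if (kept != null) {
                copy.children.add(kept);
            }
        }
        return copy.children.isEmpty() ? null : copy;
    }

    /** Returns one pruned clone per leaf; the original tree is not modified. */
    static List<Op> split(Op root) {
        List<Op> leaves = new ArrayList<>();
        collectLeaves(root, leaves);
        List<Op> clones = new ArrayList<>();
        for (Op leaf : leaves) {
            clones.add(cloneKeeping(root, leaf));
        }
        return clones;
    }

    public static void main(String[] args) {
        // A table scan feeding two branches, each ending in its own sink.
        Op root = new Op("TS",
                new Op("SEL", new Op("RS_1")),
                new Op("FIL", new Op("RS_2")));
        for (Op clone : split(root)) {
            System.out.println(clone);
        }
    }
}
```

Running `main` prints `TS[SEL[RS_1]]` and then `TS[FIL[RS_2]]`: two clones of the same tree, each with the other branch removed. A fuller version would also record which original each clone came from, since the Description notes that this mapping is needed later during SparkPlan generation.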