> On Sept. 20, 2014, 1:03 a.m., Brock Noland wrote:
> > Awesome work!!!! I have a few minor comments that can be addressed in a
> > *follow on* patch.
Thanks, Brock, for the comments! I've attached the updated patch.

> On Sept. 20, 2014, 1:03 a.m., Brock Noland wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java, line 92
> > <https://reviews.apache.org/r/25394/diff/4/?file=698349#file698349line92>
> >
> > It sounds like we'll be creating a multi-insert-specific context? In
> > that context, can we make all the members private?

Yes, I'll do that in the following patch.

- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review54065
-----------------------------------------------------------


On Sept. 20, 2014, 1:33 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25394/
> -----------------------------------------------------------
> 
> (Updated Sept. 20, 2014, 1:33 a.m.)
> 
> 
> Review request for hive, Brock Noland and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7503
>     https://issues.apache.org/jira/browse/HIVE-7503
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For Hive's multi-insert query
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there
> may be an MR job for each insert. When we achieve this with Spark, it would
> be nice if all the inserts could happen concurrently.
> It seems that this functionality isn't available in Spark. To make things
> worse, the source of the insert may be re-computed unless it's staged. Even
> with staging, the inserts will happen sequentially, making performance
> suffer.
> This task is to find out what it takes in Spark to enable this without
> requiring staging of the source and sequential insertion. If this has to be
> solved in Hive, find out an optimal way to do this.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 4211a07 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 695d8b9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 864965e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 76fc290 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 5fcaf64 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/insert1.q.out 49fb1d4 
>   ql/src/test/results/clientpositive/spark/union18.q.out 9a40807 
>   ql/src/test/results/clientpositive/spark/union19.q.out 131591f 
>   ql/src/test/results/clientpositive/spark/union_remove_6.q.out 1bc55f4 
> 
> Diff: https://reviews.apache.org/r/25394/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Chao Sun
> 
>
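For context, the multi-insert pattern that the description above refers to looks roughly like the following HiveQL sketch, based on the LanguageManual DML page linked in the review. The table names `src`, `dest1`, and `dest2` are illustrative placeholders, not tables from the patch; the point is that a single scan of the source feeds multiple INSERT clauses, each of which has traditionally been compiled into its own MR job.

```sql
-- One pass over src feeds two independent inserts.
-- Without concurrent execution (or staging of src), the scan of src
-- may be re-computed once per INSERT clause.
FROM src
INSERT OVERWRITE TABLE dest1
  SELECT key, value WHERE key < 100
INSERT OVERWRITE TABLE dest2
  SELECT key, value WHERE key >= 100;
```

The patch under review restructures the Spark plan compiler (GenSparkProcContext, SparkMultiInsertionProcessor, etc.) so that such queries can be handled without a separate sequential job per insert.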