> On Nov. 6, 2014, 7:01 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java, line 86
> > <https://reviews.apache.org/r/27687/diff/1/?file=751768#file751768line86>
> >
> > Can we document what are in the tuple, especially what each means?

Sure. Will add a doc.


> On Nov. 6, 2014, 7:01 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java, line 75
> > <https://reviews.apache.org/r/27687/diff/1/?file=751768#file751768line75>
> >
> > I don't feel we need to cache this, as this can change during a user
> > session.

Yes, it will change during a user session. I was thinking of updating the cached value when things change, based on some event callbacks. This info may be needed many times if there are many reducers, so caching it should save us some trips to the Spark master (assuming getExecutorMemoryStatus checks with the master).


> On Nov. 6, 2014, 7:01 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java, line 89
> > <https://reviews.apache.org/r/27687/diff/1/?file=751768#file751768line89>
> >
> > I'm not sure why this needs to be synchronized. Will this method be
> > called by concurrent threads? It doesn't seem to be the case.

Are you saying it won't be called by multiple threads? Can each JVM run only one query at a time in all deployment modes? If so, how come SparkClient.getInstance is synchronized?


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27687/#review60210
-----------------------------------------------------------


On Nov. 6, 2014, 5:25 p.m., Jimmy Xiang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27687/
> -----------------------------------------------------------
>
> (Updated Nov. 6, 2014, 5:25 p.m.)
>
>
> Review request for hive and Xuefu Zhang.
>
>
> Bugs: HIVE-8649
>     https://issues.apache.org/jira/browse/HIVE-8649
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First patch for HIVE-8649, to increase the number of reducers for spark based
> on some info about the spark cluster.
> We need to add a SparkListener to handle cluster status change if such events
> are supported by spark.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkClient.java 5766787
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java 2dbb5a3
>
> Diff: https://reviews.apache.org/r/27687/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Jimmy Xiang
>
>
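
For reference, a minimal sketch of the caching idea discussed in the reply on line 75: keep the cluster info cached, invalidate it from an event callback, and refetch lazily on the next read. The names `ClusterInfoCache` and `onClusterChanged`, and the `Supplier` standing in for the trip to the Spark master, are hypothetical and not part of SparkClient or the Spark API. Wiring the callback to an actual SparkListener is left out, since the thread does not confirm which events Spark supports.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical cache for a value that is expensive to fetch from the cluster master. */
public class ClusterInfoCache<T> {
    private final Supplier<T> fetcher; // stands in for the remote call (assumption)
    private T cached;                  // guarded by "this"
    private boolean valid = false;     // guarded by "this"

    public ClusterInfoCache(Supplier<T> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the cached value, fetching it only when the cache is empty or stale. */
    public synchronized T get() {
        if (!valid) {
            cached = fetcher.get();
            valid = true;
        }
        return cached;
    }

    /** Hypothetical callback: invoke when the cluster reports a status change. */
    public synchronized void onClusterChanged() {
        valid = false;
        cached = null;
    }

    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        // The lambda simulates a remote lookup and counts how often it runs.
        ClusterInfoCache<Integer> cache =
            new ClusterInfoCache<>(() -> { remoteCalls.incrementAndGet(); return 8; });
        cache.get();
        cache.get();               // served from the cache
        cache.onClusterChanged();  // simulated cluster event
        cache.get();               // refetched after invalidation
        System.out.println("remote calls: " + remoteCalls.get()); // prints "remote calls: 2"
    }
}
```

Both methods are synchronized in this sketch because it assumes the callback may arrive on a different thread than the reader. If only one thread ever touches the cache, which is the open question in the line 89 comment, the locking could be dropped.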