> On April 21, 2015, 2:17 p.m., Xuefu Zhang wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java, line > > 120 > > <https://reviews.apache.org/r/33251/diff/2/?file=937278#file937278line120> > > > > 1. For clarity, it might be good to put this in a separate private > > method. > > 2. Does it work if we just synchronize on mapJoinTables[pos]?
1. Have put it in a separate method. 2. Probabl it does work to synchronize on mapJoinTables[pos] since it is not the same object in different tasks. > On April 21, 2015, 2:17 p.m., Xuefu Zhang wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java, > > line 52 > > <https://reviews.apache.org/r/33251/diff/2/?file=937280#file937280line52> > > > > Method naming, see below. Fixed. Changed the method a little so that it doesn't do more than clean up small table caches. > On April 21, 2015, 2:17 p.m., Xuefu Zhang wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line > > 51 > > <https://reviews.apache.org/r/33251/diff/2/?file=937281#file937281line51> > > > > Using thread-local makes me a little nervous, but let's discuss about > > this offline. Per our discussion, we don't use thread-local any more. > On April 21, 2015, 2:17 p.m., Xuefu Zhang wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line > > 60 > > <https://reviews.apache.org/r/33251/diff/2/?file=937281#file937281line60> > > > > The method name suggests no indication of a side effect of setting > > thread local value. We'd better put this outside of this method. > > > > In addition, the method name seems also a little confusing in that it > > suggests cleanup is for sure but in fact it's conditional. Removed the thread local variable usage. As to the method name, it means we clean up the cache a little, it doesn't mean we remove all the cahced contents. I added some javadoc to make it a little clear. - Jimmy ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33251/#review80969 ----------------------------------------------------------- On April 21, 2015, 1:37 a.m., Jimmy Xiang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/33251/ > ----------------------------------------------------------- > > (Updated April 21, 2015, 1:37 a.m.) > > > Review request for hive, Chao Sun, Szehon Ho, and Xuefu Zhang. > > > Bugs: HIVE-10302 > https://issues.apache.org/jira/browse/HIVE-10302 > > > Repository: hive-git > > > Description > ------- > > Cached the small table containter so that mapjoin tasks can use it if the > task is executed on the same Spark executor. > The cache is released right before the next job after the mapjoin job is done. > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java > fe108c4 > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java > 3f240f5 > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java > 97b3471 > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java > 72ab913 > > Diff: https://reviews.apache.org/r/33251/diff/ > > > Testing > ------- > > Ran several queries in live cluster. ptest pending. > > > Thanks, > > Jimmy Xiang > >