-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33251/
-----------------------------------------------------------
(Updated April 22, 2015, 4:36 p.m.)


Review request for hive, Chao Sun, Szehon Ho, and Xuefu Zhang.


Changes
-------

Addressed Xuefu's review comments: removed the thread-local variable, added some Javadoc, and fixed some code clarity issues.

In this patch, we still clean up the cache based on work id, so that we avoid extra memory usage for other works in the same job. Unfortunately, this means that if other works are running in parallel with the mapjoin work, the cache may be released while it could still have been kept for a while.


Bugs: HIVE-10302
    https://issues.apache.org/jira/browse/HIVE-10302


Repository: hive-git


Description
-------

Cached the small table container so that mapjoin tasks can reuse it if they are executed on the same Spark executor. The cache is released right before the next job after the mapjoin job is done.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java fe108c4
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HivePairFlatMapFunction.java 2f137f9
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 3f240f5
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 72ab913

Diff: https://reviews.apache.org/r/33251/diff/


Testing
-------

Ran several queries in live cluster. ptest pending.


Thanks,

Jimmy Xiang
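

Illustration
-------

For readers who have not opened the diff, the idea in the description (keep the loaded small table in memory on the executor and drop it when a different work id shows up) can be sketched roughly as below. This is only a minimal sketch under stated assumptions and is not the code from the patch: the class name SmallTableCacheSketch, the getOrLoad and size methods, and the use of plain strings for the work id and table path are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical executor-wide cache of loaded small tables; NOT the class from the patch. */
final class SmallTableCacheSketch {
    // Work id whose tables are currently cached; null until the first call.
    private static String currentWorkId;
    // Tables cached for currentWorkId, keyed by the small table's location (assumed to be a string).
    private static final Map<String, Object> TABLES = new HashMap<>();

    private SmallTableCacheSketch() {}

    /**
     * Returns the cached table for this work id and path, loading it at most once.
     * A call with a different work id first drops everything cached for the previous one.
     */
    static synchronized Object getOrLoad(String workId, String path, Supplier<Object> loader) {
        if (!workId.equals(currentWorkId)) {
            TABLES.clear();          // release memory held for the previous work
            currentWorkId = workId;
        }
        return TABLES.computeIfAbsent(path, p -> loader.get());
    }

    /** Number of tables currently cached (for inspection only). */
    static synchronized int size() {
        return TABLES.size();
    }
}
```

In this sketch, two tasks of the same work on one executor share a single loaded table, while any call carrying another work id clears the cache. That mirrors the trade-off mentioned under Changes: a work running in parallel with the mapjoin work can cause the table to be released earlier than strictly necessary.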