[ https://issues.apache.org/jira/browse/HIVE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258741#comment-14258741 ]
Rui Li commented on HIVE-9135:
------------------------------

I'm not sure if this is correct: we clone the JobConf in {{SparkPlanGenerator.cloneJobConf}} and set a different plan path for each BaseWork. These BaseWorks shouldn't be cached, because each task needs to have its own BaseWork. Currently, when we set a different plan path, we just wipe out the original value and rely on Utilities to set a random one for us:
{code}
// Make sure we'll use a different plan path from the original one
HiveConf.setVar(cloned, HiveConf.ConfVars.PLAN, "");
{code}
Maybe we could set our own plan path with some special pre/postfix, so that Utilities can tell which BaseWork should be cached and which should not.

> Cache Map and Reduce works in RSC [Spark Branch]
> ------------------------------------------------
>
>                 Key: HIVE-9135
>                 URL: https://issues.apache.org/jira/browse/HIVE-9135
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Brock Noland
>            Assignee: Jimmy Xiang
>         Attachments: HIVE-9135.1-spark.patch, HIVE-9135.1-spark.patch
>
>
> HIVE-9127 works around the fact that we don't cache Map/Reduce works in
> Spark. However, other input formats such as HiveInputFormat will not benefit
> from that fix. We should investigate how to allow caching on the RSC while
> not on tasks (see HIVE-7431).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
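
For illustration only, the pre/postfix idea from the comment above could look roughly like the sketch below. It is not taken from the patch: the class {{PlanPathTags}}, the {{.nocache}} suffix and both method names are hypothetical, and the sketch deliberately avoids Hive classes so it stands alone. The assumption is that the cloned conf would receive a tagged plan path instead of an empty string, and that the cache lookup would check the tag before caching a BaseWork.

```java
import java.util.UUID;

/** Hypothetical helper; none of these names exist in Hive. */
final class PlanPathTags {
    // Assumed marker. Any suffix that cannot collide with normally generated
    // plan paths would serve the same purpose.
    static final String NO_CACHE_SUFFIX = ".nocache";

    private PlanPathTags() {}

    /** Builds a unique plan path for a cloned conf and tags it as non-cacheable. */
    static String newUncachedPlanPath(String scratchDir) {
        return scratchDir + "/" + UUID.randomUUID() + NO_CACHE_SUFFIX;
    }

    /** The check a cache lookup could run before storing or reusing a BaseWork. */
    static boolean isCacheable(String planPath) {
        return planPath != null
                && !planPath.isEmpty()
                && !planPath.endsWith(NO_CACHE_SUFFIX);
    }
}
```

Under these assumptions, the cloning code would set the plan path to the result of {{newUncachedPlanPath}} rather than to "", and the lookup side would skip the cache whenever {{isCacheable}} returns false. A prefix would work equally well; a suffix is used here only because it leaves the directory part of the path untouched.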