[ 
https://issues.apache.org/jira/browse/HIVE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258741#comment-14258741
 ] 

Rui Li commented on HIVE-9135:
------------------------------

I'm not sure if this is correct: we clone the JobConf in 
{{SparkPlanGenerator.cloneJobConf}} and set a different plan path for each 
BaseWork. These BaseWorks shouldn't be cached because each task needs to have 
its own BaseWork. Currently, when we set a different plan path, we just wipe 
out the original value and rely on Utilities to set a random one for us:
{code}
    // Make sure we'll use a different plan path from the original one
    HiveConf.setVar(cloned, HiveConf.ConfVars.PLAN, "");
{code}
Maybe we could set our own plan path with a special prefix or suffix so that 
Utilities can tell which BaseWork should be cached and which should not.
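
A rough sketch of the marker idea, in plain Java with no Hive dependencies. 
Everything here is hypothetical: the class {{PlanPathMarker}}, the 
{{nocache-}} prefix, and both method names are made up for illustration and 
do not exist in Hive. The real change would presumably live in 
{{SparkPlanGenerator.cloneJobConf}} and in the cache lookup in Utilities.

```java
import java.util.UUID;

// Hypothetical helper, not part of Hive: tags plan paths whose BaseWork
// must not be cached, so that a cache lookup can skip them.
public class PlanPathMarker {
    // Assumed marker; any string that cannot occur in a normal plan path would do.
    public static final String NO_CACHE_PREFIX = "nocache-";

    // Build a unique plan path for a cloned JobConf, tagged as non-cacheable.
    public static String newNonCacheablePlanPath(String scratchDir) {
        return scratchDir + "/" + NO_CACHE_PREFIX + UUID.randomUUID();
    }

    // True if the BaseWork behind this plan path may be cached.
    public static boolean isCacheable(String planPath) {
        if (planPath == null || planPath.isEmpty()) {
            return false; // no plan path set, nothing to key a cache entry on
        }
        // Only the last path component carries the marker.
        String name = planPath.substring(planPath.lastIndexOf('/') + 1);
        return !name.startsWith(NO_CACHE_PREFIX);
    }

    public static void main(String[] args) {
        String cloned = newNonCacheablePlanPath("/tmp/hive/scratch");
        System.out.println(cloned + " cacheable? " + isCacheable(cloned));
        System.out.println("/tmp/hive/scratch/plan-1 cacheable? "
            + isCacheable("/tmp/hive/scratch/plan-1"));
    }
}
```

With something like this, the cloned JobConf would get the tagged path instead 
of an empty string, and the caching code would check the path before storing 
or returning a BaseWork.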

> Cache Map and Reduce works in RSC [Spark Branch]
> ------------------------------------------------
>
>                 Key: HIVE-9135
>                 URL: https://issues.apache.org/jira/browse/HIVE-9135
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Brock Noland
>            Assignee: Jimmy Xiang
>         Attachments: HIVE-9135.1-spark.patch, HIVE-9135.1-spark.patch
>
>
> HIVE-9127 works around the fact that we don't cache Map/Reduce works in 
> Spark. However, other input formats such as HiveInputFormat will not benefit 
> from that fix. We should investigate how to allow caching on the RSC while 
> not on tasks (see HIVE-7431).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)