[ 
https://issues.apache.org/jira/browse/HIVE-8536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224299#comment-14224299
 ] 

Rui Li commented on HIVE-8536:
------------------------------

It seems the dependency task is created in {{GenSparkProcContext}}, and we 
always add it for the move task. I suspect this is unnecessary. Here's how MR 
decides whether to create the dependency task in {{GenMRProcContext}}:
{code}
public DependencyCollectionTask getDependencyTaskForMultiInsert() {
  if (dependencyTaskForMultiInsert == null) {
    if (conf.getBoolVar(ConfVars.HIVE_MULTI_INSERT_MOVE_TASKS_SHARE_DEPENDENCIES)) {
      dependencyTaskForMultiInsert =
          (DependencyCollectionTask) TaskFactory.get(new DependencyCollectionWork(), conf);
    }
  }
  return dependencyTaskForMultiInsert;
}
{code}
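
Because the getter above returns null when the flag is off, any caller has to handle both cases. The following is only a minimal, self-contained sketch of that pattern: {{SketchTask}}, {{SketchProcContext}} and {{linkMoveTask}} are hypothetical stand-ins, not Hive classes, and the linking logic is an assumption about how a caller might use the result, not the actual MR or Spark code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Hive classes.
final class MultiInsertDependencySketch {

    /** Stand-in for a task: it only tracks its dependent (child) tasks. */
    static final class SketchTask {
        final String name;
        final List<SketchTask> children = new ArrayList<>();

        SketchTask(String name) {
            this.name = name;
        }

        void addDependentTask(SketchTask child) {
            if (!children.contains(child)) {
                children.add(child);
            }
        }
    }

    /** Stand-in for the proc context: creates the dependency task lazily, and only if the flag is on. */
    static final class SketchProcContext {
        private final boolean shareDependencies; // plays the role of the boolean config flag
        private SketchTask dependencyTask;

        SketchProcContext(boolean shareDependencies) {
            this.shareDependencies = shareDependencies;
        }

        SketchTask getDependencyTaskForMultiInsert() {
            if (dependencyTask == null && shareDependencies) {
                dependencyTask = new SketchTask("dependency-collection");
            }
            return dependencyTask; // stays null when the flag is off
        }
    }

    /** Assumed caller: route the move task through the shared dependency task only when one exists. */
    static void linkMoveTask(SketchProcContext ctx, SketchTask parent, SketchTask moveTask) {
        SketchTask dep = ctx.getDependencyTaskForMultiInsert();
        if (dep != null) {
            parent.addDependentTask(dep);
            dep.addDependentTask(moveTask);
        } else {
            parent.addDependentTask(moveTask);
        }
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {true, false}) {
            SketchProcContext ctx = new SketchProcContext(flag);
            SketchTask parent = new SketchTask("parent");
            linkMoveTask(ctx, parent, new SketchTask("move-1"));
            linkMoveTask(ctx, parent, new SketchTask("move-2"));
            System.out.println("flag=" + flag + ", children of parent: " + parent.children.size());
        }
    }
}
```

With the flag on, both move tasks hang off a single shared dependency task; with it off, no dependency task is created and the move tasks attach directly to the parent.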

I'm writing a patch to do it the MR way and will run some tests. If all the 
diffs are in the query plans, we can open another JIRA to fix it.

[~csun] do you have any comments on this, since it seems related to multi-insert?

> Enable SkewJoinResolver for spark [Spark Branch]
> ------------------------------------------------
>
>                 Key: HIVE-8536
>                 URL: https://issues.apache.org/jira/browse/HIVE-8536
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-8536.1-spark.patch, HIVE-8536.2-spark.patch
>
>
> Sub-task of HIVE-8406



