Gopal V created TEZ-4073:
----------------------------

             Summary: Configuration: Reduce Vertex and DAG Payload Size
                 Key: TEZ-4073
                 URL: https://issues.apache.org/jira/browse/TEZ-4073
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Gopal V


As the total number of vertices goes up, the Tez protobuf transport starts to 
show up as a potential scalability problem for task submission and the AM.

{code}
public TezTaskRunner2(Configuration tezConf, UserGroupInformation ugi, String[] localDirs,
 ...
    this.taskConf = new Configuration(tezConf);
    if (taskSpec.getTaskConf() != null) {
      Iterator<Entry<String, String>> iter = taskSpec.getTaskConf().iterator();
      while (iter.hasNext()) {
        Entry<String, String> entry = iter.next();
        taskConf.set(entry.getKey(), entry.getValue());
      }
    }
{code}

The TaskSpec getTaskConf() need not include any of the default configs, since 
its keys are overlaid onto an existing task conf that already holds them.
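
A minimal sketch of that idea, assuming plain Maps stand in for the Hadoop Configuration objects. ConfDelta, delta() and overlay() are hypothetical names for illustration and are not part of Tez. The sender would ship only the entries that differ from the base conf, and the receiver would overlay them the same way the constructor above does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Tez: Maps stand in for Configuration objects.
final class ConfDelta {

  // Returns only the entries of taskConf that are missing from, or differ from, baseConf.
  static Map<String, String> delta(Map<String, String> baseConf,
                                   Map<String, String> taskConf) {
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : taskConf.entrySet()) {
      if (!Objects.equals(baseConf.get(e.getKey()), e.getValue())) {
        out.put(e.getKey(), e.getValue());
      }
    }
    return out;
  }

  // Mirrors the overlay loop in the constructor: copy the base, then set each shipped entry.
  static Map<String, String> overlay(Map<String, String> baseConf,
                                     Map<String, String> shipped) {
    Map<String, String> merged = new HashMap<>(baseConf);
    merged.putAll(shipped);
    return merged;
  }
}
```

Like the overlay loop above, this only adds or overrides keys. A key present in the base conf but deliberately absent from the task conf would not be removed, so the sketch assumes the task conf is a superset of the base.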

{code}
    // Security framework already loaded the tokens into current ugi
    DAGProtos.ConfigurationProto confProto =
        TezUtilsInternal.readUserSpecifiedTezConfiguration(System.getenv(Environment.PWD.name()));
    TezUtilsInternal.addUserSpecifiedTezConfiguration(defaultConf, confProto.getConfKeyValuesList());
    UserGroupInformation.setConfiguration(defaultConf);
    Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
{code}

At the very least, the DAG and Vertex do not both need to have the same configs 
repeated in them.
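
One way to picture the deduplication, again as a sketch over plain Maps rather than the actual Tez protobuf types. SharedConf, common() and vertexOnly() are hypothetical names. Entries that are identical in every vertex conf would be stored once at the DAG level, and each vertex would carry only what is left over.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Tez. Config values are assumed to be non-null.
final class SharedConf {

  // Entries that appear with the same value in every vertex conf.
  static Map<String, String> common(List<Map<String, String>> vertexConfs) {
    if (vertexConfs.isEmpty()) {
      return new HashMap<>();
    }
    Map<String, String> shared = new HashMap<>(vertexConfs.get(0));
    for (Map<String, String> conf : vertexConfs) {
      // Drop any candidate whose value this vertex does not agree on.
      shared.entrySet().removeIf(e -> !Objects.equals(conf.get(e.getKey()), e.getValue()));
    }
    return shared;
  }

  // Entries a vertex still has to carry once the shared ones live at the DAG level.
  static Map<String, String> vertexOnly(Map<String, String> vertexConf,
                                        Map<String, String> shared) {
    Map<String, String> rest = new HashMap<>(vertexConf);
    rest.entrySet().removeIf(e -> Objects.equals(shared.get(e.getKey()), e.getValue()));
    return rest;
  }
}
```

With N vertices that share most of their configuration, the shared entries would then be serialized once instead of N + 1 times.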



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
