[ https://issues.apache.org/jira/browse/TEZ-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128030#comment-14128030 ]

Josh Elser commented on TEZ-1563:
---------------------------------

Found this when trying to work through HIVE-7950. Some related discussion is over there for the interested.

> TezClient.submitDAGSession alters DAG local resources regardless of DAG submission
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-1563
>                 URL: https://issues.apache.org/jira/browse/TEZ-1563
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Josh Elser
>
> In {{TezClient#submitDAGSession(DAG)}}, a {{DAGPlan}} is created from the
> {{DAG}} before the {{DAGClientAMProtocolBlockingPB}} is instantiated. When
> the application isn't running, {{waitForProxy()}} will throw a
> {{SessionNotRunning}} exception.
>
> The problem is that the internal state of the {{DAG}} is modified, regardless
> of whether the DAG is actually run.
>
> {code}
> DAGPlan dagPlan = dag.createDag(amConfig.getTezConfiguration());
> {code}
>
> The {{createDag}} method will ultimately call {{addTaskLocalFiles}} for each
> {{Vertex}} in the {{DAG}}:
>
> {code}
> // add common task files for this DAG
> vertex.addTaskLocalFiles(commonTaskLocalFiles);
> {code}
>
> Because the {{DAG}}'s state is modified, {{Vertex#addTaskLocalFiles(Map)}}
> will fail if any resources are added multiple times. As such, if the
> application is not running and {{SessionNotRunning}} is thrown, that same DAG
> cannot be passed in again to run once the application has been restarted.
>
> Additionally, {{DAG}} is missing a {{getTaskLocalFiles}} method that
> {{Vertex}} has; adding one would make the two classes more uniform.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
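
For readers who want to see the failure sequence from the report in isolation, here is a minimal sketch. It does not use the Tez API: SketchVertex, SketchDag, SketchClient and ResubmitSketch are hypothetical stand-ins, the "lib.jar" resource and its path are made up, and plain IllegalStateException stands in for both SessionNotRunning and whatever the real Vertex#addTaskLocalFiles(Map) throws on a duplicate. The only behavior modeled is the ordering described above: the DAG is mutated before the session check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Tez Vertex; not the real class.
class SketchVertex {
    private final Map<String, String> taskLocalFiles = new HashMap<>();

    // Rejects a resource name that is already registered, modeling the
    // duplicate-resource failure the report describes.
    void addTaskLocalFiles(Map<String, String> files) {
        for (Map.Entry<String, String> e : files.entrySet()) {
            if (taskLocalFiles.containsKey(e.getKey())) {
                throw new IllegalStateException("Duplicate resource: " + e.getKey());
            }
            taskLocalFiles.put(e.getKey(), e.getValue());
        }
    }
}

// Hypothetical stand-in for a Tez DAG with a single vertex.
class SketchDag {
    final SketchVertex vertex = new SketchVertex();
    final Map<String, String> commonTaskLocalFiles = new HashMap<>();

    // Building the plan pushes the common files into the vertex,
    // so calling this has a lasting side effect on the DAG.
    String createDag() {
        vertex.addTaskLocalFiles(commonTaskLocalFiles);
        return "plan";
    }
}

// Hypothetical stand-in for the client; the session check comes
// after the plan has already been built.
class SketchClient {
    boolean sessionRunning = false;

    void submit(SketchDag dag) {
        String plan = dag.createDag(); // DAG state is modified here
        if (!sessionRunning) {
            throw new IllegalStateException("SessionNotRunning");
        }
        // A real client would send 'plan' to the application here.
    }
}

class ResubmitSketch {
    // Submits the same DAG twice: once with the session down, once with it up.
    static String resubmitOutcome() {
        SketchDag dag = new SketchDag();
        dag.commonTaskLocalFiles.put("lib.jar", "hdfs:///hypothetical/lib.jar");
        SketchClient client = new SketchClient();

        String first;
        try {
            client.submit(dag);
            first = "submitted";
        } catch (IllegalStateException e) {
            first = e.getMessage();
        }

        client.sessionRunning = true; // the application is started again

        String second;
        try {
            client.submit(dag);
            second = "submitted";
        } catch (IllegalStateException e) {
            second = e.getMessage();
        }
        return first + " -> " + second;
    }

    public static void main(String[] args) {
        System.out.println(resubmitOutcome());
    }
}
```

In this model the second submission fails even though the session is up, because the first, unsuccessful attempt already registered the common files on the vertex. Checking the session before building the plan, or making the plan construction free of side effects, would avoid that in the sketch; whether either fits the real TezClient is a question for the issue itself.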