[ https://issues.apache.org/jira/browse/TEZ-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152996#comment-14152996 ]
Jeff Zhang commented on TEZ-1631:
---------------------------------

Simulated the case on a single-node cluster (using UnionExample, which has a vertexGroup, so the DAG is modified when it is converted to a DAGPlan). Submitting the same DAG after the AM has shut down on session timeout prints the following messages:

{code}
org.apache.tez.dag.api.SessionNotRunning: Application not running, applicationId=application_1412047114698_0012, yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, trackingUrl=http://jzhangMBPr.local:8088/proxy/application_1412047114698_0012/A
	at org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:733)
	at org.apache.tez.client.TezSession.stop(TezSession.java:270)
	at org.apache.tez.mapreduce.examples.UnionExample.run(UnionExample.java:503)
	at org.apache.tez.mapreduce.examples.UnionExample.main(UnionExample.java:513)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
	at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
14/09/30 17:10:19 INFO client.TezSession: Could not connect to AM, killing session via YARN, sessionName=UnionExampleSession, applicationId=application_1412047114698_0012
14/09/30 17:10:19 INFO impl.YarnClientImpl: Killed application application_1412047114698_0012
org.apache.tez.dag.api.SessionNotRunning: Application not running, applicationId=application_1412047114698_0012, yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, trackingUrl=http://jzhangMBPr.local:8088/proxy/application_1412047114698_0012/A
	at org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:733)
	at org.apache.tez.client.TezSession.waitForProxy(TezSession.java:400)
	at org.apache.tez.client.TezSession.submitDAG(TezSession.java:197)
	at org.apache.tez.client.TezSession.submitDAG(TezSession.java:162)
	at org.apache.tez.mapreduce.examples.UnionExample.run(UnionExample.java:499)
	at org.apache.tez.mapreduce.examples.UnionExample.main(UnionExample.java:513)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
	at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}

> Session dag submission timeout can result in duplicate DAG submissions
> ----------------------------------------------------------------------
>
>                 Key: TEZ-1631
>                 URL: https://issues.apache.org/jira/browse/TEZ-1631
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.4.1
>            Reporter: Bikas Saha
>            Assignee: Jeff Zhang
>         Attachments: Tez-1631.patch
>
>
> In TezSession.submitDAG() we could first check whether the session is ready and
> throw a SessionNotRunning exception if it is not. This should be done before
> processing the DAG, which prevents unnecessary modification of the DAG.
> If the session is ready, we can submit the DAG as usual. Higher-level
> components already handle the SessionNotRunning exception.
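
The ordering proposed in the issue description (check session readiness first, then process the DAG) could look roughly like the sketch below. This is not the Tez code or the attached patch: `SubmitCheckSketch`, `Session`, `Dag`, `createPlan()` and the local `SessionNotRunning` class are hypothetical stand-ins, and the one-shot mutating conversion is only an assumption used to model the vertexGroup modification mentioned in the comment.

```java
public class SubmitCheckSketch {

    /** Hypothetical stand-in for the real SessionNotRunning exception. */
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String message) {
            super(message);
        }
    }

    /** Hypothetical DAG whose conversion to a plan mutates it (assumed one-shot). */
    static class Dag {
        private final String name;
        private boolean prepared = false;

        Dag(String name) {
            this.name = name;
        }

        /** Simulates a mutating conversion that must not run twice. */
        String createPlan() {
            if (prepared) {
                throw new IllegalStateException("DAG " + name + " was already converted");
            }
            prepared = true;
            return "plan:" + name;
        }

        boolean isPrepared() {
            return prepared;
        }
    }

    /** Hypothetical session; 'running' models whether the AM is still alive. */
    static class Session {
        private final boolean running;

        Session(boolean running) {
            this.running = running;
        }

        String submitDag(Dag dag) throws SessionNotRunning {
            // Check readiness first so a dead session never touches the DAG.
            if (!running) {
                throw new SessionNotRunning("Application not running");
            }
            // Only now perform the mutating conversion and "submit" the plan.
            return dag.createPlan();
        }
    }

    public static void main(String[] args) throws Exception {
        Dag dag = new Dag("union");
        try {
            new Session(false).submitDag(dag);
        } catch (SessionNotRunning e) {
            System.out.println("rejected, DAG untouched: " + !dag.isPrepared());
        }
        // The same DAG object can still be submitted to a fresh session.
        System.out.println(new Session(true).submitDag(dag));
    }
}
```

Because the readiness check runs before any conversion, a caller that catches the exception can start a new session and resubmit the same DAG object, which is the behavior the description relies on when it says higher-level components already handle SessionNotRunning.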