[ 
https://issues.apache.org/jira/browse/TEZ-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996676#comment-13996676
 ] 

Bikas Saha commented on TEZ-1088:
---------------------------------

Can you please open a new jira for this and provide the above analysis in that 
jira. I will commit this patch and close this jira.

> Flaky Test: 
> TestFaultTolerance.testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess
> ----------------------------------------------------------------------------------------
>
>                 Key: TEZ-1088
>                 URL: https://issues.apache.org/jira/browse/TEZ-1088
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>            Assignee: Tassapol Athiapinya
>         Attachments: TEZ-1088.1.patch, TEZ-1088.2.patch, 
> org.apache.tez.test.TestFaultTolerance.zip, syslog_dag_1398715972246_0001_9
>
>
> 2014-04-28 20:14:19,100 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.history.recovery.RecoveryService: DAG completed, 
> dagId=dag_1398715972246_0001_9, queueSize=0
> 2014-04-28 20:14:19,147 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.history.HistoryEventHandler: 
> [HISTORY][DAG:dag_1398715972246_0001_9][Event:DAG_FINISHED]: 
> dagId=dag_1398715972246_0001_9, startTime=1398716046982, 
> finishTime=1398716059090, timeTaken=12108, status=FAILED, diagnostics=Vertex 
> re-running, vertexName=v1, vertexId=vertex_1398715972246_0001_9_00
> Vertex failed, vertexName=v2, vertexId=vertex_1398715972246_0001_9_01, 
> diagnostics=[Task failed, taskId=task_1398715972246_0001_9_01_000001, 
> diagnostics=[AttemptID:attempt_1398715972246_0001_9_01_000001_0 Info:Error: 
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of 
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_0_10001 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 0
> Expected output: 4 got: 5
>       at 
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
>       at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor: 
> attempt_1398715972246_0001_9_01_000001_0_10001 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 0
> Expected output: 4 got: 5
> Container killed by the ApplicationMaster.
> Container killed on request. Exit code is 143
> , AttemptID:attempt_1398715972246_0001_9_01_000001_1 Info:Error: 
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of 
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_1_10011 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 1
> Expected output: 4 got: 5
>       at 
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
>       at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor: 
> attempt_1398715972246_0001_9_01_000001_1_10011 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 1
> Expected output: 4 got: 5
> Container released by application, 
> AttemptID:attempt_1398715972246_0001_9_01_000001_2 Info:Error: 
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of 
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_2_10003 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 2
> Expected output: 4 got: 6
>       at 
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
>       at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor: 
> attempt_1398715972246_0001_9_01_000001_2_10003 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 2
> Expected output: 4 got: 6
> Container released by application, 
> AttemptID:attempt_1398715972246_0001_9_01_000001_3 Info:Error: 
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of 
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_3_10001 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 3
> Expected output: 4 got: 8
>       at 
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
>       at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor: 
> attempt_1398715972246_0001_9_01_000001_3_10001 dag: 
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2 
> taskIndex: 1 taskAttempt: 3
> Expected output: 4 got: 8], Vertex failed as one or more tasks failed. 
> failedTasks:1]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0, 
> counters=Counters: 12, org.apache.tez.common.counters.DAGCounter, 
> NUM_FAILED_TASKS=6, TOTAL_LAUNCHED_TASKS=9, File System Counters, HDFS: 
> BYTES_READ=0, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=0, HDFS: 
> LARGE_READ_OPS=0, HDFS: WRITE_OPS=0, 
> org.apache.tez.common.counters.TaskCounter, GC_TIME_MILLIS=21, 
> CPU_MILLISECONDS=-12330, PHYSICAL_MEMORY_BYTES=548294656, 
> VIRTUAL_MEMORY_BYTES=4034506752, COMMITTED_HEAP_BYTES=310968320
> 2014-04-28 20:14:19,147 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.DAGImpl: DAG: dag_1398715972246_0001_9 
> finished with state: FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.dag.impl.DAGImpl: dag_1398715972246_0001_9 
> transitioned from RUNNING to FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler] 
> org.apache.tez.dag.app.DAGAppMaster: DAG completed, 
> dagId=dag_1398715972246_0001_9, dagState=FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler] 
> org.apache.tez.common.TezUtils: Redirecting log files based on addend: 
> dag_1398715972246_0001_9_post
> 2014-04-28 20:14:19,148 INFO [ContainerLauncher #4] 
> org.apache.tez.dag.app.launcher.ContainerLauncherImpl: Processing the event 
> EventType: CONTAINER_STOP_REQUEST



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to