[
https://issues.apache.org/jira/browse/TEZ-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996676#comment-13996676
]
Bikas Saha commented on TEZ-1088:
---------------------------------
Can you please open a new jira for this and provide the above analysis in that
jira. I will commit this patch and close this jira.
> Flaky Test:
> TestFaultTolerance.testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess
> ----------------------------------------------------------------------------------------
>
> Key: TEZ-1088
> URL: https://issues.apache.org/jira/browse/TEZ-1088
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Hitesh Shah
> Assignee: Tassapol Athiapinya
> Attachments: TEZ-1088.1.patch, TEZ-1088.2.patch,
> org.apache.tez.test.TestFaultTolerance.zip, syslog_dag_1398715972246_0001_9
>
>
> 2014-04-28 20:14:19,100 INFO [AsyncDispatcher event handler]
> org.apache.tez.dag.history.recovery.RecoveryService: DAG completed,
> dagId=dag_1398715972246_0001_9, queueSize=0
> 2014-04-28 20:14:19,147 INFO [AsyncDispatcher event handler]
> org.apache.tez.dag.history.HistoryEventHandler:
> [HISTORY][DAG:dag_1398715972246_0001_9][Event:DAG_FINISHED]:
> dagId=dag_1398715972246_0001_9, startTime=1398716046982,
> finishTime=1398716059090, timeTaken=12108, status=FAILED, diagnostics=Vertex
> re-running, vertexName=v1, vertexId=vertex_1398715972246_0001_9_00
> Vertex failed, vertexName=v2, vertexId=vertex_1398715972246_0001_9_01,
> diagnostics=[Task failed, taskId=task_1398715972246_0001_9_01_000001,
> diagnostics=[AttemptID:attempt_1398715972246_0001_9_01_000001_0 Info:Error:
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_0_10001 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 0
> Expected output: 4 got: 5
> at
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
> at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor:
> attempt_1398715972246_0001_9_01_000001_0_10001 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 0
> Expected output: 4 got: 5
> Container killed by the ApplicationMaster.
> Container killed on request. Exit code is 143
> , AttemptID:attempt_1398715972246_0001_9_01_000001_1 Info:Error:
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_1_10011 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 1
> Expected output: 4 got: 5
> at
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
> at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor:
> attempt_1398715972246_0001_9_01_000001_1_10011 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 1
> Expected output: 4 got: 5
> Container released by application,
> AttemptID:attempt_1398715972246_0001_9_01_000001_2 Info:Error:
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_2_10003 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 2
> Expected output: 4 got: 6
> at
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
> at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor:
> attempt_1398715972246_0001_9_01_000001_2_10003 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 2
> Expected output: 4 got: 6
> Container released by application,
> AttemptID:attempt_1398715972246_0001_9_01_000001_3 Info:Error:
> exceptionThrown=java.lang.RuntimeException: Expected output mismatch of
> current FailingProcessor: attempt_1398715972246_0001_9_01_000001_3_10001 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 3
> Expected output: 4 got: 8
> at
> org.apache.tez.test.TestProcessor.throwException(TestProcessor.java:98)
> at org.apache.tez.test.TestProcessor.run(TestProcessor.java:250)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:581)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:570)
> , errorMessage=Expected output mismatch of current FailingProcessor:
> attempt_1398715972246_0001_9_01_000001_3_10001 dag:
> testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess vertex: v2
> taskIndex: 1 taskAttempt: 3
> Expected output: 4 got: 8], Vertex failed as one or more tasks failed.
> failedTasks:1]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0,
> counters=Counters: 12, org.apache.tez.common.counters.DAGCounter,
> NUM_FAILED_TASKS=6, TOTAL_LAUNCHED_TASKS=9, File System Counters, HDFS:
> BYTES_READ=0, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=0, HDFS:
> LARGE_READ_OPS=0, HDFS: WRITE_OPS=0,
> org.apache.tez.common.counters.TaskCounter, GC_TIME_MILLIS=21,
> CPU_MILLISECONDS=-12330, PHYSICAL_MEMORY_BYTES=548294656,
> VIRTUAL_MEMORY_BYTES=4034506752, COMMITTED_HEAP_BYTES=310968320
> 2014-04-28 20:14:19,147 INFO [AsyncDispatcher event handler]
> org.apache.tez.dag.app.dag.impl.DAGImpl: DAG: dag_1398715972246_0001_9
> finished with state: FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler]
> org.apache.tez.dag.app.dag.impl.DAGImpl: dag_1398715972246_0001_9
> transitioned from RUNNING to FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler]
> org.apache.tez.dag.app.DAGAppMaster: DAG completed,
> dagId=dag_1398715972246_0001_9, dagState=FAILED
> 2014-04-28 20:14:19,148 INFO [AsyncDispatcher event handler]
> org.apache.tez.common.TezUtils: Redirecting log files based on addend:
> dag_1398715972246_0001_9_post
> 2014-04-28 20:14:19,148 INFO [ContainerLauncher #4]
> org.apache.tez.dag.app.launcher.ContainerLauncherImpl: Processing the event
> EventType: CONTAINER_STOP_REQUEST
--
This message was sent by Atlassian JIRA
(v6.2#6252)