[jira] [Commented] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436537#comment-16436537 ]

TezQA commented on TEZ-3914:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12918835/TEZ-3914.002.patch
  against master revision 871ea80.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in :
        org.apache.tez.test.TestRecovery
        org.apache.tez.test.TestDAGRecovery
        org.apache.tez.test.TestAMRecovery
        org.apache.tez.runtime.library.common.sort.impl.dflt.TestDefaultSorter
        org.apache.tez.dag.history.events.TestHistoryEventsProtoConversion
        org.apache.tez.dag.app.TestRecoveryParser
        org.apache.tez.dag.app.dag.impl.TestDAGRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2756//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2756//console

This message is automatically generated.

> Recovering a large DAG hang job
> ---
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can then hang, since some task completions may be missed and the shuffle will hang.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Failed: TEZ-3914 PreCommit Build #2756
Jira: https://issues.apache.org/jira/browse/TEZ-3914
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2756/

###
## LAST 60 LINES OF THE CONSOLE ###
[...truncated 348.19 KB...]
[ERROR] mvn -rf :tez-runtime-library
[INFO] Build failures were ignored.

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12918835/TEZ-3914.002.patch
  against master revision 871ea80.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in :
        org.apache.tez.test.TestRecovery
        org.apache.tez.test.TestDAGRecovery
        org.apache.tez.test.TestAMRecovery
        org.apache.tez.runtime.library.common.sort.impl.dflt.TestDefaultSorter
        org.apache.tez.dag.history.events.TestHistoryEventsProtoConversion
        org.apache.tez.dag.app.TestRecoveryParser
        org.apache.tez.dag.app.dag.impl.TestDAGRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2756//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2756//console

This message is automatically generated.

== == Adding comment to Jira. == ==

== == Finished build. == ==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any) ##
28 tests failed.
FAILED: org.apache.tez.dag.app.TestRecoveryParser.testRecoveryData

Error Message:
GC overhead limit exceeded

Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
    at java.lang.StringCoding.decode(StringCoding.java:193)
    at java.lang.String.<init>(String.java:426)
    at com.google.protobuf.LiteralByteString.toString(LiteralByteString.java:148)
    at com.google.protobuf.ByteString.toStringUtf8(ByteString.java:572)
    at com.google.protobuf.LazyStringArrayList.get(LazyStringArrayList.java:92)
    at com.google.protobuf.LazyStringArrayList.get(LazyStringArrayList.java:64)
    at java.util.AbstractList$Itr.next(AbstractList.java:358)
    at com.google.protobuf.UnmodifiableLazyStringList$2.next(UnmodifiableLazyStringList.java:138)
    at com.google.protobuf.UnmodifiableLazyStringList$2.next(UnmodifiableLazyStringList.java:128)
    at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
    at java.util.HashSet.<init>(HashSet.java:120)
    at org.apache.tez.dag.api.DagTypeConverters.convertVertexLocationHintFromProto(DagTypeConverters.java:700)
    at org.apache.tez.dag.history.events.VertexConfigurationDoneEvent.fromProto(VertexConfigurationDoneEvent.java:126)
    at org.apache.tez.dag.history.events.VertexConfigurationDoneEvent.fromProtoStream(VertexConfigurationDoneEvent.java:168)
    at org.apache.tez.dag.app.RecoveryParser.getNextEvent(RecoveryParser.java:342)
    at org.apache.tez.dag.app.RecoveryParser.parseRecoveryData(RecoveryParser.java:756)
    at org.apache.tez.dag.app.TestRecoveryParser.testRecoveryData(TestRecoveryParser.java:727)


FAILED: org.apache.tez.dag.app.TestRecoveryParser.testLastCorruptedRecoveryRecord

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:86)
    at
[jira] [Updated] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3914:
    Attachment: TEZ-3914.002.patch

> Recovering a large DAG hang job
> ---
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch, TEZ-3914.002.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can then hang, since some task completions may be missed and the shuffle will hang.
[jira] [Commented] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436385#comment-16436385 ]

TezQA commented on TEZ-3914:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12918828/TEZ-3914.001.patch
  against master revision 871ea80.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:red}-1 javac{color:red}. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2755//console

This message is automatically generated.

> Recovering a large DAG hang job
> ---
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can then hang, since some task completions may be missed and the shuffle will hang.
Failed: TEZ-3914 PreCommit Build #2755
Jira: https://issues.apache.org/jira/browse/TEZ-3914
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2755/

###
## LAST 60 LINES OF THE CONSOLE ###
[...truncated 6.90 KB...]
patching file tez-dag/src/main/java/org/apache/tez/dag/history/events/VertexGroupCommitStartedEvent.java
patching file tez-dag/src/main/java/org/apache/tez/dag/history/events/VertexInitializedEvent.java
patching file tez-dag/src/main/java/org/apache/tez/dag/history/events/VertexStartedEvent.java
patching file tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java
patching file tez-dag/src/test/java/org/apache/tez/dag/app/TestRecoveryParser.java
patching file tez-dag/src/test/java/org/apache/tez/dag/history/events/TestHistoryEventsProtoConversion.java
patching file tez-tests/src/test/java/org/apache/tez/test/RecoveryServiceWithEventHandlingHook.java

== == Determining number of patched javac warnings. == ==

/home/jenkins/tools/maven/latest/bin/mvn clean test -DskipTests -Ptest-patch > /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/patchJavacWarnings.txt 2>&1

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12918828/TEZ-3914.001.patch
  against master revision 871ea80.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:red}-1 javac{color:red}. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2755//console

This message is automatically generated.

== == Adding comment to Jira. == ==

== == Finished build. == ==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error?
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any) ##
No tests ran.
[jira] [Updated] (TEZ-3914) Recovering a large DAG hang job
[ https://issues.apache.org/jira/browse/TEZ-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3914:
    Attachment: TEZ-3914.001.patch

> Recovering a large DAG hang job
> ---
>
>                 Key: TEZ-3914
>                 URL: https://issues.apache.org/jira/browse/TEZ-3914
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3914.001.patch
>
>
> Any failure to parse a recovery event is ignored and treated as EOF. The job
> can then hang, since some task completions may be missed and the shuffle will hang.
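
The failure mode in the issue description, where a parse error is indistinguishable from end of file, can be illustrated with a minimal Java sketch. It is not taken from the TEZ-3914 patches or from Tez's RecoveryParser. The class RecoveryLogSketch, its length-prefixed record format, the CorruptRecoveryLogException type and the MAX_EVENT_BYTES bound are all hypothetical, chosen only to show a reader that stops quietly at a record boundary and fails loudly on a truncated or corrupt record.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the Tez RecoveryParser.
class RecoveryLogSketch {

  // Assumed upper bound on one event, so a corrupt length cannot trigger a huge allocation.
  static final int MAX_EVENT_BYTES = 1 << 20;

  /** Raised for any record that cannot be read in full; never folded into "end of log". */
  static class CorruptRecoveryLogException extends RuntimeException {
    CorruptRecoveryLogException(String message) {
      super(message);
    }
  }

  /** Encodes each event as a 4-byte big-endian length followed by its UTF-8 payload. */
  static byte[] encode(String... events) {
    List<byte[]> payloads = new ArrayList<>();
    int size = 0;
    for (String event : events) {
      byte[] payload = event.getBytes(StandardCharsets.UTF_8);
      payloads.add(payload);
      size += Integer.BYTES + payload.length;
    }
    ByteBuffer out = ByteBuffer.allocate(size);
    for (byte[] payload : payloads) {
      out.putInt(payload.length);
      out.put(payload);
    }
    return out.array();
  }

  /** Reads every event; returns normally only if the log ends exactly at a record boundary. */
  static List<String> readAll(byte[] log) {
    ByteBuffer in = ByteBuffer.wrap(log);
    List<String> events = new ArrayList<>();
    while (in.hasRemaining()) { // no bytes left at a boundary is the only clean end of log
      int offset = in.position();
      if (in.remaining() < Integer.BYTES) {
        throw new CorruptRecoveryLogException("truncated length prefix at offset " + offset);
      }
      int length = in.getInt();
      if (length < 0 || length > MAX_EVENT_BYTES) {
        throw new CorruptRecoveryLogException("implausible length " + length + " at offset " + offset);
      }
      if (in.remaining() < length) {
        throw new CorruptRecoveryLogException("truncated payload at offset " + offset);
      }
      byte[] payload = new byte[length];
      in.get(payload);
      events.add(new String(payload, StandardCharsets.UTF_8));
    }
    return events;
  }

  public static void main(String[] args) {
    byte[] log = encode("TASK_FINISHED_1", "TASK_FINISHED_2");
    System.out.println("recovered " + readAll(log).size() + " events");

    // Drop the last few bytes to simulate a partially written final record.
    byte[] truncated = Arrays.copyOf(log, log.length - 3);
    try {
      readAll(truncated);
    } catch (CorruptRecoveryLogException e) {
      System.out.println("recovery aborted: " + e.getMessage());
    }
  }
}
```

In this sketch the caller decides what to do with a corrupt record (abort recovery, or keep only the events read so far), rather than the reader silently reporting a shorter log.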