[jira] [Commented] (TEZ-3700) Consumer attempt should kill itself instead of failing during validation checks with final merge avoidance
[ https://issues.apache.org/jira/browse/TEZ-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989755#comment-15989755 ] TezQA commented on TEZ-3700: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865655/TEZ-3700.4.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2405//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2405//console This message is automatically generated. > Consumer attempt should kill itself instead of failing during validation > checks with final merge avoidance > -- > > Key: TEZ-3700 > URL: https://issues.apache.org/jira/browse/TEZ-3700 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3700.1.patch, TEZ-3700.2.patch, TEZ-3700.3.patch, > TEZ-3700.4.patch > > > Currently, if data is received from different attempts with final merge > disabled (with/without pipelining), the consumer attempt ends up with a failure. > Instead it should issue a kill request so that the job does not end up with > failures. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
Success: TEZ-3700 PreCommit Build #2405
Jira: https://issues.apache.org/jira/browse/TEZ-3700 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2405/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 329.94 KB...] [INFO] Tez SUCCESS [ 0.040 s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 52:59 min [INFO] Finished at: 2017-04-29T04:19:07+00:00 [INFO] Final Memory: 90M/1433M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865655/TEZ-3700.4.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2405//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2405//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 543f210e5bc55e7659e468a571504a6cf3fc7983 logged out == == Finished build. == == Archiving artifacts [description-setter] Description set: TEZ-3700 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-3700) Consumer attempt should kill itself instead of failing during validation checks with final merge avoidance
[ https://issues.apache.org/jira/browse/TEZ-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-3700: -- Attachment: TEZ-3700.4.patch Thanks [~harishjp] for the review. Attaching the patch with review comments addressed. Items in failedTaskIds need not be removed. They would be cleared when the DAG is complete. > Consumer attempt should kill itself instead of failing during validation > checks with final merge avoidance > -- > > Key: TEZ-3700 > URL: https://issues.apache.org/jira/browse/TEZ-3700 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3700.1.patch, TEZ-3700.2.patch, TEZ-3700.3.patch, > TEZ-3700.4.patch > > > Currently, if data is received from different attempts with final merge > disabled (with/without pipelining), the consumer attempt ends up with a failure. > Instead it should issue a kill request so that the job does not end up with > failures. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3700) Consumer attempt should kill itself instead of failing during validation checks with final merge avoidance
[ https://issues.apache.org/jira/browse/TEZ-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989677#comment-15989677 ] Harish Jaiprakash commented on TEZ-3700: * killSelf does not use the message passed to it as an argument. * Should we use ConcurrentHashMap instead of synchronizedMap? * We never remove entries from failedTaskIds; is that alright? > Consumer attempt should kill itself instead of failing during validation > checks with final merge avoidance > -- > > Key: TEZ-3700 > URL: https://issues.apache.org/jira/browse/TEZ-3700 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3700.1.patch, TEZ-3700.2.patch, TEZ-3700.3.patch > > > Currently, if data is received from different attempts with final merge > disabled (with/without pipelining), the consumer attempt ends up with a failure. > Instead it should issue a kill request so that the job does not end up with > failures. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
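The review question about ConcurrentHashMap versus synchronizedMap can be illustrated with a minimal sketch. The class FailedTaskTracker and its methods are hypothetical and are not taken from the TEZ-3700 patch; the thread does not show how failedTaskIds is actually declared. The sketch only shows why a ConcurrentMap can be convenient here: putIfAbsent is a single atomic call, whereas a check-then-put on a Collections.synchronizedMap needs an explicit synchronized block around both steps.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical tracker; names are illustrative, not from the TEZ-3700 patch. */
final class FailedTaskTracker {
  // Assumption: task ids are ints and we keep one reason string per task.
  private final ConcurrentMap<Integer, String> failedTaskIds =
      new ConcurrentHashMap<Integer, String>();

  /** Records a failure; returns true only for the first report of a task. */
  boolean recordFailure(int taskId, String reason) {
    // putIfAbsent is atomic, so two threads cannot both "win" for one task id.
    return failedTaskIds.putIfAbsent(taskId, reason) == null;
  }

  /** Returns true if a failure was recorded for the given task. */
  boolean hasFailed(int taskId) {
    return failedTaskIds.containsKey(taskId);
  }

  /** Number of distinct tasks recorded so far; entries are never removed. */
  int size() {
    return failedTaskIds.size();
  }
}
```

As in the discussion above, nothing is removed from the map in this sketch, so its size is bounded only by the number of tasks seen during the lifetime of the owning object.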
[jira] [Commented] (TEZ-3703) Use a sha comparison to compare vertex and dag resources, if there is a mismatch
[ https://issues.apache.org/jira/browse/TEZ-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989669#comment-15989669 ] Harish Jaiprakash commented on TEZ-3703: Since we have 1.7 as the java version, should we start using try with resources to simplify code? > Use a sha comparison to compare vertex and dag resources, if there is a > mismatch > > > Key: TEZ-3703 > URL: https://issues.apache.org/jira/browse/TEZ-3703 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-3703.01.patch, TEZ-3703.02.patch > > > In DAG validate, when resources are compared between Dag and Vertices - > instead of failing immediately on conflicting LR.equals, fall back to a sha > comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
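The try-with-resources suggestion can be sketched as follows. This is not the code from the TEZ-3703 patches: the class ShaSketch, the choice of SHA-256, and the use of an in-memory stream are assumptions made to keep the example self-contained. It shows a Java 7 style digest computation in which the stream is closed automatically, followed by a digest-based equality check of the kind the issue describes as a fallback.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; not taken from the TEZ-3703 patch. */
final class ShaSketch {
  /** Computes a SHA-256 digest (assumed algorithm) of the given bytes. */
  static byte[] sha256(byte[] data) {
    // try-with-resources closes the stream even if reading throws.
    try (InputStream in = new ByteArrayInputStream(data)) {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
      return digest.digest();
    } catch (IOException | NoSuchAlgorithmException e) {
      // Wrapped as unchecked to keep the sketch's callers simple.
      throw new IllegalStateException("Could not compute digest", e);
    }
  }

  /** Returns true when both inputs hash to the same digest. */
  static boolean sameContent(byte[] a, byte[] b) {
    return MessageDigest.isEqual(sha256(a), sha256(b));
  }
}
```

In a real resource comparison the stream would come from the file system rather than from a byte array; the closing behavior of the try block is the same.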
[jira] [Created] (TEZ-3706) add option to skip Tez UI build
Sergey Shelukhin created TEZ-3706: - Summary: add option to skip Tez UI build Key: TEZ-3706 URL: https://issues.apache.org/jira/browse/TEZ-3706 Project: Apache Tez Issue Type: Bug Reporter: Sergey Shelukhin The UI build takes forever downloading some files and messing around. It should be possible to skip it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (TEZ-3705) Modify DeletionTracker and deletion threads to be initialized only if enabled for tez_shuffle
[ https://issues.apache.org/jira/browse/TEZ-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3705: - Description: Right now the Deletion Tracker and the associated threads are initialized irrespective of the deletion service being used/enabled or not. This can be made better. > Modify DeletionTracker and deletion threads to be initialized only if enabled > for tez_shuffle > - > > Key: TEZ-3705 > URL: https://issues.apache.org/jira/browse/TEZ-3705 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > > Right now the Deletion Tracker and the associated threads are initialized > irrespective of the deletion service being used/enabled or not. This can be > made better. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
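One way to read the request is sketched below: the deletion thread pool is created only when deletion is enabled. The class ShuffleServiceSketch, the constructor parameters, and the use of a plain ExecutorService are all hypothetical; the issue does not show the actual DeletionTracker wiring or the configuration key that controls it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical service; names are illustrative, not from the Tez code base. */
final class ShuffleServiceSketch {
  // Stays null when deletion is disabled, so no pool object is created.
  private final ExecutorService deletionThreads;

  ShuffleServiceSketch(boolean deletionEnabled, int threadCount) {
    this.deletionThreads =
        deletionEnabled ? Executors.newFixedThreadPool(threadCount) : null;
  }

  /** Returns true when the deletion machinery was initialized. */
  boolean deletionInitialized() {
    return deletionThreads != null;
  }

  /** Shuts the pool down if it was ever created. */
  void stop() {
    if (deletionThreads != null) {
      deletionThreads.shutdown();
    }
  }
}
```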
[jira] [Created] (TEZ-3705) Modify DeletionTracker and deletion threads to be initialized only if enabled for tez_shuffle
Kuhu Shukla created TEZ-3705: Summary: Modify DeletionTracker and deletion threads to be initialized only if enabled for tez_shuffle Key: TEZ-3705 URL: https://issues.apache.org/jira/browse/TEZ-3705 Project: Apache Tez Issue Type: Sub-task Reporter: Kuhu Shukla Assignee: Kuhu Shukla -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3703) Use a sha comparison to compare vertex and dag resources, if there is a mismatch
[ https://issues.apache.org/jira/browse/TEZ-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989509#comment-15989509 ] Zhiyuan Yang commented on TEZ-3703: --- +1 > Use a sha comparison to compare vertex and dag resources, if there is a > mismatch > > > Key: TEZ-3703 > URL: https://issues.apache.org/jira/browse/TEZ-3703 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-3703.01.patch, TEZ-3703.02.patch > > > In DAG validate, when resources are compared between Dag and Vertices - > instead of failing immediately on conflicting LR.equals, fall back to a sha > comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3696) Jobs can hang when both concurrency and speculation are enabled
[ https://issues.apache.org/jira/browse/TEZ-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989447#comment-15989447 ] TezQA commented on TEZ-3696: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865607/TEZ-3696.003.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2404//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2404//console This message is automatically generated. > Jobs can hang when both concurrency and speculation are enabled > --- > > Key: TEZ-3696 > URL: https://issues.apache.org/jira/browse/TEZ-3696 > Project: Apache Tez > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: TEZ-3696.001.patch, TEZ-3696.002.patch, > TEZ-3696.003.patch > > > We can reproduce the hung job by doing the following: > 1. Run a sleep job with a concurrency of 1, speculation enabled, and 3 tasks > {noformat} > HADOOP_CLASSPATH="$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_CONF_DIR" yarn jar > $TEZ_HOME/tez-tests-*.jar mrrsleep -Dtez.am.vertex.max-task-concurrency=1 > -Dtez.am.speculation.enabled=true -Dtez.task.timeout-ms=6 -m 3 -mt 6 > -ir 0 -irt 0 -r 0 -rt 0 > {noformat} > 2. 
Let the 1st task run to completion and then stop the 2nd task so that a > speculative attempt is scheduled. Once the speculative attempt is scheduled > for the 2nd task, continue the original attempt and let it complete. > {noformat} > kill -STOP > // wait a few seconds for a speculative attempt to kick off > kill -CONT > {noformat} > 3. Kill the 3rd task, which will create a 2nd attempt > {noformat} > kill -9 > {noformat} > 4. The next thing to be drawn off of the queue will be the speculative > attempt of the 2nd task. However, it is already completed, so it will just > sit in the final state and the job will hang. > Basically, for the failure to happen, the number of speculative tasks that > are scheduled but not yet run has to be >= the concurrency of the job and > there has to be at least 1 task failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
Success: TEZ-3696 PreCommit Build #2404
Jira: https://issues.apache.org/jira/browse/TEZ-3696 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2404/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 330.55 KB...] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 54:55 min [INFO] Finished at: 2017-04-28T20:52:27+00:00 [INFO] Final Memory: 94M/1441M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865607/TEZ-3696.003.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2404//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2404//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. bbd62ebfbcf418c6a1f8a250bc33c65a02fa1045 logged out == == Finished build. == == Archiving artifacts Compressed 3.42 MB of artifacts by 24.7% relative to #2402 [description-setter] Description set: TEZ-3696 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3704) Tez-UI unit test failing
[ https://issues.apache.org/jira/browse/TEZ-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989387#comment-15989387 ] Jonathan Eagles commented on TEZ-3704: -- Please reach out to the user list to get the best support for this kind of issue. u...@tez.apache.org In addition please verify the correct build environment via BUILDING.txt file in the root git directory. > Tez-UI unit test failing > > > Key: TEZ-3704 > URL: https://issues.apache.org/jira/browse/TEZ-3704 > Project: Apache Tez > Issue Type: Bug >Reporter: Yesha Vora > > tez-ui unit test is failing as below. > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] tez SUCCESS [ 1.171 > s] > [INFO] tez-api SUCCESS [ 25.416 > s] > [INFO] tez-common . SUCCESS [ 0.156 > s] > [INFO] tez-runtime-internals .. SUCCESS [ 0.812 > s] > [INFO] tez-runtime-library SUCCESS [ 1.190 > s] > [INFO] tez-mapreduce .. SUCCESS [ 2.787 > s] > [INFO] tez-examples ... SUCCESS [ 0.127 > s] > [INFO] tez-dag SUCCESS [ 4.707 > s] > [INFO] tez-tests .. SUCCESS [ 7.205 > s] > [INFO] tez-ui . FAILURE [01:29 > min] > [INFO] tez-plugins SKIPPED > [INFO] tez-yarn-timeline-history .. SKIPPED > [INFO] tez-history-parser . SKIPPED > [INFO] tez-yarn-timeline-history-with-acls SKIPPED > [INFO] tez-yarn-timeline-cache-plugin . SKIPPED > [INFO] tez-yarn-timeline-history-with-fs .. SKIPPED > [INFO] tez-tools .. SKIPPED > [INFO] tez-perf-analyzer .. SKIPPED > [INFO] tez-job-analyzer ... SKIPPED > [INFO] tez-dist ... SKIPPED > [INFO] Tez SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 02:14 min > [INFO] Finished at: 2017-04-19T19:31:02+00:00 > [INFO] Final Memory: 51M/885M > [INFO] > > [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec > (ember test) on project tez-ui: Command execution failed. Process exited with > an error: 1 (Exit value: 1) -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. 
> [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :tez-ui{code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (TEZ-3704) Tez-UI unit test failing
Yesha Vora created TEZ-3704: --- Summary: Tez-UI unit test failing Key: TEZ-3704 URL: https://issues.apache.org/jira/browse/TEZ-3704 Project: Apache Tez Issue Type: Bug Reporter: Yesha Vora tez-ui unit test is failing as below. {code} [INFO] [INFO] Reactor Summary: [INFO] [INFO] tez SUCCESS [ 1.171 s] [INFO] tez-api SUCCESS [ 25.416 s] [INFO] tez-common . SUCCESS [ 0.156 s] [INFO] tez-runtime-internals .. SUCCESS [ 0.812 s] [INFO] tez-runtime-library SUCCESS [ 1.190 s] [INFO] tez-mapreduce .. SUCCESS [ 2.787 s] [INFO] tez-examples ... SUCCESS [ 0.127 s] [INFO] tez-dag SUCCESS [ 4.707 s] [INFO] tez-tests .. SUCCESS [ 7.205 s] [INFO] tez-ui . FAILURE [01:29 min] [INFO] tez-plugins SKIPPED [INFO] tez-yarn-timeline-history .. SKIPPED [INFO] tez-history-parser . SKIPPED [INFO] tez-yarn-timeline-history-with-acls SKIPPED [INFO] tez-yarn-timeline-cache-plugin . SKIPPED [INFO] tez-yarn-timeline-history-with-fs .. SKIPPED [INFO] tez-tools .. SKIPPED [INFO] tez-perf-analyzer .. SKIPPED [INFO] tez-job-analyzer ... SKIPPED [INFO] tez-dist ... SKIPPED [INFO] Tez SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 02:14 min [INFO] Finished at: 2017-04-19T19:31:02+00:00 [INFO] Final Memory: 51M/885M [INFO] [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (ember test) on project tez-ui: Command execution failed. Process exited with an error: 1 (Exit value: 1) -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-ui{code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3703) Use a sha comparison to compare vertex and dag resources, if there is a mismatch
[ https://issues.apache.org/jira/browse/TEZ-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989349#comment-15989349 ] TezQA commented on TEZ-3703: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865597/TEZ-3703.02.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2403//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2403//console This message is automatically generated. > Use a sha comparison to compare vertex and dag resources, if there is a > mismatch > > > Key: TEZ-3703 > URL: https://issues.apache.org/jira/browse/TEZ-3703 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-3703.01.patch, TEZ-3703.02.patch > > > In DAG validate, when resources are compared between Dag and Vertices - > instead of failing immediately on conflicting LR.equals, fall back to a sha > comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
Failed: TEZ-3703 PreCommit Build #2403
Jira: https://issues.apache.org/jira/browse/TEZ-3703 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2403/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 322.09 KB...] [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-runtime-library [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12865597/TEZ-3703.02.patch against master revision 247719d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2403//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2403//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 0c96bae18236194d849526da7775918db71aee50 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

### ## FAILED TESTS (if any) ##
1 tests failed.
FAILED: org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testMultipleSpills[test[true, NONE, {2}]]

Error Message:
Task com.google.common.util.concurrent.ListenableFutureTask@7f422265 rejected from java.util.concurrent.ThreadPoolExecutor@4ee05628[Running, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 1]

Stack Trace:
java.util.concurrent.RejectedExecutionException: Task com.google.common.util.concurrent.ListenableFutureTask@7f422265 rejected from java.util.concurrent.ThreadPoolExecutor@4ee05628[Running, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
    at com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
    at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
    at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.setupNextBuffer(UnorderedPartitionedKVWriter.java:384)
    at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:333)
    at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:284)
    at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.baseTest(TestUnorderedPartitionedKVWriter.java:749)
    at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testMultipleSpills(TestUnorderedPartitionedKVWriter.java:208)
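The pool state in the error message (running, 2 of 2 threads active, nothing queued) is what a bounded ThreadPoolExecutor with no queue capacity reports when a further task arrives while every worker is busy. The sketch below reproduces that kind of rejection with the JDK alone. The class RejectionSketch and the pool configuration (2 threads, a SynchronousQueue, the default AbortPolicy) are assumptions chosen to match the message; the trace does not show how UnorderedPartitionedKVWriter actually builds its executor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo; the pool settings are assumed, not read from Tez. */
final class RejectionSketch {
  /** Returns true if a third task is rejected while two workers are busy. */
  static boolean thirdTaskRejected() {
    final CountDownLatch release = new CountDownLatch(1);
    // 2 core threads, 2 max threads, and a queue that holds no elements.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        2, 2, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
    Runnable blocker = new Runnable() {
      @Override
      public void run() {
        try {
          release.await(); // keep the worker busy until released
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
    try {
      pool.execute(blocker); // starts worker 1
      pool.execute(blocker); // starts worker 2
      try {
        pool.execute(blocker); // no free worker and no queue slot
        return false;
      } catch (RejectedExecutionException expected) {
        return true; // the default AbortPolicy throws
      }
    } finally {
      release.countDown(); // let the workers finish
      pool.shutdown();
    }
  }
}
```

A caller that must not lose work in this situation either waits for an earlier task to finish before submitting the next one or configures a different rejection policy; which of these fits the writer is a question for the JIRA discussion, not something this sketch settles.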
[jira] [Updated] (TEZ-3696) Jobs can hang when both concurrency and speculation are enabled
[ https://issues.apache.org/jira/browse/TEZ-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated TEZ-3696: - Attachment: TEZ-3696.003.patch Thanks, [~jeagles], for the review! I fixed the patch with regard to your comments and uploaded a new one. > Jobs can hang when both concurrency and speculation are enabled > --- > > Key: TEZ-3696 > URL: https://issues.apache.org/jira/browse/TEZ-3696 > Project: Apache Tez > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: TEZ-3696.001.patch, TEZ-3696.002.patch, > TEZ-3696.003.patch > > > We can reproduce the hung job by doing the following: > 1. Run a sleep job with a concurrency of 1, speculation enabled, and 3 tasks > {noformat} > HADOOP_CLASSPATH="$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_CONF_DIR" yarn jar > $TEZ_HOME/tez-tests-*.jar mrrsleep -Dtez.am.vertex.max-task-concurrency=1 > -Dtez.am.speculation.enabled=true -Dtez.task.timeout-ms=6 -m 3 -mt 6 > -ir 0 -irt 0 -r 0 -rt 0 > {noformat} > 2. Let the 1st task run to completion and then stop the 2nd task so that a > speculative attempt is scheduled. Once the speculative attempt is scheduled > for the 2nd task, continue the original attempt and let it complete. > {noformat} > kill -STOP > // wait a few seconds for a speculative attempt to kick off > kill -CONT > {noformat} > 3. Kill the 3rd task, which will create a 2nd attempt > {noformat} > kill -9 > {noformat} > 4. The next thing to be drawn off of the queue will be the speculative > attempt of the 2nd task. However, it is already completed, so it will just > sit in the final state and the job will hang. > Basically, for the failure to happen, the number of speculative tasks that > are scheduled but not yet run has to be >= the concurrency of the job and > there has to be at least 1 task failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3696) Jobs can hang when both concurrency and speculation are enabled
[ https://issues.apache.org/jira/browse/TEZ-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989306#comment-15989306 ] Jonathan Eagles commented on TEZ-3696: -- [~ebadger], patch looks good. A few minor nits: {code} sendDAGSchedulerFinishedEvent(attemptId); // not a retro active action {code} - The comment seems out of sync with the behavior now, since an event is sent regardless of the internal state. - Tests look good. Can you please add a timeout consistent with the existing tests? [~bikassaha], it would be good to understand the original intent of not sending the DAGSchedulerFinishedEvent to the scheduler in this case. If we don't hear something by early next week, we'll assume there is nothing to worry about with this change. > Jobs can hang when both concurrency and speculation are enabled > --- > > Key: TEZ-3696 > URL: https://issues.apache.org/jira/browse/TEZ-3696 > Project: Apache Tez > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: TEZ-3696.001.patch, TEZ-3696.002.patch > > > We can reproduce the hung job by doing the following: > 1. Run a sleep job with a concurrency of 1, speculation enabled, and 3 tasks > {noformat} > HADOOP_CLASSPATH="$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_CONF_DIR" yarn jar > $TEZ_HOME/tez-tests-*.jar mrrsleep -Dtez.am.vertex.max-task-concurrency=1 > -Dtez.am.speculation.enabled=true -Dtez.task.timeout-ms=6 -m 3 -mt 6 > -ir 0 -irt 0 -r 0 -rt 0 > {noformat} > 2. Let the 1st task run to completion and then stop the 2nd task so that a > speculative attempt is scheduled. Once the speculative attempt is scheduled > for the 2nd task, continue the original attempt and let it complete. > {noformat} > kill -STOP > // wait a few seconds for a speculative attempt to kick off > kill -CONT > {noformat} > 3. Kill the 3rd task, which will create a 2nd attempt > {noformat} > kill -9 > {noformat} > 4. The next thing to be drawn off of the queue will be the speculative > attempt of the 2nd task. 
However, it is already completed, so it will just > sit in the final state and the job will hang. > Basically, for the failure to happen, the number of speculative tasks that > are scheduled but not yet run has to be >= the concurrency of the job and > there has to be at least 1 task failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (TEZ-3703) Use a sha comparison to compare vertex and dag resources, if there is a mismatch
[ https://issues.apache.org/jira/browse/TEZ-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-3703: Attachment: TEZ-3703.02.patch Updated patch. > Use a sha comparison to compare vertex and dag resources, if there is a > mismatch > > > Key: TEZ-3703 > URL: https://issues.apache.org/jira/browse/TEZ-3703 > Project: Apache Tez > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: TEZ-3703.01.patch, TEZ-3703.02.patch > > > In DAG validate, when resources are compared between Dag and Vertices - > instead of failing immediately on conflicting LR.equals, fall back to a sha > comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3698) UnorderedKV writer should be able to honor tez.runtime.enable.final-merge.in.output without pipelinedshuffle
[ https://issues.apache.org/jira/browse/TEZ-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988977#comment-15988977 ] Rajesh Balamohan commented on TEZ-3698: --- Thanks for the review [~aplusplus]. Will upload a revised patch soon. PipelinedShuffle has some corner cases to be addressed w.r.t. exception handling. While those would eventually be addressed in subsequent jiras, it would be beneficial to be able to turn off final merge, which is independent of pipelining. > UnorderedKV writer should be able to honor > tez.runtime.enable.final-merge.in.output without pipelinedshuffle > > > Key: TEZ-3698 > URL: https://issues.apache.org/jira/browse/TEZ-3698 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3698.1.patch > > > Final merge can be disabled with the "tez.runtime.enable.final-merge.in.output" > setting. Currently this works with UnorderedKV writer only with pipelined > shuffle. It should be able to honor this parameter without pipelined shuffle > as well, to avoid final merge. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3701) UnorderedPartitionedKVWriter does not wait for submitted tasks to finish before finalMerge.
[ https://issues.apache.org/jira/browse/TEZ-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988685#comment-15988685 ] Jason Lowe commented on TEZ-3701: - I'm fine with updating this JIRA, but if you feel it would be cleaner to start with a fresh one, that works too. > UnorderedPartitionedKVWriter does not wait for submitted tasks to finish > before finalMerge. > --- > > Key: TEZ-3701 > URL: https://issues.apache.org/jira/browse/TEZ-3701 > Project: Apache Tez > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Rajesh Balamohan >Priority: Blocker > > UnorderedPartitionedKVWriter adds tasks to the executor, but does not wait for > them to finish before starting the final merge. This can cause finalMerge to > fail or write incorrect data. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
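The ordering problem described in TEZ-3701 can be sketched in general terms: keep the Future of every submitted task and block on all of them before the merge step reads their results. The class SpillThenMergeSketch, the method spillAndMerge, and the integer "spill result" are hypothetical stand-ins; this is not the fix applied to UnorderedPartitionedKVWriter, only a minimal illustration of waiting for submitted work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the UnorderedPartitionedKVWriter implementation. */
final class SpillThenMergeSketch {
  /** Submits spillCount tasks, waits for all of them, then "merges". */
  static int spillAndMerge(int spillCount) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    List<Future<Integer>> pending = new ArrayList<Future<Integer>>();
    try {
      for (int i = 0; i < spillCount; i++) {
        final int spillId = i;
        pending.add(executor.submit(new Callable<Integer>() {
          @Override
          public Integer call() {
            return spillId + 1; // stand-in for the records written by a spill
          }
        }));
      }
      int merged = 0;
      for (Future<Integer> spill : pending) {
        // get() blocks until the spill has finished and rethrows its failure,
        // so the merge never sees a partially written spill.
        merged += spill.get();
      }
      return merged;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for spills", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A spill failed", e.getCause());
    } finally {
      executor.shutdown();
    }
  }
}
```

With three spills the stand-in results are 1, 2 and 3, so spillAndMerge(3) returns 6.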
[jira] [Commented] (TEZ-3673) Allocate smaller buffers in UnorderedPartitionedKVWriter
[ https://issues.apache.org/jira/browse/TEZ-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988306#comment-15988306 ]

TezQA commented on TEZ-3673:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12865463/TEZ-3673.04.patch
against master revision 247719d.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2402//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2402//console

This message is automatically generated.

> Allocate smaller buffers in UnorderedPartitionedKVWriter
>
> Key: TEZ-3673
> URL: https://issues.apache.org/jira/browse/TEZ-3673
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: TEZ-3673.01.patch, TEZ-3673.02.patch, TEZ-3673.03.patch, TEZ-3673.04.patch
>
> UnorderedPartitionedKVWriter allocates in bigger chunks. It may or may not get filled up. In PipelinedSorter, we start off with 32MB chunks. But UnorderedPartitionedKVWriter can be worse as it allocates bigger blocks. Need to revisit this allocation.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
Success: TEZ-3673 PreCommit Build #2402
Jira: https://issues.apache.org/jira/browse/TEZ-3673
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2402/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 329.39 KB...]
[INFO] Tez SUCCESS [ 0.042 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 51:37 min
[INFO] Finished at: 2017-04-28T06:48:30+00:00
[INFO] Final Memory: 98M/1300M
[INFO]

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12865463/TEZ-3673.04.patch
against master revision 247719d.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2402//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2402//console

This message is automatically generated.

== ==
Adding comment to Jira.
== ==
Comment added.
bc3f26bfd31d234328855757486c63b4d77ee23d logged out
== ==
Finished build.
== ==
Archiving artifacts
[description-setter] Description set: TEZ-3673
Recording test results
Email was triggered for: Success
Sending email for trigger: Success

###
## FAILED TESTS (if any)
##
All tests passed