[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
[ https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505377#comment-16505377 ]

TezQA commented on TEZ-3951:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12926965/TEZ-3951.01.patch
against master revision 9058460.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//console

This message is automatically generated.

> TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: TEZ-3951.01.patch, TEZ-3951.patch
>
>
> Follow-up from TEZ-3943

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
Failed: TEZ-3951 PreCommit Build #2834
Jira: https://issues.apache.org/jira/browse/TEZ-3951
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2834/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 379.88 KB...]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :tez-runtime-library
[INFO] Build failures were ignored.

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12926965/TEZ-3951.01.patch
against master revision 9058460.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//console

This message is automatically generated.

== == Adding comment to Jira. == ==

== == Finished build. == ==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

### ## FAILED TESTS (if any) ##
10 tests failed.

FAILED: org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle[test[false, DISABLED]]

Error Message:
test timed out after 1 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 1 milliseconds
    at java.io.FileDescriptor.sync(Native Method)
    at org.apache.hadoop.util.DiskChecker.diskIoCheckWithoutNativeIo(DiskChecker.java:249)
    at org.apache.hadoop.util.DiskChecker.doDiskIo(DiskChecker.java:220)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:82)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:351)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:426)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:133)
    at org.apache.tez.runtime.library.common.task.local.output.TezTaskOutputFiles.getSpillFileForWrite(TezTaskOutputFiles.java:211)
    at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.textTest(TestUnorderedPartitionedKVWriter.java:472)
    at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle(TestUnorderedPartitionedKVWriter.java:642)

FAILED: org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle[test[false, ENABLED]]

Error Message:
test timed out after 1 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 1 milliseconds
    at java.io.FileDescriptor.sync(Native Method)
    at org.apache.hadoop.util.DiskChecker.diskIoCheckWithoutNativeIo(DiskChecker.java:249)
    at
[jira] [Updated] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
[ https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated TEZ-3951:
--
Attachment: TEZ-3951.01.patch

> TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: TEZ-3951.01.patch, TEZ-3951.patch
>
>
> Follow-up from TEZ-3943

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
[ https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505223#comment-16505223 ]

Sergey Shelukhin commented on TEZ-3951:
---
Added a small test case to test the timeout.

> TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: TEZ-3951.01.patch, TEZ-3951.patch
>
>
> Follow-up from TEZ-3943

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG
[ https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505201#comment-16505201 ]

Jaume M commented on TEZ-3904:
--
Spark seems to [push from the driver|https://github.com/apache/spark/blob/e76b0124fbe463def00b1dffcfd8fd47e04772fe/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/AMCredentialRenewer.scala#L37] new delegation tokens when they are close to expiring.
[~sershe] the containers started by Tez where the DAG is running would also have to get new credentials?

> an API to update tokens for Tez AM and the DAG
> --
>
> Key: TEZ-3904
> URL: https://issues.apache.org/jira/browse/TEZ-3904
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Major
>
> Nothing is permanent in this world, least of all delegation tokens.
> The current way around token expiration (the one where you cannot keep renewing anymore) in Hive when Tez AM is used in session mode is to cycle Tez AM. It may happen though that a query is running at that time, and so the AM cannot be restarted with new tokens. We let the query run its course and it usually dies because it tries to do something with an expired token.
> To get around that, we cycle AMs a few hours before tokens are going to expire.
> However, that is still not ideal because it puts an upper bound on safe Hive query runtime (a query longer than 3 hours with current config may fail due to an expired token if its timing is unlucky), and also precludes setting tokens to expire much faster than the standard 7-day time frame.
> There should be a mechanism to replace tokens in the AM, including for a running DAG.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
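The timing constraint described in the TEZ-3904 summary can be sketched roughly as follows. This is an illustration only, not code from Tez or Hive: the class TokenCyclePlanner and both of its methods are hypothetical names, and the 3-hour margin and 7-day lifetime are simply the figures quoted in the issue text. It shows why cycling the AM a fixed margin before expiry caps the safe query runtime at that margin.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of Tez or Hive. */
final class TokenCyclePlanner {
    private TokenCyclePlanner() {}

    /** Instant at which the AM would be cycled: token expiry minus the safety margin. */
    static Instant cycleDeadline(Instant tokenExpiry, Duration safetyMargin) {
        return tokenExpiry.minus(safetyMargin);
    }

    /**
     * True if a query that starts at queryStart and runs for runtime
     * finishes no later than the token expiry.
     */
    static boolean finishesBeforeExpiry(Instant queryStart, Duration runtime, Instant tokenExpiry) {
        return !queryStart.plus(runtime).isAfter(tokenExpiry);
    }
}
```

Under these assumptions, a query submitted just before the cycle deadline has only the safety margin left on the old tokens, so anything longer than the margin can outlive them. That is the upper bound the issue asks to remove by replacing tokens in a running AM.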
[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
[ https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505072#comment-16505072 ]

Sergey Shelukhin commented on TEZ-3951:
---
[~ewohlstadter] [~ashutoshc] ping?

> TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: TEZ-3951.patch
>
>
> Follow-up from TEZ-3943

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (TEZ-3944) TestTaskScheduler times-out on Hadoop3
[ https://issues.apache.org/jira/browse/TEZ-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504888#comment-16504888 ]

Kuhu Shukla commented on TEZ-3944:
--
+1. LGTM. Thank you [~jeagles]!

> TestTaskScheduler times-out on Hadoop3
> --
>
> Key: TEZ-3944
> URL: https://issues.apache.org/jira/browse/TEZ-3944
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Eric Wohlstadter
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-3944.001.patch, TEZ-3944.002.patch, TEZ-3944.003.patch, org.apache.tez.dag.app.rm.TestTaskScheduler-output.txt
>
>
> TestTaskScheduler times-out intermittently.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)