[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG

2018-06-07 Thread TezQA (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505377#comment-16505377
 ] 

TezQA commented on TEZ-3951:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12926965/TEZ-3951.01.patch
  against master revision 9058460.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   
org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2834//console

This message is automatically generated.


> TezClient wait too long for the DAGClient for prewarm; tries to shut down the 
> wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: TEZ-3951.01.patch, TEZ-3951.patch
>
>
> Follow-up from TEZ-3943



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Failed: TEZ-3951 PreCommit Build #2834

2018-06-07 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3951
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2834/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 379.88 KB...]
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :tez-runtime-library
[INFO] Build failures were ignored.






==
==
Adding comment to Jira.
==
==




==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
10 tests failed.
FAILED:  org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle[test[false, DISABLED]]

Error Message:
test timed out after 1 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 1 milliseconds
at java.io.FileDescriptor.sync(Native Method)
at org.apache.hadoop.util.DiskChecker.diskIoCheckWithoutNativeIo(DiskChecker.java:249)
at org.apache.hadoop.util.DiskChecker.doDiskIo(DiskChecker.java:220)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:82)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:351)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:426)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:133)
at org.apache.tez.runtime.library.common.task.local.output.TezTaskOutputFiles.getSpillFileForWrite(TezTaskOutputFiles.java:211)
at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.textTest(TestUnorderedPartitionedKVWriter.java:472)
at org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle(TestUnorderedPartitionedKVWriter.java:642)


FAILED:  org.apache.tez.runtime.library.common.writers.TestUnorderedPartitionedKVWriter.testLargeKvPairs_WithPipelinedShuffle[test[false, ENABLED]]

Error Message:
test timed out after 1 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 1 milliseconds
at java.io.FileDescriptor.sync(Native Method)
at org.apache.hadoop.util.DiskChecker.diskIoCheckWithoutNativeIo(DiskChecker.java:249)
at 

[jira] [Updated] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG

2018-06-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated TEZ-3951:
--
Attachment: TEZ-3951.01.patch



[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG

2018-06-07 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505223#comment-16505223
 ] 

Sergey Shelukhin commented on TEZ-3951:
---

Added a small test case to test the timeout.



[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-07 Thread Jaume M (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505201#comment-16505201
 ] 

Jaume M commented on TEZ-3904:
--

Spark seems to [push from the 
driver|https://github.com/apache/spark/blob/e76b0124fbe463def00b1dffcfd8fd47e04772fe/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/AMCredentialRenewer.scala#L37]
 new delegation tokens when they are close to expiring. [~sershe], would the 
containers started by Tez where the DAG is running also have to get new 
credentials?

> an API to update tokens for Tez AM and the DAG
> --
>
> Key: TEZ-3904
> URL: https://issues.apache.org/jira/browse/TEZ-3904
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
>Priority: Major
>
> Nothing is permanent in this world, least of all delegation tokens.
> The current way around token expiration (the one where you cannot keep 
> renewing anymore) in Hive when Tez AM is used in session mode is to cycle Tez 
> AM. It may happen though that a query is running at that time, and so the AM 
> cannot be restarted with new tokens. We let the query run its course and it 
> usually dies because it tries to do something with an expired token.
> To get around that, we cycle AMs a few hours before tokens are going to 
> expire.
> However, that is still not ideal because it puts an upper bound on safe Hive 
> query runtime (a query longer than 3 hours with current config may fail due 
> to an expired token if its timing is unlucky), and also precludes setting 
> tokens to expire much faster than the standard 7-day time frame.
> There should be a mechanism to replace tokens in the AM, including for a 
> running DAG.
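
The timing constraint in the description above can be made concrete with a small calculation. This is only a sketch: the class and method names are hypothetical and are not part of Tez or Hive, and the 3-hour lead and 7-day lifetime are simply the figures quoted in the description, not values read from any configuration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not a Tez or Hive API.
final class TokenWindow {
    private TokenWindow() {}

    /** Time at which the AM would be cycled: a fixed lead before the token's final expiry. */
    static Instant cycleAt(Instant tokenExpiry, Duration lead) {
        return tokenExpiry.minus(lead);
    }

    /**
     * True if a query started at queryStart and running for the given runtime
     * would still be running after the token expires, so it may fail.
     */
    static boolean mayOutliveToken(Instant queryStart, Duration runtime, Instant tokenExpiry) {
        return queryStart.plus(runtime).isAfter(tokenExpiry);
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2018-06-01T00:00:00Z"); // assumed issue time
        Instant expiry = issued.plus(Duration.ofDays(7));       // 7-day lifetime from the description
        Duration lead = Duration.ofHours(3);                    // cycle lead from the description

        // Unlucky timing: the query starts just before the AM is due to be cycled,
        // so the AM cannot be restarted until the query finishes.
        Instant unluckyStart = cycleAt(expiry, lead).minusSeconds(1);

        System.out.println(mayOutliveToken(unluckyStart, Duration.ofHours(2), expiry)); // prints false
        System.out.println(mayOutliveToken(unluckyStart, Duration.ofHours(4), expiry)); // prints true
    }
}
```

Under these assumptions the cycle lead is also the upper bound on safe query runtime, which is why shortening the token lifetime or lengthening queries runs into the same limit.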





[jira] [Commented] (TEZ-3951) TezClient wait too long for the DAGClient for prewarm; tries to shut down the wrong DAG

2018-06-07 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505072#comment-16505072
 ] 

Sergey Shelukhin commented on TEZ-3951:
---

[~ewohlstadter] [~ashutoshc] ping?

> TezClient wait too long for the DAGClient for prewarm; tries to shut down the 
> wrong DAG
> ---
>
> Key: TEZ-3951
> URL: https://issues.apache.org/jira/browse/TEZ-3951
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: TEZ-3951.patch
>
>
> Follow-up from TEZ-3943





[jira] [Commented] (TEZ-3944) TestTaskScheduler times-out on Hadoop3

2018-06-07 Thread Kuhu Shukla (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504888#comment-16504888
 ] 

Kuhu Shukla commented on TEZ-3944:
--

+1. LGTM. Thank you [~jeagles]!

> TestTaskScheduler times-out on Hadoop3
> --
>
> Key: TEZ-3944
> URL: https://issues.apache.org/jira/browse/TEZ-3944
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Eric Wohlstadter
> Assignee: Jonathan Eagles
>Priority: Major
> Attachments: TEZ-3944.001.patch, TEZ-3944.002.patch, 
> TEZ-3944.003.patch, org.apache.tez.dag.app.rm.TestTaskScheduler-output.txt
>
>
> TestTaskScheduler times-out intermittently.


