[jira] [Commented] (YARN-7737) prelaunch.err file not found exception on container failure

2018-01-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16338232#comment-16338232
 ] 

Hudson commented on YARN-7737:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13550 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13550/])
YARN-7737. prelaunch.err file not found exception on container failure. (zhz: 
rev fa8cf4d1b4896a602dc383d5e266768392a9790c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java


> prelaunch.err file not found exception on container failure
> ---
>
> Key: YARN-7737
> URL: https://issues.apache.org/jira/browse/YARN-7737
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1, 3.0.1
>Reporter: Jonathan Hung
>Assignee: Keqiu Hu
>Priority: Major
> Attachments: YARN-7737.001.patch
>
>
> Hit this exception when a container failed:
> {noformat}
> 2018-01-11 19:04:08,036 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to get tail of the container's prelaunch error log file
> java.io.FileNotFoundException: File /grid/b/tmp/userlogs/application_1515190594800_1766/container_e39_1515190594800_1766_01_02/prelaunch.err does not exist
> at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.handleContainerExitWithFailure(ContainerLaunch.java:545)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.handleContainerExitCode(ContainerLaunch.java:511)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:93)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> containerLogDir is picked on container launch via 
> {{LocalDirAllocator#getLocalPathForWrite}}, and that directory is where 
> ContainerLaunch looks for {{prelaunch.err}} when the container fails. However, 
> prelaunch.err (and prelaunch.out) are created in the first log dir, in 
> {{ContainerLaunch#call}}: 
> {noformat}exec.writeLaunchEnv(containerScriptOutStream, environment,
> localResources, launchContext.getCommands(),
> new Path(containerLogDirs.get(0)), user);{noformat}
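
The mismatch described above can be reproduced outside Hadoop with a small sketch. It uses only java.nio and no YARN classes. The class name, method names, and directory layout are hypothetical stand-ins, not the NodeManager code: one method always writes prelaunch.err into the first log dir, and the other looks for it only in the directory that was chosen for the container.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class PrelaunchLogDirMismatch {

  // Hypothetical stand-in for the launch step: always writes to the FIRST log dir.
  public static Path writePrelaunchErr(List<Path> containerLogDirs) throws IOException {
    Path target = containerLogDirs.get(0).resolve("prelaunch.err");
    return Files.write(target, "launch diagnostics\n".getBytes(StandardCharsets.UTF_8));
  }

  // Hypothetical stand-in for the failure handler: checks only the chosen log dir.
  public static boolean prelaunchErrVisible(Path containerLogDir) {
    return Files.exists(containerLogDir.resolve("prelaunch.err"));
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("userlogs");
    // Two log partitions, each with a directory for the same (made-up) container.
    List<Path> logDirs = Arrays.asList(
        Files.createDirectories(root.resolve("a").resolve("container_01")),
        Files.createDirectories(root.resolve("b").resolve("container_01")));

    writePrelaunchErr(logDirs);

    // Assumption for illustration: the allocator picked the second partition.
    Path containerLogDir = logDirs.get(1);
    System.out.println("visible in chosen dir: " + prelaunchErrVisible(containerLogDir));
    System.out.println("visible in first dir: " + prelaunchErrVisible(logDirs.get(0)));
  }
}
```

With a single log partition both paths coincide, which may be why the problem only shows up when several log dirs are configured.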



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7737) prelaunch.err file not found exception on container failure

2018-01-23 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336330#comment-16336330
 ] 

Jonathan Hung commented on YARN-7737:
-

+1 LGTM, thanks!




[jira] [Commented] (YARN-7737) prelaunch.err file not found exception on container failure

2018-01-19 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332614#comment-16332614
 ] 

Zhe Zhang commented on YARN-7737:
-

+1, this looks like a clear fix to me. I will wait for [~jhung] to take a look 
before committing.




[jira] [Commented] (YARN-7737) prelaunch.err file not found exception on container failure

2018-01-18 Thread Keqiu Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331745#comment-16331745
 ] 

Keqiu Hu commented on YARN-7737:


The fix is trivial, so I think it is acceptable not to add a new UT. 
_containerLogDir_ is created by _LocalDirsHandlerService_'s 
_getLogPathForWrite()_ in the _call()_ method before it is passed to the 
_ContainerExecutor_.

*Verification*

After applying the patch, the exception is no longer thrown with multiple 
user log partitions. 
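
The patch itself is not quoted in this thread, so the following is only a sketch of the idea the comment appears to describe: the launch step and the failure handler are handed the same _containerLogDir_. It uses plain java.nio, and every class and method name in it is hypothetical, not taken from the Hadoop source.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PrelaunchLogDirConsistent {

  // Hypothetical launch step: writes prelaunch.err into the directory it is given.
  public static Path writePrelaunchErr(Path containerLogDir) throws IOException {
    return Files.write(containerLogDir.resolve("prelaunch.err"),
        "launch diagnostics\n".getBytes(StandardCharsets.UTF_8));
  }

  // Hypothetical failure handler: reads prelaunch.err from the same directory.
  public static String readPrelaunchErr(Path containerLogDir) throws IOException {
    byte[] bytes = Files.readAllBytes(containerLogDir.resolve("prelaunch.err"));
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("userlogs");
    // Two partitions exist, but only the chosen one is used for both write and read.
    Files.createDirectories(root.resolve("a").resolve("container_01"));
    Path containerLogDir =
        Files.createDirectories(root.resolve("b").resolve("container_01"));

    writePrelaunchErr(containerLogDir);
    System.out.print(readPrelaunchErr(containerLogDir));
  }
}
```

Because one path is used on both sides, the lookup no longer depends on which partition was selected for the container.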

 




[jira] [Commented] (YARN-7737) prelaunch.err file not found exception on container failure

2018-01-18 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331572#comment-16331572
 ] 

genericqa commented on YARN-7737:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 21s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m  0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7737 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906728/YARN-7737.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 49986945081b 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 37f4696 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19334/testReport/ |
| Max. process+thread count | 440 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19334/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT