[ 
https://issues.apache.org/jira/browse/YARN-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238113#comment-16238113
 ] 

Hadoop QA commented on YARN-6078:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m  
0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6078 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12895888/YARN-6078.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d9dc69378b78 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c417284 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/18339/testReport/ |
| Max. process+thread count | 335 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18339/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Containers stuck in Localizing state
> ------------------------------------
>
>                 Key: YARN-6078
>                 URL: https://issues.apache.org/jira/browse/YARN-6078
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jagadish
>            Assignee: Billie Rinaldi
>            Priority: Major
>         Attachments: YARN-6078.001.patch
>
>
> I encountered an interesting issue in one of our Yarn clusters (where the 
> containers are stuck in localizing phase).
> Our AM requests a container, and starts a process using the NMClient.
> According to the NM the container is in LOCALIZING state:
> {code}
> 1. 2017-01-09 22:06:18,362 [INFO] [AsyncDispatcher event handler] 
> container.ContainerImpl.handle(ContainerImpl.java:1135) - Container 
> container_e03_1481261762048_0541_02_000060 transitioned from NEW to LOCALIZING
> 2017-01-09 22:06:18,363 [INFO] [AsyncDispatcher event handler] 
> localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:711)
>  - Created localizer for container_e03_1481261762048_0541_02_000060
> 2017-01-09 22:06:18,364 [INFO] [LocalizerRunner for 
> container_e03_1481261762048_0541_02_000060] 
> localizer.ResourceLocalizationService$LocalizerRunner.writeCredentials(ResourceLocalizationService.java:1191)
>  - Writing credentials to the nmPrivate file 
> /../..//.nmPrivate/container_e03_1481261762048_0541_02_000060.tokens. 
> Credentials list:
> {code}
> According to the RM the container is in RUNNING state:
> {code}
> 2017-01-09 22:06:17,110 [INFO] [IPC Server handler 19 on 8030] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_000060 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2017-01-09 22:06:19,084 [INFO] [ResourceManager Event Processor] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_000060 Container Transitioned from 
> ACQUIRED to RUNNING
> {code}
> When I click the Yarn RM UI to view the logs for the container,  I get an 
> error
> that
> {code}
> No logs were found. state is LOCALIZING
> {code}
> The Node manager 's stack trace seems to indicate that the NM's 
> LocalizerRunner is stuck waiting to read from the sub-process's outputstream.
> {code}
> "LocalizerRunner for container_e03_1481261762048_0541_02_000060" #27007081 
> prio=5 os_prio=0 tid=0x00007fa518849800 nid=0x15f7 runnable 
> [0x00007fa5076c3000]
>    java.lang.Thread.State: RUNNABLE
>       at java.io.FileInputStream.readBytes(Native Method)
>       at java.io.FileInputStream.read(FileInputStream.java:255)
>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>       - locked <0x00000000c6dc9c50> (a 
> java.lang.UNIXProcess$ProcessPipeInputStream)
>       at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>       at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>       at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>       - locked <0x00000000c6dc9c78> (a java.io.InputStreamReader)
>       at java.io.InputStreamReader.read(InputStreamReader.java:184)
>       at java.io.BufferedReader.fill(BufferedReader.java:161)
>       at java.io.BufferedReader.read1(BufferedReader.java:212)
>       at java.io.BufferedReader.read(BufferedReader.java:286)
>       - locked <0x00000000c6dc9c78> (a java.io.InputStreamReader)
>       at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:786)
>       at org.apache.hadoop.util.Shell.runCommand(Shell.java:568)
>       at org.apache.hadoop.util.Shell.run(Shell.java:479)
>       at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:237)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1113)
> {code}
> I did a {code}ps aux{code} and confirmed that there was no container-executor 
> process running with INITIALIZE_CONTAINER that the localizer starts. It seems 
> that the output stream pipe of the process is still not closed (even though 
> the localizer process is no longer present).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to