[ https://issues.apache.org/jira/browse/YARN-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated YARN-8403:
----------------------------
    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: YARN-8472)

> Nodemanager logs failed to download file with INFO level
> --------------------------------------------------------
>
>                 Key: YARN-8403
>                 URL: https://issues.apache.org/jira/browse/YARN-8403
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>             Fix For: 3.2.0, 3.1.2
>
>         Attachments: YARN-8403.001.patch, YARN-8403.002.patch, 
> YARN-8403.003.patch, YARN-8403.png
>
>
> Some stack traces related to container execution are logged at INFO or WARN 
> level. 
> {code}
> 2018-06-06 03:10:40,077 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(1312)) - Writing credentials to the nmPrivate file /grid/0/hadoop/yarn/local/nmPrivate/container_e02_1528246317583_0048_01_000001.tokens
> 2018-06-06 03:10:40,087 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(975)) - Failed to download resource { { hdfs://mycluster.example.com:8020/user/hrt_qa/Streaming/InputDir, 1528254452720, FILE, null },pending,[(container_e02_1528246317583_0048_01_000001)],6074418082915225,DOWNLOADING}
> org.apache.hadoop.yarn.exceptions.YarnException: Download and unpack failed
>         at org.apache.hadoop.yarn.util.FSDownload.downloadAndUnpack(FSDownload.java:306)
>         at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:283)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:409)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:66)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: /grid/0/hadoop/yarn/local/filecache/28_tmp/InputDir/input1.txt (Permission denied)
>         at java.io.FileOutputStream.open0(Native Method)
>         at java.io.FileOutputStream.open(FileOutputStream.java:270)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:236)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:219)
>         at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:318)
>         at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:307)
>         at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:338)
>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:401)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:464)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:408)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:399)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:381)
>         at org.apache.hadoop.yarn.util.FSDownload.downloadAndUnpack(FSDownload.java:298)
>         ... 9 more
> {code}
> {code}
> 2018-06-06 03:10:41,547 WARN  privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(182)) - IOException executing command:
> java.io.InterruptedIOException: java.lang.InterruptedException
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1012)
>         at org.apache.hadoop.util.Shell.run(Shell.java:902)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:402)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1229)
> Caused by: java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1002)
>         ... 5 more
> 2018-06-06 03:10:41,548 WARN  nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:startLocalizer(407)) - Exit code from container container_e02_1528246317583_0048_01_000001 startLocalizer is : -1
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.InterruptedIOException: java.lang.InterruptedException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:183)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:402)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1229)
> Caused by: java.io.InterruptedIOException: java.lang.InterruptedException
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1012)
>         at org.apache.hadoop.util.Shell.run(Shell.java:902)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
>         ... 2 more
> Caused by: java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1002)
>         ... 5 more
> 2018-06-06 03:10:41,548 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(1249)) - Localizer failed for container_e02_1528246317583_0048_01_000001
> java.io.IOException: Application application_1528246317583_0048 initialization failed (exitCode=-1) with output: null
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:411)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1229)
> Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.InterruptedIOException: java.lang.InterruptedException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:183)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:402)
>         ... 1 more
> Caused by: java.io.InterruptedIOException: java.lang.InterruptedException
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1012)
>         at org.apache.hadoop.util.Shell.run(Shell.java:902)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
>         ... 2 more
> Caused by: java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1002)
>         ... 5 more
> {code}
> These logs appear only in the NodeManager (NM) log; they do not show up in 
> the ApplicationMaster (AM) log. The stack traces are logged at WARN or INFO 
> level. Ideally, exceptions should be logged at ERROR level. 
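
To illustrate the requested behaviour, here is a minimal sketch of logging a failed download together with its stack trace at the highest severity. It is not the code from the attached patches. It uses java.util.logging so that it runs without extra dependencies, and Level.SEVERE stands in for ERROR. The class name DownloadFailureLogging, the method logDownloadFailure and the CapturingHandler helper are hypothetical names introduced only for this example. With an SLF4J-style logger, the equivalent call would be LOG.error(message, exception).

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: report a localization failure at SEVERE (ERROR), not INFO. */
public class DownloadFailureLogging {

    /**
     * Logs the failure and its stack trace at SEVERE, the java.util.logging
     * counterpart of ERROR. The resource string is an illustrative parameter.
     */
    static void logDownloadFailure(Logger log, String resource, Exception cause) {
        log.log(Level.SEVERE, "Failed to download resource " + resource, cause);
    }

    /** In-memory handler so the level of each emitted record can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() {
            // Nothing is buffered.
        }

        @Override
        public void close() {
            // No resources to release.
        }
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("localizer.sketch");
        log.setUseParentHandlers(false); // keep the demo output to our own line
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        // Simulate the failure from the report: a permission error while copying.
        Exception cause = new FileNotFoundException("input1.txt (Permission denied)");
        logDownloadFailure(log, "hdfs://mycluster.example.com:8020/user/hrt_qa/Streaming/InputDir", cause);

        LogRecord record = handler.records.get(0);
        System.out.println(record.getLevel() + ": " + record.getMessage()
                + " (cause: " + record.getThrown().getMessage() + ")");
    }
}
```

Passing the exception as the last argument keeps the stack trace attached to the record, so the only change relative to the INFO-level behaviour described above is the severity.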



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
