Perfect, Jeff, I understand clearly. After changing the setup to use the appropriate users and folder permissions, I can see some progress.
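
For anyone following along, the ownership and mode layout suggested in the quoted thread (/data and /data/yarn as root:root with 755, /data/yarn/local as yarn:hadoop with 755) could be checked with something like the sketch below. It is only an illustration: the names EXPECTED, describe and check_dir are made up here and are not part of Hadoop, and the expected values are copied from the ls output in the thread, so adjust them to your own layout.

```python
import grp
import os
import pwd
import stat

# Assumed layout, taken from the advice quoted in this thread:
# path -> (owner, group, permission bits)
EXPECTED = {
    "/data": ("root", "root", 0o755),
    "/data/yarn": ("root", "root", 0o755),
    "/data/yarn/local": ("yarn", "hadoop", 0o755),
}


def _name(lookup, ident, attr):
    """Resolve a uid/gid to a name, falling back to the number as text."""
    try:
        return getattr(lookup(ident), attr)
    except KeyError:
        return str(ident)


def describe(path):
    """Return (owner, group, permission bits) for path."""
    st = os.stat(path)
    owner = _name(pwd.getpwuid, st.st_uid, "pw_name")
    group = _name(grp.getgrgid, st.st_gid, "gr_name")
    return owner, group, stat.S_IMODE(st.st_mode)


def check_dir(path, expected):
    """Return a list of mismatch messages (empty if path matches expected)."""
    if not os.path.isdir(path):
        return [f"{path}: not a directory"]
    problems = []
    labels = ("owner", "group", "mode")
    for label, got, want in zip(labels, describe(path), expected):
        if got != want:
            if label == "mode":
                got, want = oct(got), oct(want)
            problems.append(f"{path}: {label} is {got}, expected {want}")
    return problems


if __name__ == "__main__":
    for path, expected in EXPECTED.items():
        for problem in check_dir(path, expected):
            print(problem)
```

Run on each NodeManager host, it prints one line per mismatch and nothing when the directories match the assumed layout.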
Cheers..

On Fri, Feb 15, 2019 at 10:05 AM Jeff Hubbs <jhubbsl...@att.net> wrote:

> On 2/14/19 11:09 PM, Vinay Kashyap wrote:
>
> I am running hadoop on my mac and all the folders have *myuser:staff* as
> the owner. I have verified the permissions for the local dirs to be 755.
>
> This doesn't sound right. By-the-book, there are supposed to be separate
> "users" for hdfs, yarn, and mapred to run their respective daemons. The
> directories they read/write in are supposed to be permed and owned to
> expect that. One possible approach for purposes of log-writing etc. is to
> put those user accounts in a group (perhaps named "hadoop") so that
> read/written areas in common are owned by that group and permed accordingly.
>
> If you're going to ad-lib that arrangement then you'll have to ad-lib a
> lot of the rest of how worker nodes and edge nodes behave accordingly.
>
> I run all hadoop services with myuser and I have configured
> *yarn.nodemanager.linux-container-executor.group=staff* accordingly
> both in *yarn-site.xml* and *container-executor.cfg*
>
> 1. Is the container-executor binary certified to work as expected on OSX.?
> 2. When linux container executor is configured, is there any hard
> expectation that users of the running hadoop services to be part of [*root,
> hdfs, yarn...*] and group to be *hadoop*.? So that the directory
> permissions fall in line accordingly?
>
> Can you please help me understand this.? Could not find any write up on
> this.
>
> On Thu, Feb 14, 2019 at 11:13 PM Prabhu Josephraj <pjos...@cloudera.com>
> wrote:
>
>> In case of Distributed Shell Job - ApplicationMaster runs in normal linux
>> container and the subsequent shell command runs inside Docker
>> container. The job fails even before launching AM, that is before
>> starting Docker Container. I think the Distributed Shell job will fail even
>> without Docker Settings.
>>
>> As per the error code 20 , it is mostly related to accessing of NM local
>> directory.
>>
>> https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_sg_yarn_container_exec_errors.html
>>
>> 20
>>
>> INITIALIZE_USER_FAILED
>>
>> Couldn't get, stat, or secure the per-user NodeManager directory.
>>
>> Can we try below steps on (all) NodeManager machine.
>>
>> Remove all contents under /data/yarn and make sure the /data and
>> /data/yarn directory permission is 755 with owner root:root and local
>> directory is owned by yarn:hadoop.
>>
>> [root@tparimi-tarunhdp26-4 ~]# ls -lrt /
>> drwxr-xr-x. 5 root root 44 Oct 24 11:47 data
>>
>> [root@tparimi-tarunhdp26-4 ~]# ls -lrt /data/
>> drwxr-xr-x. 4 root root 28 Oct 24 14:30 yarn
>>
>> [root@tparimi-tarunhdp26-4 ~]# ls -lrt /data/yarn/
>> total 4
>> drwxr-xr-x. 5 yarn hadoop 54 Feb 14 17:32 local
>> drwxrwxr-x. 10 yarn hadoop 4096 Feb 14 17:32 log
>>
>> And also check if Distributed Shell jobs runs fine without Docker
>> Settings.
>>
>> On Thu, Feb 14, 2019 at 10:15 PM Vinay Kashyap <vinu.k...@gmail.com>
>> wrote:
>>
>>> Hi Prabhu,
>>>
>>> Thanks for your reply.
>>> I tried the configurations as per your suggestion. But I get the
>>> same error.
>>> Is this related to container localization by any chance?.
>>> Also, is there any log or out information which says that the docker
>>> container runtime has been picked up.?
>>>
>>> On Thu, Feb 14, 2019 at 9:38 PM Prabhu Josephraj <pjos...@cloudera.com>
>>> wrote:
>>>
>>>> Hi Vinay,
>>>>
>>>> Can you try specifying below configs under Docker section in
>>>> container-executor.cfg which will allow Docker Containers to use the NM
>>>> Local Dirs.
>>>>
>>>> docker.allowed.ro-mounts=/data/yarn/local,,/usr/jdk64/jdk1.8.0_112/bin
>>>> docker.allowed.rw-mounts=/data/yarn/local,/data/yarn/log
>>>>
>>>> Thanks,
>>>> Prabhu Joseph
>>>>
>>>> On Thu, Feb 14, 2019 at 9:28 PM Vinay Kashyap <vinu.k...@gmail.com>
>>>> wrote:
>>>>
>>>>> I am using Hadoop 3.2.0 and trying to run a simple application in a
>>>>> docker container and I have made the required configuration changes both
>>>>> in *yarn-site.xml* and *container-executor.cfg* to choose
>>>>> LinuxContainerExecutor and docker runtime.
>>>>>
>>>>> I use the example of distributed shell in one of the hortonworks blog.
>>>>> https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/
>>>>>
>>>>> The problem I face here is when the application is submitted to YARN
>>>>> it fails with a reason related to directory creation issue with the below
>>>>> error
>>>>>
>>>>> 2019-02-14 20:51:16,450 INFO distributedshell.Client: Got application
>>>>> report from ASM for, appId=2, clientToAMToken=null,
>>>>> appDiagnostics=Application application_1550156488785_0002 failed 2 times
>>>>> due to AM Container for appattempt_1550156488785_0002_000002 exited with
>>>>> exitCode: -1000 Failing this attempt.Diagnostics: [2019-02-14
>>>>> 20:51:16.282]Application application_1550156488785_0002 initialization
>>>>> failed (exitCode=20) with output: main : command provided 0 main : user is
>>>>> myuser main : requested yarn user is myuser Failed to create directory
>>>>> /data/yarn/local/nmPrivate/container_1550156488785_0002_02_000001.tokens/usercache/myuser
>>>>> - Not a directory
>>>>>
>>>>> I have configured *yarn.nodemanager.local-dirs* in yarn-site.xml and
>>>>> I can see the same reflected in YARN web ui *localhost:8088/conf*
>>>>>
>>>>> <property>
>>>>>   <name>yarn.nodemanager.local-dirs</name>
>>>>>   <value>/data/yarn/local</value>
>>>>>   <final>false</final>
>>>>>   <source>yarn-site.xml</source>
>>>>> </property>
>>>>>
>>>>> I do not understand why is it trying to create usercache dir inside
>>>>> the nmPrivate directory.
>>>>>
>>>>> Note : I have verified the permissions for myuser to the directories
>>>>> and also have tried clearing the directories manually as suggested in a
>>>>> related post. But no fruit. I do not see any additional information about
>>>>> container launch failure in any other logs.
>>>>>
>>>>> How do I debug why the usercache dir is not resolved properly??
>>>>>
>>>>> Really appreciate any help on this.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Vinay Kashyap
>>>
>>> --
>>> *Thanks and regards*
>>> *Vinay Kashyap*
>
> --
> *Thanks and regards*
> *Vinay Kashyap*

--
*Thanks and regards*
*Vinay Kashyap*