I am bringing up a Hadoop cluster for the first time (but am an experienced 
sysadmin with lots of cluster experience) and running into an issue with 
permissions on mapred.system.dir. It has generally been a chore to figure out 
all the various directories that need to be created to get Hadoop working 
(some on the local FS, others within HDFS), getting the right ownership and 
permissions, etc. I think I am mostly there, but can't seem to get past my 
current issue with mapred.system.dir.

Some general info first:
OS: RHEL6
Hadoop version: hadoop-1.0.3-1.x86_64

20-node cluster configured as follows:
1 node as primary namenode
1 node as secondary namenode + job tracker
18 nodes as datanode + tasktracker

I have HDFS up and running and have the following in mapred-site.xml:
<property>
  <name>mapred.system.dir</name>
  <value>hdfs://hadoop1/mapred</value>
  <description>Shared data for JT - this must be in HDFS</description>
</property>
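
As an aside, for a fully qualified value like hdfs://hadoop1/mapred to land on 
the same HDFS the rest of the cluster uses, "hadoop1" would typically also be 
the authority in fs.default.name. A hedged sketch of the matching core-site.xml 
entry (the hostname is taken from the value above; everything else is an 
assumption):

```xml
<!-- core-site.xml (sketch): "hadoop1" assumed to be the namenode host,
     matching the authority used in mapred.system.dir -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop1</value>
</property>
```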

I have created this directory in HDFS with owner mapred:hadoop and permissions 
700, which seems to be the most common recommendation among the many, often 
conflicting, articles on how to set up Hadoop. Here is the top level of my 
filesystem:
hyperion-hdp4@hdfs:hadoop fs -ls /
Found 3 items
drwx------   - mapred hadoop          0 2012-10-09 12:58 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user

Note: it doesn't seem to matter what permissions I set on /mapred, since the 
JobTracker changes them back to 700 when it starts up.

However, when I try to run the hadoop example teragen program as a "regular" 
user I am getting this error:
hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen 
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
Generating 10000000000 using 2 maps with step of 5000000000
12/10/09 16:27:02 INFO mapred.JobClient: Running job: job_201210072045_0003
12/10/09 16:27:03 INFO mapred.JobClient:  map 0% reduce 0%
12/10/09 16:27:03 INFO mapred.JobClient: Job complete: job_201210072045_0003
12/10/09 16:27:03 INFO mapred.JobClient: Counters: 0
12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job initialization failed:
org.apache.hadoop.security.AccessControlException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
<rest of stack trace omitted>

This seems to be saying that it is trying to write to the HDFS /mapred 
directory as me (robing), rather than as mapred, the user under which the 
jobtracker and tasktracker run.

To verify this is what is happening, I manually changed the permissions on 
/mapred from 700 to 755, since the error says it wants EXECUTE access:
hyperion-hdp4@mapred:hadoop fs -chmod 755 /mapred
hyperion-hdp4@mapred:hadoop fs -ls /
Found 3 items
drwxr-xr-x   - mapred hadoop          0 2012-10-09 12:58 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
hyperion-hdp4@mapred:

Now I try running again, and it fails again, this time complaining that it 
wants WRITE access to /mapred:
hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen 
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
Generating 10000000000 using 2 maps with step of 5000000000
12/10/09 16:31:29 INFO mapred.JobClient: Running job: job_201210072045_0005
12/10/09 16:31:30 INFO mapred.JobClient:  map 0% reduce 0%
12/10/09 16:31:30 INFO mapred.JobClient: Job complete: job_201210072045_0005
12/10/09 16:31:30 INFO mapred.JobClient: Counters: 0
12/10/09 16:31:30 INFO mapred.JobClient: Job Failed: Job initialization failed:
org.apache.hadoop.security.AccessControlException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=robing, access=WRITE, inode="mapred":mapred:hadoop:rwxr-xr-x
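
As an aside, these two denials line up with ordinary POSIX directory 
semantics, which HDFS permissions mimic: EXECUTE on a directory is needed to 
traverse into it, and WRITE is needed to create entries in it. A quick local 
sketch of the three modes in play (plain Linux, no Hadoop involved; the temp 
directory is just for illustration):

```shell
# Illustrate the three directory modes from this thread on a local FS.
demo=$(mktemp -d)

chmod 700 "$demo"
stat -c '%a %A' "$demo"   # 700 drwx------ : group/other lack EXECUTE,
                          # so a non-owner cannot even traverse into it

chmod 755 "$demo"
stat -c '%a %A' "$demo"   # 755 drwxr-xr-x : anyone can traverse/list,
                          # but only the owner can WRITE (create entries)

chmod 777 "$demo"
stat -c '%a %A' "$demo"   # 777 drwxrwxrwx : anyone can traverse and write

rmdir "$demo"
```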

So I changed the permissions on /mapred to 777:
hyperion-hdp4@mapred:hadoop fs -chmod 777 /mapred
hyperion-hdp4@mapred:hadoop fs -ls /
Found 3 items
drwxrwxrwx   - mapred hadoop          0 2012-10-09 12:58 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
hyperion-hdp4@mapred:

And then I run again and this time it works.
hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen 
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
Generating 10000000000 using 2 maps with step of 5000000000
12/10/09 16:33:02 INFO mapred.JobClient: Running job: job_201210072045_0006
12/10/09 16:33:03 INFO mapred.JobClient:  map 0% reduce 0%
12/10/09 16:34:34 INFO mapred.JobClient:  map 1% reduce 0%
12/10/09 16:35:52 INFO mapred.JobClient:  map 2% reduce 0%
etc…

And indeed I can see that files have been written to /mapred under my userid:
# hyperion-hdp4 /root > hadoop fs -ls /mapred
Found 2 items
drwxrwxrwx   - robing hadoop          0 2012-10-09 16:33 
/mapred/job_201210072045_0006
-rw-------   2 mapred hadoop          4 2012-10-09 12:58 /mapred/jobtracker.info

However, manually setting the permissions to 777 is not a workable solution, 
since any time I restart the jobtracker, it sets the permissions on /mapred 
back to 700:
# hyperion-hdp3 /root > hadoop fs -ls /
Found 3 items
drwxrwxrwx   - mapred hadoop          0 2012-10-09 16:33 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
# hyperion-hdp3 /root > /etc/init.d/hadoop-jobtracker restart
Stopping Hadoop jobtracker daemon (hadoop-jobtracker): stopping jobtracker
                                                           [  OK  ]
Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting jobtracker, 
logging to /var/log/hadoop/mapred/hadoop-mapred-jobtracker-hyperion-hdp3.out
                                                           [  OK  ]
# hyperion-hdp3 /root > hadoop fs -ls /
Found 3 items
drwx------   - mapred hadoop          0 2012-10-09 16:38 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
# hyperion-hdp3 /root >

So my questions are:

  1.  What are the right permissions on mapred.system.dir?
  2.  If not 700, how do I get the job tracker to stop changing them to 700?
  3.  If 700 is correct, then what am I doing wrong in my attempt to run the 
example teragen program?

Thank you in advance.
Robin Goldstone, LLNL
