I have encountered the same problem when running MapReduce code under a different user account. This issue was brought up on the core-dev mailing list, but I didn't see any workaround or solution there, so I would like to raise the topic again to gather some input.
Sorry for cross-posting, but I am not sure whether this also belongs on the core-user mailing list.

For example, assume I have installed Hadoop under the account 'hadoop' and I am going to run my program as user 'test'. I created the input folder /user/test/input/ as user 'test', with permissions set to 0775:

    /user/test/input    <dir>   2008-02-27 01:20    rwxr-xr-x   test    hadoop

When I run the MapReduce job, the output directory I specify ends up owned by user 'hadoop' instead of 'test':

    ${HADOOP_HOME}/bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" "/user/test/output/"

The directory "/user/test/output/" gets the following permissions and user:group:

    /user/test/output   <dir>   2008-02-27 03:53    rwxr-xr-x   hadoop  hadoop

My question is: why is the output folder owned by the superuser 'hadoop'? Because of this ownership, the MapReduce tasks cannot write to the folder, since rwxr-xr-x does not grant user 'test' write access. The output folder gets created, but user 'test' cannot write anything into it, and the job therefore throws an exception (see below).

I have been looking for a way around this but cannot find an exact answer. How do I get newly created directories to default to 0775? I could add user 'test' to group 'hadoop' so that 'test' gets write access to folders in the 'hadoop' group. In other words, as long as a folder is set to 'rwxrwxr-x' and owned by 'hadoop:hadoop', user 'test' could read from and write to it.

Any idea how I can set or modify the global default umask for Hadoop? Or do I have to override the default permissions every time in my configuration or FileSystem calls?
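The closest thing to a knob I have found so far is a dfs.umask property, but I am not certain it is the intended way to solve this (or even that I have the property name and value format right), so treat the snippet below as a guess based on my reading of the permissions docs, to go in hadoop-site.xml:

```xml
<!-- Guessed property: dfs.umask. A umask of 002 should yield 775
     (777 & ~002) for new directories. I am not sure whether the
     value is parsed as octal or decimal, so corrections welcome. -->
<property>
  <name>dfs.umask</name>
  <value>002</value>
  <description>The umask applied when creating files and directories in HDFS.</description>
</property>
```

Even if that works, it would only change the mode bits; the output directory would presumably still be owned by hadoop:hadoop, so the group-membership approach above would still be needed.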
======= COPY/PASTE STARTS HERE =======
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.fs.permission.AccessControlException: Permission denied: user=test, access=WRITE, inode="_task_200802262256_0007_r_000001_1":hadoop:hadoop:rwxr-xr-x
    at org.apache.hadoop.dfs.PermissionChecker.check(PermissionChecker.java:173)
    at org.apache.hadoop.dfs.PermissionChecker.check(PermissionChecker.java:154)
    at org.apache.hadoop.dfs.PermissionChecker.checkPermission(PermissionChecker.java:102)
    at org.apache.hadoop.dfs.FSNamesystem.checkPermission(FSNamesystem.java:4035)
    at org.apache.hadoop.dfs.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4005)
    at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:963)
    at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:938)
    at org.apache.hadoop.dfs.NameNode.create(NameNode.java:281)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:409)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:899)
    at org.apache.hadoop.ipc.Client.call(Client.java:512)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
    at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1927)
    at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:382)
    at org.apache.hadoop.dfs.DistributedFileSystem.create(DistributedFileSystem.java:135)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:436)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:336)
    at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:308)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2089)
======= COPY/PASTE ENDS HERE =======
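P.S. To make sure I am using the terminology correctly: when I talk about wanting 0775 above, I mean a default *mode* of 0775, which corresponds to a *umask* of 0002, since the umask is the set of permission bits masked out of the maximum mode. A quick sketch of the arithmetic (the function name is just for illustration):

```python
# A umask removes bits from the maximum mode (0777 for directories).
FULL_DIR_MODE = 0o777

def mode_after_umask(umask):
    """Return the mode a new directory gets under the given umask."""
    return FULL_DIR_MODE & ~umask

print(oct(mode_after_umask(0o002)))  # 0o775 -> rwxrwxr-x, what I want
print(oct(mode_after_umask(0o022)))  # 0o755 -> rwxr-xr-x, what I am getting
```

So what I am really after is a way to make Hadoop apply a umask of 002 instead of the 022 it seems to be using.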