Thanks Jeremy. I tried your first suggestion and the mappers ran to completion, but the reducers then failed with another pipe-related exception. I believe it may be due to permission issues again. I tried setting a few additional config parameters, but that did not help. Please find the command used and the error log from the JobTracker web UI below.
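For reference, both scripts are the standard streaming word-count samples. Since exit status 127 is the shell's "command not found" code, the "subprocess failed with code 127" message usually means the task could not launch the reducer executable at all, which is why the interpreter line at the top of the script matters. A rough sketch of the reducer pattern I am following (the usual word-count sample; not necessarily a verbatim copy of my WcStreamReduce.py):

#!/usr/bin/env python
# Sketch of the standard streaming word-count reducer pattern
# (assumed shape of WcStreamReduce.py, not a verbatim copy).
# Reads "word<TAB>count" pairs, already sorted by word, from stdin
# and writes one "word<TAB>total" line per word to stdout.
import sys

current_word = None
current_count = 0

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    word, count = line.split('\t', 1)
    try:
        count = int(count)
    except ValueError:
        continue  # skip malformed counts
    if word == current_word:
        current_count += count
    else:
        if current_word is not None:
            print('%s\t%d' % (current_word, current_count))
        current_word = word
        current_count = count

# emit the last word
if current_word is not None:
    print('%s\t%d' % (current_word, current_count))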
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
  -D hadoop.tmp.dir=/home/streaming/tmp/hadoop/ \
  -D dfs.data.dir=/home/streaming/tmp \
  -D mapred.local.dir=/home/streaming/tmp/local \
  -D mapred.system.dir=/home/streaming/tmp/system \
  -D mapred.temp.dir=/home/streaming/tmp/temp \
  -input /userdata/bejoy/apps/wc/input \
  -output /userdata/bejoy/apps/wc/output \
  -mapper /home/streaming/WcStreamMap.py \
  -reducer /home/streaming/WcStreamReduce.py

java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
        at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:478)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)

The folder permissions at the time of job execution are as follows:

cloudera@cloudera-vm:~$ ls -l /home/streaming/
drwxrwxrwx 5 root root 4096 2011-09-12 05:59 tmp
-rwxrwxrwx 1 root root  707 2011-09-11 23:42 WcStreamMap.py
-rwxrwxrwx 1 root root 1077 2011-09-11 23:42 WcStreamReduce.py

cloudera@cloudera-vm:~$ ls -l /home/streaming/tmp/
drwxrwxrwx 2 root root 4096 2011-09-12 06:12 hadoop
drwxrwxrwx 2 root root 4096 2011-09-12 05:58 local
drwxrwxrwx 2 root root 4096 2011-09-12 05:59 system
drwxrwxrwx 2 root root 4096 2011-09-12 05:59 temp

Am I missing something here? I haven't been working with Linux for long, so I couldn't try your second suggestion of setting up the Linux task controller.

Thanks a lot.

Regards
Bejoy.K.S

On Mon, Sep 12, 2011 at 6:20 AM, Jeremy Lewi <jer...@lewi.us> wrote:
> I would suggest you try putting your mapper/reducer py files in a directory
> that is world readable at every level, i.e. /tmp/test. I had similar
> problems when I was using streaming, and I believe my workaround was to put
> the mapper/reducers outside my home directory. The other, more involved
> alternative is to set up the Linux task controller so you can run your MR
> jobs as the user who submits the jobs.
>
> J
>
>
> On Mon, Sep 12, 2011 at 2:18 AM, Bejoy KS <bejoy.had...@gmail.com> wrote:
>
>> Hi
>> I wanted to try out Hadoop streaming and got the sample Python code for
>> the mapper and reducer. I copied both onto my LFS (local file system) and
>> tried running the streaming job as mentioned in the documentation.
>> Here is the command I used to run the job:
>>
>> hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
>>   -input /userdata/bejoy/apps/wc/input \
>>   -output /userdata/bejoy/apps/wc/output \
>>   -mapper /home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py \
>>   -reducer /home/cloudera/bejoy/apps/inputs/wc/WcStreamReduce.py
>>
>> Here, other than the input and output, everything else is on LFS locations.
>> However, the job is failing.
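>> As a side note, I am pointing -mapper and -reducer at the scripts' full
>> local paths. I have not yet tried the streaming -file option, which ships
>> the scripts with the job so they can be referred to by name; as far as I
>> understand, that would look roughly like this:
>>
>> hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
>>   -input /userdata/bejoy/apps/wc/input \
>>   -output /userdata/bejoy/apps/wc/output \
>>   -mapper WcStreamMap.py \
>>   -reducer WcStreamReduce.py \
>>   -file /home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py \
>>   -file /home/cloudera/bejoy/apps/inputs/wc/WcStreamReduce.py
>>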
>> The error log from the JobTracker URL is as follows:
>>
>> java.lang.RuntimeException: Error in configuring object
>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: java.lang.reflect.InvocationTargetException
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>         ... 9 more
>> Caused by: java.lang.RuntimeException: Error in configuring object
>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>         ... 14 more
>> Caused by: java.lang.reflect.InvocationTargetException
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>         ... 17 more
>> Caused by: java.lang.RuntimeException: configuration exception
>>         at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
>>         at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
>>         ... 22 more
>> Caused by: java.io.IOException: Cannot run program "/home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py": java.io.IOException: error=13, Permission denied
>>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>>         at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
>>         ... 23 more
>> Caused by: java.io.IOException: java.io.IOException: error=13, Permission denied
>>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>>         at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>>         ... 24 more
>>
>> Based on the error, I checked the permissions of the mapper and reducer and
>> issued a chmod 777 on them as well. Still no luck.
>>
>> The permissions of the files are as follows:
>>
>> cloudera@cloudera-vm:~$ ls -l /home/cloudera/bejoy/apps/inputs/wc/
>> -rwxrwxrwx 1 cloudera cloudera  707 2011-09-11 23:42 WcStreamMap.py
>> -rwxrwxrwx 1 cloudera cloudera 1077 2011-09-11 23:42 WcStreamReduce.py
>>
>> I'm testing this on the Cloudera Demo VM, so the Hadoop setup is in
>> pseudo-distributed mode. Any help would be highly appreciated.
>>
>> Thank You
>>
>> Regards
>> Bejoy.K.S