I got hadoop 0.22.0 running with Windows. The most useful instructions I found were these:

http://knowlspace.wordpress.com/2011/06/21/setting-up-hadoop-on-windows/

I was able to run the grep, pi and WordCount examples.
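For reference, the invocations I mean are along these lines (the jar name is from the 0.22.0 distribution; the pi arguments - number of maps and samples per map - are just my smoke-test values):

bin/hadoop jar hadoop-mapred-examples-0.22.0.jar grep input output 'dfs[a-z.]+'
bin/hadoop jar hadoop-mapred-examples-0.22.0.jar pi 10 100
bin/hadoop jar hadoop-mapred-examples-0.22.0.jar wordcount input output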

Saw that 1.0 just came out. Downloaded it, installed it, and tried it out. I don't even get as far as the first example; the tasktracker won't run:

2011-12-29 14:41:27,798 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: \tmp\hadoop-cyg_server\mapred\local\ttprivate to 0700

The above may be this bug: https://issues.apache.org/jira/browse/HADOOP-7682

I can see that it has something to do with permissions between my user id and the cyg_server account, which seems to be part of the standard way to get sshd running on Windows.
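A sanity check I can do from the cygwin side is to look at what ownership and mode cygwin actually reports for the directory in the error (a diagnostic sketch only - the path is copied from the log above, and whether cygwin's /tmp maps to the same directory the daemon is writing may itself be part of the problem):

ls -ld /tmp/hadoop-cyg_server/mapred/local/ttprivate    # who owns it, and what mode cygwin sees
chmod 700 /tmp/hadoop-cyg_server/mapred/local/ttprivate # will likely fail unless run as the owner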

The most confusing thing is: if this bug is the cause, why did I get as far as I did with 0.22.0?

Went back to 0.22.0 and it still runs fine.

People claim that computers are deterministic machines - which is not true. We get them to work well enough for a great many applications - some even life-critical. But in fact computers are complex systems where minute problems can cascade into complete failures of the system.

Pat

On 12/24/2011 3:56 PM, patf wrote:

Thanks Prashant.

That was yesterday, on Linux at work. It's Saturday; I'm at home, trying out hadoop on Windows 7. The hadoop Windows install was something of a pain-in-the-ass, but I've got hadoop basically working (just not computing yet).

Neither the grep nor the pi example works. Here's the error I see in my Windows installation (Windows 7 with a new cygwin install).

c:/PROGRA~2/Java/jdk1.6.0_25/bin/java -Xmx1000m -Dhadoop.log.dir=D:\downloads\hadoop\logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=D:\downloads\hadoop\ -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.security.logger=INFO,console -Djava.library.path=/cygdrive/d/downloads/hadoop/lib/native/Windows_7-x86-32 -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true org.apache.hadoop.util.RunJar hadoop-mapred-examples-0.22.0.jar grep input output dfs[a-z.]+
11/12/24 15:03:27 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11/12/24 15:03:27 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
11/12/24 15:03:27 INFO input.FileInputFormat: Total input paths to process : 19
11/12/24 15:03:28 INFO mapreduce.JobSubmitter: number of splits:19
11/12/24 15:03:29 INFO mapreduce.Job: Running job: job_201112241459_0003
11/12/24 15:03:30 INFO mapreduce.Job:  map 0% reduce 0%
11/12/24 15:03:33 INFO mapreduce.Job: Task Id : attempt_201112241459_0003_m_000020_0, Status : FAILED
Error initializing attempt_201112241459_0003_m_000020_0:
org.apache.hadoop.security.AccessControlException: Permission denied: user=cyg_server, access=EXECUTE, inode="system":MyId:supergroup:rwx------
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:161)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:128)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4465)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:4442)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2117)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.getFileInfo(NameNode.java:1022)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMeth
11/12/24 15:03:33 WARN mapreduce.Job: Error reading task output http://MyId-PC:50060/tasklog?plaintext=true&attemptid=attempt_201112241459_0003_m_000020_0&filter=stdout
User cyg_server? That's because I'm coming in under cygwin ssh. Looks like I need to either change the privs in the hadoop file system or extend more privs to cyg_server. Although there may already be an error (the WARN above) before I even get to the cyg_server permissions crash?
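If I go the first route, something like this might do it (a sketch only: the /system path is inferred from inode="system" in the trace, and the commands would need to run as MyId, the owner shown there):

bin/hadoop fs -ls /                     # confirm where the "system" inode actually lives
bin/hadoop fs -chmod -R 755 /system     # let others traverse/read past the rwx------

The alternative would be adding cyg_server to the supergroup, or just submitting the job as MyId instead of through the cygwin ssh session.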

Pat


On 12/23/2011 5:21 PM, Prashant Kommireddi wrote:
Seems like you do not have "/user/MyId/input/conf" on HDFS.

Try this.

cd $HADOOP_HOME_DIR    # this should be your hadoop root dir
hadoop fs -put conf input/conf

And then run the MR job again.
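End to end, assuming the standard 0.22.0 layout with the examples jar in the hadoop root, that's something like:

cd $HADOOP_HOME_DIR
hadoop fs -put conf input/conf
hadoop fs -ls input/conf     # sanity check: the conf files should now be listed
bin/hadoop jar hadoop-mapred-examples-0.22.0.jar grep input output 'dfs[a-z.]+'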

-Prashant Kommireddi

On Fri, Dec 23, 2011 at 3:40 PM, Pat Flaherty <p...@well.com> wrote:

Hi,

Installed 0.22.0 on CentOS 5.7. I can start dfs and mapred and see their
processes.
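A quick way to check, assuming the JDK's jps tool is on the PATH:

jps    # should list NameNode, DataNode, JobTracker and TaskTracker, among others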

Ran the first grep example: bin/hadoop jar hadoop-*-examples.jar grep
input output 'dfs[a-z.]+'.  It seems the correct jar name is
hadoop-mapred-examples-0.22.0.jar - there are no other hadoop*examples*.jar
files in HADOOP_HOME.
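A quick way to confirm that on your own tree:

ls $HADOOP_HOME/hadoop-*examples*.jar    # only hadoop-mapred-examples-0.22.0.jar turns up here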

Didn't work. Then I found and tried pi (compute pi) - that works, so my
installation is, to some degree of approximation, good.

Back to grep.  It fails with

java.io.FileNotFoundException: File does not exist: /user/MyId/input/conf

Found and ran bin/hadoop fs -ls. OK, these directory names are internal to
hadoop (I assume), because Linux has no idea of /user.
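If I understand it right, relative paths in fs commands resolve against an HDFS home directory of /user/<your id>, so these two should mean the same thing (assuming the default setup):

bin/hadoop fs -ls input/conf
bin/hadoop fs -ls /user/MyId/input/conf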

And the directory is there - but the program is failing.

Any suggestions on where to start, etc.?

Thanks - Pat


