Re: Another newbie - problem with grep example

2011-12-29 Thread patf


I got hadoop 0.22.0 running with Windows.  The most useful instructions 
I found were these:


http://knowlspace.wordpress.com/2011/06/21/setting-up-hadoop-on-windows/

I was able to run examples grep, pi and WordCount.

Saw that 1.0 just came out.  Downloaded it, installed it, and tried it 
out.  I don't even get as far as the first example: the tasktracker won't run:


2011-12-29 14:41:27,798 ERROR org.apache.hadoop.mapred.TaskTracker: 
Can not start task tracker because java.io.IOException: Failed to set 
permissions of path: \tmp\hadoop-cyg_server\mapred\local\ttprivate to 0700


The above may be this bug:  
https://issues.apache.org/jira/browse/HADOOP-7682
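For what it's worth, a quick sanity check I can think of (just a sketch, not a fix) is to reproduce the chmod the TaskTracker attempts, from a cygwin shell, and see whether a 0700 mode can be set on that filesystem at all.  The path below is copied from the error message above:

```shell
# Try to create the TaskTracker's private dir by hand and set the
# same 0700 mode it asks for.  If this fails, the problem is the
# cygwin/NTFS permission mapping, not hadoop itself.
dir=/tmp/hadoop-cyg_server/mapred/local/ttprivate
mkdir -p "$dir"
chmod 700 "$dir"
ls -ld "$dir"    # should show drwx------ if the mode took
```

If the mode sticks here but the TaskTracker still fails, that points back at how the daemon (running as a different user) sees those permissions.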


I can see that it has something to do with permissions between my user 
id and the account cyg_server that seems part of the standard way to get 
sshd running on Windows.


The most confusing thing is: if this bug is the cause, why did 
I get as far as I did with 0.22.0?


Went back to 0.22.0 and it still runs fine.

People claim that computers are deterministic machines - which is not 
true.  We get them to work well enough for a great many applications - 
some even life-critical.  But in fact computers are complex systems 
where minute problems can cascade into complete failures of the system.


Pat

Re: Another newbie - problem with grep example

2011-12-24 Thread patf


Thanks Prashant.

That was yesterday, on Linux at work.  It's Sat; I'm at home; and 
trying out hadoop on Windows 7.  The hadoop Windows install was 
something of a pain-in-the-ass but I've got hadoop basically working 
(but not computing yet).


Neither the grep nor the pi example works.  Here's the error I see in 
my Windows installation (Windows 7 with a new cygwin install).


c:/PROGRA~2/Java/jdk1.6.0_25/bin/java -Xmx1000m 
-Dhadoop.log.dir=D:\downloads\hadoop\logs -Dhadoop.log.file=hadoop.log 
-Dhadoop.home.dir=D:\downloads\hadoop\ -Dhadoop.id.str= 
-Dhadoop.root.logger=INFO,console 
-Dhadoop.security.logger=INFO,console 
-Djava.library.path=/cygdrive/d/downloads/hadoop/lib/native/Windows_7-x86-32 
-Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true 
org.apache.hadoop.util.RunJar hadoop-mapred-examples-0.22.0.jar grep 
input output dfs[a-z.]+
11/12/24 15:03:27 WARN conf.Configuration: 
mapred.used.genericoptionsparser is deprecated. Instead, use 
mapreduce.client.genericoptionsparser.used
11/12/24 15:03:27 WARN mapreduce.JobSubmitter: No job jar file set.  
User classes may not be found. See Job or Job#setJar(String).
11/12/24 15:03:27 INFO input.FileInputFormat: Total input paths to 
process : 19

11/12/24 15:03:28 INFO mapreduce.JobSubmitter: number of splits:19
11/12/24 15:03:29 INFO mapreduce.Job: Running job: job_201112241459_0003
11/12/24 15:03:30 INFO mapreduce.Job:  map 0% reduce 0%
11/12/24 15:03:33 INFO mapreduce.Job: Task Id : 
attempt_201112241459_0003_m_20_0, Status : FAILED

Error initializing attempt_201112241459_0003_m_20_0:
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=cyg_server, access=EXECUTE, inode=system:MyId:supergroup:rwx--
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:161)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:128)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4465)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:4442)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2117)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.getFileInfo(NameNode.java:1022)

at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMeth
11/12/24 15:03:33 WARN mapreduce.Job: Error reading task output 
http://MyId-PC:50060/tasklog?plaintext=true&attemptid=attempt_201112241459_0003_m_20_0&filter=stdout
user cyg_server?  That's because I'm coming in under cygwin ssh.  Looks 
like I need to either change the privs in the hadoop file system or 
extend more privs to cyg_server.  Although there may be an error 
(the WARN) before I get to the cyg_server permissions crash?
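If it is the HDFS-side permissions, something along these lines might loosen them (a sketch, untested on this setup; /user/MyId is the inode named in the AccessControlException above):

```shell
# Let cyg_server traverse and read the user directory, or hand
# ownership over outright.  Run as the user that started the namenode
# (the HDFS superuser).
hadoop fs -chmod -R 755 /user/MyId
# or, more bluntly:
hadoop fs -chown -R cyg_server /user/MyId
```

755 keeps MyId as owner but gives everyone else read/execute, which is the EXECUTE access the error says cyg_server was denied.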


Pat


On 12/23/2011 5:21 PM, Prashant Kommireddi wrote:

Seems like you do not have /user/MyId/input/conf on HDFS.

Try this.

cd $HADOOP_HOME_DIR (this should be your hadoop root dir)
hadoop fs -put conf input/conf

And then run the MR job again.

-Prashant Kommireddi

On Fri, Dec 23, 2011 at 3:40 PM, Pat Flaherty p...@well.com wrote:


Hi,

Installed 0.22.0 on CentOS 5.7.  I can start dfs and mapred and see their
processes.

Ran the first grep example: bin/hadoop jar hadoop-*-examples.jar grep
input output 'dfs[a-z.]+'.  It seems the correct jar name is
hadoop-mapred-examples-0.22.0.jar - there are no other hadoop*examples*.jar
files in HADOOP_HOME.

Didn't work.  Then found and tried pi (compute pi) - that works, so my
installation is to some degree of approximation good.

Back to grep.  It fails with


java.io.FileNotFoundException: File does not exist: /user/MyId/input/conf

Found and ran bin/hadoop fs -ls.  OK these directory names are internal to
hadoop (I assume) because Linux has no idea of /user.
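To make that concrete (just a sketch): the same path name means two different things depending on who you ask.

```shell
hadoop fs -ls /user/MyId/input   # asks the namenode; /user exists in HDFS
ls /user                         # asks the local OS; likely "No such file or directory"
```

So a path can be perfectly valid inside HDFS while the host filesystem has never heard of it.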

And the directory is there - but the program is failing.

Any suggestions; where to start; etc?

Thanks - Pat