Hi there, I've got a three-node setup with hadoop-1.0.4: one node acts as both master and slave, and the other two as slaves. Running jps shows the datanode etc. are all running properly. Passwordless SSH from the master to the other machines also works.
I tried to run the wordcount example, and http://localhost:50030/machines.jsp?type=active showed that only the master was running tasks. Then I tried to run the pi estimator with arguments 10 100. This time the other machines did run tasks, but the job just couldn't get through the reduce phase. The log in the terminal looks like this:

14/04/08 21:53:40 INFO mapred.FileInputFormat: Total input paths to process : 10
14/04/08 21:53:41 INFO mapred.JobClient: Running job: job_201404082136_0003
14/04/08 21:53:42 INFO mapred.JobClient: map 0% reduce 0%
14/04/08 21:53:55 INFO mapred.JobClient: map 60% reduce 0%
14/04/08 21:54:01 INFO mapred.JobClient: map 100% reduce 0%
14/04/08 21:54:04 INFO mapred.JobClient: map 100% reduce 6%
14/04/08 21:54:42 INFO mapred.JobClient: Task Id : attempt_201404082136_0003_m_000002_0, Status : FAILED
Too many fetch-failures
14/04/08 21:54:42 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000002_0&filter=stdout
14/04/08 21:54:42 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000002_0&filter=stderr
14/04/08 21:54:46 INFO mapred.JobClient: map 89% reduce 6%
14/04/08 21:54:49 INFO mapred.JobClient: map 100% reduce 6%
14/04/08 21:55:41 INFO mapred.JobClient: Task Id : attempt_201404082136_0003_m_000003_0, Status : FAILED
Too many fetch-failures
14/04/08 21:55:41 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000003_0&filter=stdout
14/04/08 21:55:41 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000003_0&filter=stderr
14/04/08 21:55:46 INFO mapred.JobClient: map 89% reduce 6%
14/04/08 21:55:49 INFO mapred.JobClient: map 100% reduce 6%
14/04/08 21:56:09 INFO mapred.JobClient: map 100% reduce 13%
14/04/08 21:56:44 INFO mapred.JobClient: Task Id : attempt_201404082136_0003_m_000008_0, Status : FAILED
Too many fetch-failures
14/04/08 21:56:44 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000008_0&filter=stdout
14/04/08 21:56:44 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000008_0&filter=stderr
14/04/08 21:56:49 INFO mapred.JobClient: map 89% reduce 13%
14/04/08 21:56:52 INFO mapred.JobClient: map 100% reduce 13%
14/04/08 21:57:45 INFO mapred.JobClient: Task Id : attempt_201404082136_0003_m_000009_0, Status : FAILED
Too many fetch-failures
14/04/08 21:57:45 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000009_0&filter=stdout
14/04/08 21:57:45 WARN mapred.JobClient: Error reading task outputhttp://ubuntu:50060/tasklog?plaintext=true&attemptid=attempt_201404082136_0003_m_000009_0&filter=stderr
14/04/08 21:57:50 INFO mapred.JobClient: map 89% reduce 13%
14/04/08 21:57:53 INFO mapred.JobClient: map 100% reduce 13%

Due to scheduling issues, we are using the machine named slave2 as the master, and the machines named slave3 and master as slaves. My configurations are as follows.

core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://slave2:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>slave2:9001</value>
  </property>
  <property>
    <name>mapred.child.tmp</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

Any idea why the examples can't run properly? Thanks.
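
For anyone trying to reproduce this: the task log URLs in the output use the hostname ubuntu, while the configuration files use slave2. Whether that mismatch matters here is only a guess, but a quick way to see how each name resolves on a given node might look like the sketch below. It assumes Python 3 is available on the nodes; the function name check_hosts and the injectable resolver argument are illustrative and not part of Hadoop.

```python
import ipaddress
import socket


def check_hosts(hosts, resolver=socket.gethostbyname):
    """Map each hostname to (address, note); address is None if lookup fails."""
    results = {}
    for host in hosts:
        try:
            addr = resolver(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = (None, f"does not resolve: {exc}")
            continue
        if ipaddress.ip_address(addr).is_loopback:
            # A 127.x.x.x address is only meaningful on the local machine.
            note = "loopback: other nodes cannot reach this address"
        else:
            note = "ok"
        results[host] = (addr, note)
    return results


if __name__ == "__main__":
    # Hostnames taken from the log and configuration in this post.
    names = ["master", "slave2", "slave3", "ubuntu"]
    for host, (addr, note) in check_hosts(names).items():
        print(f"{host:8} {addr or '-':15} {note}")
```

Running it on every node and comparing the results would show whether each machine sees the same, non-loopback address for each name. The resolver parameter exists only so the function can be exercised without a real network.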