Looks like the reduce task is not able to fetch the map output from the
other machine. My guess is that the reduce task is able to pull data
from the same machine, making progress up to 16%, but fails to get the
data from the other machine. This could be a firewall issue. Is it
possible for you to post the reduce task's logs, and also the logs of
the tasktracker where the reducer failed? The reducer failed trying to
fetch the map data from the remote machine. This data is represented by
a URL. Try fetching it manually from the reducer's machine and let us
know what happens.
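If it helps, here is a rough sketch of such a manual fetch in Java. The
host, port, and query string below are placeholders only -- copy the
exact map-output URL from the failed reducer's log (by default the
tasktracker serves map output over HTTP on port 50060):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class FetchMapOutput {
        public static void main(String[] args) throws Exception {
            // Placeholder URL -- substitute the real one from the
            // failed reducer's log.
            URL url = new URL(args.length > 0 ? args[0]
                    : "http://slave-host:50060/mapOutput?...");
            HttpURLConnection conn =
                    (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(10000);
            conn.setReadTimeout(10000);
            System.out.println("HTTP response: " + conn.getResponseCode());

            // Count the bytes we can actually read. A connect timeout or
            // "connection refused" here usually points at a firewall or
            // hostname-resolution problem between the two nodes.
            InputStream in = conn.getInputStream();
            byte[] buf = new byte[8192];
            long total = 0;
            for (int n; (n = in.read(buf)) != -1; ) {
                total += n;
            }
            in.close();
            System.out.println("Fetched " + total + " bytes");
        }
    }

If this hangs or is refused when run on the reducer's machine, the
shuffle will fail the same way.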
Amar
chanel wrote:
Hey everyone,
I'm trying to get the hang of using Hadoop and I'm using the Michael
Noll Ubuntu tutorials
(http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)).
Using the wordcount example that comes with version 0.17.1-dev, I get
this error output:
08/06/14 15:17:45 INFO mapred.FileInputFormat: Total input paths to process : 6
08/06/14 15:17:46 INFO mapred.JobClient: Running job: job_200806141506_0003
08/06/14 15:17:47 INFO mapred.JobClient: map 0% reduce 0%
08/06/14 15:17:53 INFO mapred.JobClient: map 12% reduce 0%
08/06/14 15:17:54 INFO mapred.JobClient: map 25% reduce 0%
08/06/14 15:17:55 INFO mapred.JobClient: map 37% reduce 0%
08/06/14 15:17:57 INFO mapred.JobClient: map 50% reduce 0%
08/06/14 15:17:58 INFO mapred.JobClient: map 75% reduce 0%
08/06/14 15:18:00 INFO mapred.JobClient: map 100% reduce 0%
08/06/14 15:18:03 INFO mapred.JobClient: map 100% reduce 1%
08/06/14 15:18:09 INFO mapred.JobClient: map 100% reduce 13%
08/06/14 15:18:16 INFO mapred.JobClient: map 100% reduce 18%
08/06/14 15:20:49 INFO mapred.JobClient: Task Id : task_200806141506_0003_m_000001_0, Status : FAILED
Too many fetch-failures
08/06/14 15:20:51 INFO mapred.JobClient: map 87% reduce 18%
08/06/14 15:20:52 INFO mapred.JobClient: map 100% reduce 18%
08/06/14 15:20:56 INFO mapred.JobClient: map 100% reduce 19%
08/06/14 15:21:01 INFO mapred.JobClient: map 100% reduce 20%
08/06/14 15:21:05 INFO mapred.JobClient: map 100% reduce 16%
08/06/14 15:21:05 INFO mapred.JobClient: Task Id : task_200806141506_0003_r_000001_0, Status : FAILED
Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
This is with 2 nodes (master and slave), using the default values in
/hadoop/conf/hadoop-default.xml, and then increasing the number of
reduce tasks to 3 and 5 to see if that changed anything (it didn't).
I'm wondering whether anybody has run into this type of problem before
and knows how to fix it. Thanks for any help.
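For reference, changing the reduce count amounts to setting
mapred.reduce.tasks (default is 1). Done from a job driver it looks
roughly like the sketch below -- the class name is just illustrative,
not my exact code, but setNumReduceTasks is the standard JobConf call:

    import org.apache.hadoop.mapred.JobConf;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCountDriver.class);
            conf.setJobName("wordcount");
            conf.setNumReduceTasks(3);  // tried 3 and 5; default is 1
            // ... set mapper/reducer classes and input/output paths
            // as usual, then submit with JobClient.runJob(conf).
        }
    }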
-Chanel