On 7/7/2011 8:43 PM, Kai Ju Liu wrote:
Over the past week or two, I've run into an issue where MapReduce jobs
hang or fail near completion. The percent completion of both map and
reduce tasks is often reported as 100%, but the actual number of
completed tasks is less than the total number. It appears that either
tasks backtrack and need to be restarted or the last few reduce tasks
hang interminably on the copy step.

In certain cases, the jobs actually complete. In other cases, I can't
wait long enough and have to kill the job manually.

My Hadoop cluster is hosted in EC2 on instances of type c1.xlarge with 4
attached EBS volumes. The instances run Ubuntu 10.04.1 with the
2.6.32-309-ec2 kernel, and I'm currently using Cloudera's CDH3u0
distribution. Has anyone experienced similar behavior in their clusters,
and if so, had any luck resolving it? Thanks!

Can you post your NameNode (NN) and DataNode (DN) log files here?
Regards

Kai Ju

--
Marcos Luís Ortíz Valmaseda
 Software Engineer (UCI)
 Linux User # 418229
 http://marcosluis2186.posterous.com
 http://twitter.com/marcosluis2186
