Hi Ashwanth,

This property wasn't set, so I added it. Thank you. I am now testing with just one file, and it seems to be working all right.
What would be the best place to put the clustering code? Right now I have it in map(), but I am wondering whether it should go in reduce() for it to be truly parallel. Please let me know what you think.

Thanks very much,
-Ahmed

On Wed, Jan 18, 2012 at 9:00 PM, Ashwanth Kumar <ashwanthku...@googlemail.com> wrote:

> Can you check what the value of "mapred.task.timeout" is
> in hadoop-default.xml? If there is no such value, try setting the
> property as follows:
>
> <property>
>   <name>mapred.task.timeout</name>
>   <value>800000</value>
>   <description>The number of milliseconds before a task will be
>   terminated if it neither reads an input, writes an output, nor
>   updates its status string.
>   </description>
> </property>
>
> Seems like a timeout error.
>
> - Ashwanth Kumar
>
> On Thu, Jan 19, 2012 at 3:00 AM, Ahmed Abdeen Hamed <
> ahmed.elma...@gmail.com> wrote:
>
>> Hello friends,
>>
>> I am new to Apache MapReduce. I just wrote a program that processes 89
>> files, each of which is 10000 lines. The program runs a clustering
>> algorithm on the contents of the 89 files. The job failed a couple of
>> times before it was terminated without any results written to the
>> output folder.
>>
>> Here are the logs printed on the console:
>>
>> 12/01/18 15:27:54 INFO input.FileInputFormat: Total input paths to process : 88
>> 12/01/18 15:27:59 INFO mapred.JobClient: Running job: job_201201121550_0055
>> 12/01/18 15:28:00 INFO mapred.JobClient: map 0% reduce 0%
>> 12/01/18 15:28:19 INFO mapred.JobClient: map 1% reduce 0%
>> 12/01/18 15:28:40 INFO mapred.JobClient: map 2% reduce 0%
>> 12/01/18 15:32:46 INFO mapred.JobClient: map 3% reduce 0%
>> 12/01/18 15:42:50 INFO mapred.JobClient: map 2% reduce 0%
>> 12/01/18 15:42:52 INFO mapred.JobClient: Task Id : attempt_201201121550_0055_m_000002_0, Status : FAILED
>> Task attempt_201201121550_0055_m_000002_0 failed to report status for 602 seconds. Killing!
>> 12/01/18 15:43:02 INFO mapred.JobClient: map 3% reduce 0%
>> 12/01/18 15:52:15 INFO mapred.JobClient: map 4% reduce 0%
>> 12/01/18 16:02:18 INFO mapred.JobClient: map 3% reduce 0%
>> 12/01/18 16:02:20 INFO mapred.JobClient: Task Id : attempt_201201121550_0055_m_000004_0, Status : FAILED
>> Task attempt_201201121550_0055_m_000004_0 failed to report status for 602 seconds. Killing!
>> 12/01/18 16:02:30 INFO mapred.JobClient: map 4% reduce 0%
>> 12/01/18 16:02:42 INFO mapred.JobClient: map 5% reduce 0%
>>
>> I am not familiar with the rules of this list, but I can share the
>> code if that is required and OK to do.
>>
>> Please let me know if you have any answers for me.
>>
>> Thanks very much,
>> -Ahmed
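
A note on the "failed to report status for 602 seconds" message in the log above: as the property description says, a task is terminated if it neither reads input, writes output, nor updates its status within mapred.task.timeout. Besides raising the timeout, a long-running map() can usually signal that it is still alive by calling context.progress() at intervals. The sketch below shows only that pattern in plain Java with no Hadoop dependency. ProgressReporter is a hypothetical stand-in for the Hadoop context, and processLines with its length-summing loop is a placeholder for the real clustering work, which is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class HeartbeatSketch {

    /** Hypothetical stand-in for the Hadoop task context; not a Hadoop type. */
    public interface ProgressReporter {
        void progress();
    }

    /**
     * Placeholder for a slow per-file computation. It sums line lengths
     * instead of clustering, and reports progress once every reportEvery
     * lines so the framework would not consider the task stalled.
     * reportEvery must be greater than zero.
     */
    public static long processLines(List<String> lines, int reportEvery,
                                    ProgressReporter reporter) {
        long total = 0;
        for (int i = 0; i < lines.size(); i++) {
            total += lines.get(i).length(); // stand-in for the real work
            if ((i + 1) % reportEvery == 0) {
                reporter.progress(); // heartbeat: "still working"
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Build 10000 dummy lines, matching the file size mentioned in the thread.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            lines.add("line " + i);
        }
        final int[] heartbeats = {0};
        long total = processLines(lines, 1000, () -> heartbeats[0]++);
        System.out.println("total characters: " + total);
        System.out.println("heartbeats sent: " + heartbeats[0]);
    }
}
```

In a real mapper the lambda would be replaced by a call on the context object passed to map(). How often to report is a judgment call: anything comfortably shorter than the configured timeout should do.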