What maximum number of mappers and reducers did you configure?
It seems your TaskRunner is failing to get a response.
You may want to try increasing "mapred.job.tracker.handler.count".
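For illustration, here is roughly how that property could be set in the site configuration. This is only a sketch: the value 20 is an arbitrary example and not a tuned recommendation (the shipped default is, I believe, 10), and I am assuming the override goes in conf/hadoop-site.xml on the JobTracker node, with a JobTracker restart afterwards.

```xml
<!-- Assumed location: conf/hadoop-site.xml on the JobTracker node -->
<property>
  <name>mapred.job.tracker.handler.count</name>
  <!-- Example value only; pick one that suits the number of TaskTrackers -->
  <value>20</value>
  <description>Number of server threads the JobTracker uses to handle
  RPC calls from TaskTrackers.</description>
</property>
```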
2008/10/22, Zhou, Yunqing <[EMAIL PROTECTED]>:
Recently, tasks on our cluster have been failing at random (both map tasks
and reduce tasks). When we rerun them, they all succeed.
The whole job is IO-bound (250 GB input, 500 GB map output, and
10 GB final output).
In the JobTracker, I can see the failed job reports:
task_200810220830_0004_m_000653_0
tip_2008