Hi,

On Thu, Oct 27, 2011 at 3:22 AM, Arko Provo Mukherjee
<arkoprovomukher...@gmail.com> wrote:
> Hi,
>
> I have a situation where I have to read a large file into every mapper.
>
> Since it's a large HDFS file that is needed to work on each input to the
> mapper, it is taking a lot of time to read the data into memory from
> HDFS.
>
> As a result, the system is killing all my Mappers with the following message:
>
> 11/10/26 22:54:52 INFO mapred.JobClient: Task Id :
> attempt_201106271322_12504_m_000000_0, Status : FAILED
> Task attempt_201106271322_12504_m_000000_0 failed to report status for 601
> seconds. Killing!
>
> The cluster is not entirely owned by me, so I cannot change
> mapred.task.timeout to give the job enough time to read the entire file.
> Any suggestions?
>
> Also, is there a way for a Mapper instance to read the file once for all
> the inputs that it receives? Currently, since the file-reading code is in
> the map method, I guess it is reading the entire file for each and every
> input, leading to a lot of overhead.
The file should be read in the configure() method (old API) or the setup() method (new API). Those run once per mapper instance, instead of once per input record like map(), so the file is loaded a single time and then reused for every record the mapper processes.

Brock
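For what it's worth, a minimal sketch of the new-API approach might look like the following. The property name "side.data.path" and the way map() uses the cached lines are just placeholders, since the original post doesn't say where the file lives or what is done with it; adapt both to your job.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LookupMapper extends Mapper<LongWritable, Text, Text, Text> {

        // Contents of the large HDFS file, loaded once per mapper instance in setup().
        private final List<String> sideData = new ArrayList<String>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            Configuration conf = context.getConfiguration();
            // "side.data.path" is a made-up property name; set it on the job at submit time.
            Path sideFile = new Path(conf.get("side.data.path"));
            FileSystem fs = sideFile.getFileSystem(conf);

            BufferedReader reader =
                new BufferedReader(new InputStreamReader(fs.open(sideFile)));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    sideData.add(line);
                }
            } finally {
                reader.close();
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // map() now only consults the in-memory copy instead of re-reading
            // the file from HDFS for every input record. The matching logic here
            // is illustrative only.
            for (String cached : sideData) {
                if (value.toString().contains(cached)) {
                    context.write(new Text(cached), value);
                }
            }
        }
    }

With the old API the same idea applies: override configure(JobConf) on your MapReduceBase mapper and do the read there.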