Thanks! I will try it and let you know.
Warm regards
Arko

On Oct 27, 2011, at 8:19 AM, Brock Noland <br...@cloudera.com> wrote:
> Hi,
>
> On Thu, Oct 27, 2011 at 3:22 AM, Arko Provo Mukherjee
> <arkoprovomukher...@gmail.com> wrote:
>> Hi,
>>
>> I have a situation where I have to read a large file into every mapper.
>>
>> Since it is a large HDFS file that is needed to process each input to the
>> mapper, it takes a lot of time to read the data into memory from HDFS.
>>
>> Thus the system is killing all my Mappers with the following message:
>>
>> 11/10/26 22:54:52 INFO mapred.JobClient: Task Id :
>> attempt_201106271322_12504_m_000000_0, Status : FAILED
>> Task attempt_201106271322_12504_m_000000_0 failed to report status for 601
>> seconds. Killing!
>>
>> The cluster is not entirely owned by me, so I cannot change
>> mapred.task.timeout to allow enough time to read the entire file.
>> Any suggestions?
>> Also, is there a way for a Mapper instance to read the file only once for
>> all the inputs that it receives?
>> Currently, since the file-reading code is in the map method, I guess it is
>> reading the entire file for each and every input, leading to a lot of
>> overhead.
>
> The file should be read in the configure() (old API) or setup()
> (new API) method.
>
> Brock
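
A minimal sketch of that approach with the new API, assuming the HDFS path of the side file is passed in through a made-up job property "side.file.path" (the class and field names are illustrative too): the file is loaded once in setup(), context.progress() is called inside the read loop so the task keeps reporting status during a long load, and map() then works against the in-memory copy instead of re-reading HDFS for every record.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SideFileMapper extends Mapper<LongWritable, Text, Text, Text> {

  // In-memory copy of the side file, loaded once per mapper task.
  private final List<String> sideData = new ArrayList<String>();

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    // "side.file.path" is a made-up job property naming the HDFS file.
    Path sideFile = new Path(conf.get("side.file.path"));
    FileSystem fs = sideFile.getFileSystem(conf);
    BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(sideFile)));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        sideData.add(line);
        // Report progress so the framework does not kill the task
        // while the long read is still in progress.
        context.progress();
      }
    } finally {
      reader.close();
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // map() only consults the in-memory copy; the side file is never
    // re-read per record.
    context.write(value, new Text(Integer.toString(sideData.size())));
  }
}

The driver would then set the property before submitting the job, e.g. job.getConfiguration().set("side.file.path", "<hdfs path of the side file>").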