Re: intermediate files of killed tasks not purged

Sandhya E Tue, 28 Apr 2009 01:49:06 -0700

Hi Amareshwari

We are on version 0.18. I verified from the JobTracker web UI that not
all killed tasks have leftovers in mapred/local. Also, some tasks that
completed successfully have left their tmp folders in mapred/local.


Can you please give some pointers on how to debug this further?

Regards
Sandhya

On Tue, Apr 28, 2009 at 2:02 PM, Amareshwari Sriramadasu
<amar...@yahoo-inc.com> wrote:
> Hi Sandhya,
>
>  Which version of Hadoop are you using? There could be <attempt_id>
> directories in mapred/local before 0.17. Now, there should not be any such
> directories.
> From version 0.17 onwards, the attempt directories will be present only at
> mapred/local/taskTracker/jobCache/<jobid>/<attemptid>. If you are seeing the
> directories in any other location, then it seems like a bug.
>
> HADOOP-4654 is about cleaning up temporary data in DFS for failed tasks; it
> does not change local FileSystem files.
>
> Thanks
> Amareshwari
> Edward J. Yoon wrote:
>>
>> Hi,
>>
>> It seems related to https://issues.apache.org/jira/browse/HADOOP-4654.
>>
>> On Tue, Apr 28, 2009 at 4:01 PM, Sandhya E <sandhyabhas...@gmail.com>
>> wrote:
>>
>>>
>>> Hi
>>>
>>> Under <hadoop-tmp-dir>/mapred/local there are directories like
>>> "attempt_200904262046_0026_m_000002_0"
>>> Each of these directories contains files of format: intermediate.1
>>> intermediate.2  intermediate.3  intermediate.4  intermediate.5
>>> There are many directories in this format. All these correspond to
>>> killed task attempts. As they contain huge intermediate files, we
>>> ran into disk space issues.
>>>
>>> They are cleaned up when the mapred cluster is restarted. But otherwise,
>>> how can these be cleaned up without having to restart the cluster?
>>>
>>> Conf parameter "keep.failed.task.files" is set to "false" in our case.
>>>
>>> Many Thanks
>>> Sandhya
>>>
>>>
>>
>>
>>
>>
>
>
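
For anyone debugging a similar situation, here is a minimal sketch (not
part of the original messages) that lists attempt_* directories sitting
anywhere under mapred/local other than taskTracker/jobCache, together with
their sizes. It assumes the directory layout described above for 0.17 and
later. The function names find_stray_attempt_dirs and dir_size are
hypothetical helpers made up for this illustration, and the script only
reports; it does not delete anything.

```python
import os


def dir_size(path):
    """Total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def find_stray_attempt_dirs(local_dir):
    """Return sorted (path, size) pairs for attempt_* directories found
    outside <local_dir>/taskTracker/jobCache (hypothetical helper)."""
    # Assumption: this is the only place attempt dirs should live (0.17+).
    expected_root = os.path.abspath(
        os.path.join(local_dir, "taskTracker", "jobCache"))
    strays = []
    for root, dirs, _files in os.walk(local_dir):
        if os.path.abspath(root) == expected_root:
            dirs[:] = []  # do not descend into the expected location
            continue
        for name in list(dirs):
            if name.startswith("attempt_"):
                path = os.path.join(root, name)
                strays.append((path, dir_size(path)))
                dirs.remove(name)  # no need to walk inside a reported dir
    return sorted(strays)
```

Calling find_stray_attempt_dirs on the <hadoop-tmp-dir>/mapred/local path
of a TaskTracker node and comparing the reported attempt IDs with the
JobTracker's list of killed and successful attempts may help narrow down
which tasks leave files behind.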
