Attempt directories are in <hadoop-tmp>/mapred/local. I grepped the TaskTracker logs for one of the attempts that has left files behind in mapred/local:

09/04/27 21:07:19 INFO mapred.TaskTracker: LaunchTaskAction: attempt_200902120108_44218_r_000000_0
09/04/27 21:07:29 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:32 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:38 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:41 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:47 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:53 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:07:56 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:08:02 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:08:08 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:08:11 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:08:17 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.0% reduce > copy >
09/04/27 21:08:23 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:26 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:29 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:32 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:39 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:45 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:48 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:08:54 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:09:00 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.14285716% reduce > copy (6 of 14 at 2.03 MB/s) >
09/04/27 21:09:06 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.33333334% reduce > sort
09/04/27 21:09:09 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.33333334% reduce > sort
09/04/27 21:09:12 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.7029736% reduce > reduce
09/04/27 21:09:15 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.771893% reduce > reduce
09/04/27 21:09:18 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.8495109% reduce > reduce
09/04/27 21:09:21 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.9042134% reduce > reduce
09/04/27 21:09:24 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.98041093% reduce > reduce
09/04/27 21:09:26 INFO mapred.TaskTracker: attempt_200902120108_44218_r_000000_0 0.99415195% reduce > reduce
09/04/27 21:09:26 INFO mapred.TaskTracker: Task attempt_200902120108_44218_r_000000_0 is done.
09/04/27 21:09:31 INFO mapred.TaskRunner: attempt_200902120108_44218_r_000000_0 done; removing files.

Regards
Sandhya

On Tue, Apr 28, 2009 at 2:39 PM, Amareshwari Sriramadasu <amar...@yahoo-inc.com> wrote:
> Again, where are you seeing the attemptid directories? Are they at
> mapred/local/<attemptid> or at
> mapred/local/taskTracker/jobCache/<jobid>/<attemptid>?
> If you are seeing files at mapred/local/<attemptid>, then it is a bug. Please
> raise a jira and attach tasktracker logs if possible.
> If not, mapred/local/taskTracker/jobCache/<jobid>/<attemptid> directories are
> cleaned up on a KillTaskAction and mapred/local/taskTracker/jobCache/<jobid>
> directories are cleaned up on KillJobAction. Can you verify from the TaskTracker
> logs whether the attemptid got a KillTaskAction or the jobid got a KillJobAction?
> If not, this is fixed by HADOOP-5247.
>
> Thanks
> Amareshwari
>
> Sandhya E wrote:
>>
>> Hi Amareshwari
>>
>> We are on version 0.18. I verified from the jobtracker website that not
>> all killed tasks have leftovers in mapred/local. Also, some tasks that
>> were successful have left their tmp folders in mapred/local.
>>
>> Can you please give some pointers on how to debug it further?
>>
>> Regards
>> Sandhya
>>
>> On Tue, Apr 28, 2009 at 2:02 PM, Amareshwari Sriramadasu
>> <amar...@yahoo-inc.com> wrote:
>>>
>>> Hi Sandhya,
>>>
>>> Which version of HADOOP are you using? There could be <attempt_id>
>>> directories in mapred/local, pre 0.17. Now, there should not be any such
>>> directories.
>>> From version 0.17 onwards, the attempt directories will be present only at
>>> mapred/local/taskTracker/jobCache/<jobid>/<attemptid>. If you are seeing the
>>> directories in any other location, then it seems like a bug.
>>>
>>> HADOOP-4654 cleans up temporary data in DFS for failed tasks; it does
>>> not change local FileSystem files.
>>>
>>> Thanks
>>> Amareshwari
>>>
>>> Edward J. Yoon wrote:
>>>>
>>>> Hi,
>>>>
>>>> It seems related to https://issues.apache.org/jira/browse/HADOOP-4654.
>>>>
>>>> On Tue, Apr 28, 2009 at 4:01 PM, Sandhya E <sandhyabhas...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Under <hadoop-tmp-dir>/mapred/local there are directories like
>>>>> "attempt_200904262046_0026_m_000002_0".
>>>>> Each of these directories contains files of the format: intermediate.1
>>>>> intermediate.2 intermediate.3 intermediate.4 intermediate.5
>>>>> There are many directories in this format. All of them correspond to
>>>>> killed task attempts. As they contain huge intermediate files, we
>>>>> ran into disk space issues.
>>>>>
>>>>> They are cleaned up when the mapred cluster is restarted. But otherwise,
>>>>> how can these be cleaned up without having to restart the cluster?
>>>>>
>>>>> Conf parameter "keep.failed.task.files" is set to "false" in our case.
>>>>>
>>>>> Many Thanks
>>>>> Sandhya
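
For anyone repeating the check suggested in this thread (find the attempt directories left directly under mapred/local, then see whether the TaskTracker log ever shows a KillTaskAction for the attempt or a KillJobAction for its job), here is a minimal Python sketch. It rests on assumptions: the function names are made up for illustration, and the strings "KillTaskAction" and "KillJobAction" are assumed to appear verbatim on the relevant log lines, since the thread only shows what the LaunchTaskAction and "removing files" lines look like. Treat it as a starting point, not as how Hadoop itself does the cleanup.

```python
import re
from pathlib import Path

# Attempt ids as they appear in this thread, e.g.
# attempt_200902120108_44218_r_000000_0
ATTEMPT_RE = re.compile(r"^attempt_(\d+_\d+)_[mr]_\d+_\d+$")

# Substrings to look for. LaunchTaskAction and "removing files" appear in the
# log quoted in this thread; the two Kill* markers are ASSUMED to be logged
# verbatim on lines that also carry the attempt id or job id.
MARKERS = ("LaunchTaskAction", "KillTaskAction", "KillJobAction", "removing files")


def job_id_for(attempt_id):
    """Derive job_<x>_<n> from attempt_<x>_<n>_<m|r>_<task>_<try>."""
    match = ATTEMPT_RE.match(attempt_id)
    if match is None:
        raise ValueError(f"not an attempt id: {attempt_id!r}")
    return "job_" + match.group(1)


def leftover_attempts(local_dir):
    """Return attempt ids that exist as directories directly under local_dir."""
    return sorted(
        entry.name
        for entry in Path(local_dir).iterdir()
        if entry.is_dir() and ATTEMPT_RE.match(entry.name)
    )


def actions_seen(log_lines, attempt_id):
    """Report which markers occur on lines mentioning the attempt or its job."""
    # (?!\d) keeps a shorter id from matching inside a longer one.
    ids = re.compile(
        "(?:%s|%s)(?!\\d)"
        % (re.escape(attempt_id), re.escape(job_id_for(attempt_id)))
    )
    seen = dict.fromkeys(MARKERS, False)
    for line in log_lines:
        if ids.search(line):
            for marker in MARKERS:
                if marker in line:
                    seen[marker] = True
    return seen
```

Run `leftover_attempts` on <hadoop-tmp>/mapred/local, then pass each id to `actions_seen` together with the lines of the TaskTracker log. For the attempt whose log is quoted in this thread, only LaunchTaskAction and "removing files" would come back True, which matches the observation that the task finished normally yet its directory is still there.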