[ https://issues.apache.org/jira/browse/MAPREDUCE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217228#comment-13217228 ]
Amar Kamat commented on MAPREDUCE-3926:
---
Mitesh,
I guess adding this to 0.20.205 might involve a lot of changes. Also, the JT has
no information about the state of the running tasks, i.e., they could in fact be
RUNNING, KILLED, FAILED, PENDING, etc.
Note that this can happen for SUCCESSFUL jobs too. The job can still
complete/finish while the speculative tasks are running. In such cases, there
is no information about the speculative tasks logged in the job history.
This can surely be fixed in trunk.
No information of unfinished map task in Job History, if all attempts of
another map task fail.
---
Key: MAPREDUCE-3926
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3926
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.205.0
Reporter: Mitesh Singh Jat
Priority: Minor
No information of unfinished map task in Job History, if all attempts of
another map task fail.
For example,
1. The first map task's first attempt m_00_0 was making progress
2. The second map task failed 4 times, before completion of first map task
attempt.
3. Hence, a job cleanup task was launched and completed, before completion of
first map task attempt.
4. After the job cleanup task, runningMapCache is cleared:
{noformat}
completedTask() - jobComplete() - garbageCollect() - this.runningMapCache = null;
               |- retireMap() - if (runningMapCache == null) Running cache for maps missing!! Job details are missing.
{noformat}
5. Hence, the error "Running cache for maps missing!! Job details are missing."
is logged (from retireMap(), which is called after jobComplete()) and no further
information is added to the Job History. Therefore, the first map task's
information is missing from the Job History page.
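
The ordering described in steps 4 and 5 can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual JobInProgress code: the class RetireMapSketch, its nested Job class, the launch() helper, the history list, and the tracker name "tracker-a" are all hypothetical stand-ins. Only the names runningMapCache, garbageCollect(), retireMap() and the error message come from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of the ordering problem; not Hadoop source. */
public class RetireMapSketch {

    /** Minimal stand-in for the per-job bookkeeping in the JobTracker. */
    static class Job {
        // Running attempts keyed by attempt id; set to null by garbageCollect().
        Map<String, String> runningMapCache = new HashMap<>();
        // Hypothetical stand-in for the entries written to the job history.
        final List<String> history = new ArrayList<>();

        // Hypothetical helper: record that an attempt started on a tracker.
        void launch(String attemptId, String tracker) {
            runningMapCache.put(attemptId, tracker);
        }

        // Mirrors step 4: completing the job drops the running cache.
        void garbageCollect() {
            runningMapCache = null;
        }

        // Mirrors step 5: once the cache is gone, the late attempt is not recorded.
        boolean retireMap(String attemptId) {
            if (runningMapCache == null) {
                System.err.println(
                    "Running cache for maps missing!! Job details are missing.");
                return false; // nothing is added to the history
            }
            String tracker = runningMapCache.remove(attemptId);
            history.add(attemptId + " on " + tracker);
            return true;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        job.launch("m_00_0", "tracker-a"); // long-running first map attempt
        job.garbageCollect();              // job cleanup completes first
        boolean logged = job.retireMap("m_00_0");
        System.out.println("logged=" + logged + ", history=" + job.history);
        // prints: logged=false, history=[]
    }
}
```

If retireMap() runs before garbageCollect() in this model, the attempt is recorded, which is why the problem only shows up when the job completes while another attempt is still running.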
I have created a sample streaming MR job, to reproduce this issue.
{code:title=mapper.sh}
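#!/bin/bash
# Read the first line of the input split to decide how this map task behaves.
read line
if [[ $line == sleep ]]
then
    # Long-running map task: runs for about 15 seconds, then succeeds.
    for i in 1 2 3
    do
        echo Sleeping 2
        sleep 5
    done
    exit 0
else
    # Failing map task: exits immediately with a non-zero status.
    echo Exiting 2
    exit -1
fi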
{code}
Input file: in1.txt is for long running map task (here first map task)
{code:title=/user/mitesh/input/in1.txt}
sleep
{code}
Input file: in2.txt is for failing map task (here second map task)
{code:title=/user/mitesh/input/in2.txt}
exit
{code}
Running the sample streaming MR job.
{noformat}
$ hadoop fs -rmr -skipTrash xyz
$ hadoop jar $HADOOP_INSTALL/hadoop-streaming.jar \
    -Dmapred.map.max.attempts=2 -Dmapred.min.split.size=7 -Dmapred.map.tasks=2 \
    -mapper mapper.sh -file mapper.sh -reducer NONE \
    -input /user/mitesh/input/in1.txt -input /user/mitesh/input/in2.txt \
    -output xyz
{noformat}
Job History web UI
{noformat}
Hadoop Job job_201201310454_542302 on History Viewer
User: mitesh
JobName: streamjob7439640883203077520.jar
JobConf: hdfs://nn:port/user/mitesh/.staging/job_201201310454_542302/job.xml
Job-ACLs:
mapreduce.job.acl-view-job: No users are allowed
mapreduce.job.acl-modify-job: No users are allowed
Submitted At: 27-Feb-2012 12:56:02
Launched At: 27-Feb-2012 12:56:11 (8sec)
Finished At: 27-Feb-2012 12:56:31 (20sec)
Status: FAILED
Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1.
LastFailedTask: task_201201310454_542302_m_01
Analyse This Job
Kind     Total Tasks(successful+failed+killed)  Successful tasks  Failed tasks  Killed tasks  Start Time            Finish Time
Setup    1                                      1                 0             0             27-Feb-2012 12:56:12  27-Feb-2012 12:56:16 (4sec)
Map      2                                      0                 2             0             27-Feb-2012 12:56:16  27-Feb-2012 12:56:26 (10sec)
Reduce   0                                      0                 0             0
Cleanup  1                                      1                 0             0             27-Feb-2012 12:56:26  27-Feb-2012 12:56:31 (4sec)
{noformat}
The page above shows only 2 failed tasks (belonging to the second map task).
The task tracker of the first map task can be found only from the JT logs.