[jira] [Commented] (MAPREDUCE-3926) No information of unfinished map task in Job History, if all attempts of another map task fail.

2012-02-28 Thread Mitesh Singh Jat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218050#comment-13218050
 ] 

Mitesh Singh Jat commented on MAPREDUCE-3926:
-

Hi Amar,

In Hadoop 0.23, the Job History file stores the task tracker information of an 
unfinished map task in the TASK_STARTED and MAP_ATTEMPT_STARTED events.
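
As a rough illustration of the point above, the sketch below filters those two event types out of a history file. It assumes one JSON event per line with a "type" field. The sample lines, the field names inside "event", and the host name tt-host-1 are made up for illustration and are not taken from a real history file.

```shell
#!/bin/bash
# Hypothetical sample standing in for a 0.23 job history file. The
# one-JSON-event-per-line layout and the field names are assumptions.
history_file=$(mktemp)
cat > "$history_file" <<'EOF'
{"type":"JOB_SUBMITTED","event":{"jobid":"job_1"}}
{"type":"TASK_STARTED","event":{"taskid":"task_1_m_0"}}
{"type":"MAP_ATTEMPT_STARTED","event":{"attemptId":"attempt_1_m_0_0","trackerName":"tt-host-1"}}
{"type":"JOB_FAILED","event":{"jobid":"job_1"}}
EOF

# Keep only the two event types that describe the started task and attempt.
started_events=$(grep -E '"type":"(TASK_STARTED|MAP_ATTEMPT_STARTED)"' "$history_file")
echo "$started_events"
rm -f "$history_file"
```

If the real file uses a different layout, only the grep pattern needs to change.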

Regards,
Mitesh

 No information of unfinished map task in Job History, if all attempts of 
 another map task fail.
 ---

                 Key: MAPREDUCE-3926
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3926
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobtracker
    Affects Versions: 0.20.205.0
            Reporter: Mitesh Singh Jat
            Priority: Minor

 Job History contains no information about an unfinished map task if all 
 attempts of another map task fail.
 For example:
 1. The first map task's first attempt, m_00_0, was making progress.
 2. The second map task failed 4 times before the first map task's attempt 
 completed.
 3. Hence, a job cleanup task was launched and completed before the first map 
 task's attempt completed.
 4. After the job cleanup task, runningMapCache is cleared:
 {noformat}
 completedTask() -> jobComplete() -> garbageCollect() -> this.runningMapCache = null;
                |-> retireMap() -> if (runningMapCache == null) "Running cache for maps missing!! Job details are missing."
 {noformat}
 5. Hence, the error "Running cache for maps missing!! Job details are missing." 
 is logged (from retireMap(), which is called after jobComplete()) and no 
 further information is added to Job History. Therefore, the first map task's 
 information is missing from the Job History page.
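
 The ordering in steps 4 and 5 can be modelled with a few lines of shell. This is only a simplified model of the sequence described above, not the JobTracker code: the function names, the attempt id, and the job_history array are hypothetical stand-ins.

```shell
#!/bin/bash
# Simplified model of steps 4 and 5; NOT the actual JobTracker code.
running_map_cache="attempt_m_0_0"   # hypothetical entry for the running attempt
job_history=()                      # stands in for the Job History file

# Models garbageCollect(): the running-map cache is dropped at job completion.
garbage_collect() { unset running_map_cache; }

# Models retireMap(): without the cache, nothing is written to history.
retire_map() {
    if [[ -z ${running_map_cache+x} ]]; then
        echo "Running cache for maps missing!! Job details are missing." >&2
        return 1
    fi
    job_history+=("finished $1")
}

garbage_collect                      # job cleanup completes first
retire_map "attempt_m_0_0" || echo "history entries: ${#job_history[@]}"
```

 Calling retire_map before garbage_collect in this model does record the attempt, which matches the description: the information is lost only because the cache is cleared first.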
 I have created a sample streaming MR job to reproduce this issue.
 {code:title=mapper.sh}
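 #!/bin/bash
 # Streaming mapper: sleeps on the input "sleep", fails on anything else.
 read -r line
 if [[ $line == sleep ]]
 then
     # Long-running map task: stays alive for about 15 seconds.
     for i in 1 2 3
     do
         echo Sleeping >&2
         sleep 5
     done
     exit 0
 else
     # Failing map task: exits non-zero so that every attempt fails.
     echo Exiting >&2
     exit 1
 fi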
 {code}
 Input file in1.txt drives the long-running map task (here, the first map task):
 {code:title=/user/mitesh/input/in1.txt}
 sleep
 {code}
 Input file in2.txt drives the failing map task (here, the second map task):
 {code:title=/user/mitesh/input/in2.txt}
 exit
 {code}
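
 Before submitting the job, the two inputs can be checked locally without Hadoop. The sketch below is a dry run under stated assumptions: it uses a shortened stand-in for mapper.sh (one 1-second sleep instead of three 5-second sleeps, exit status 1 for the failing branch) and temporary local copies of the input files.

```shell
#!/bin/bash
# Local dry run, no Hadoop involved. The mapper written here is a shortened
# stand-in for mapper.sh, not the script from the report.
workdir=$(mktemp -d)
cat > "$workdir/mapper.sh" <<'EOF'
#!/bin/bash
read -r line
if [[ $line == sleep ]]; then
    sleep 1
    exit 0
else
    exit 1
fi
EOF

# Local copies of the two input files.
echo sleep > "$workdir/in1.txt"
echo exit > "$workdir/in2.txt"

# Feed each input on stdin, as streaming would, and record the exit status.
status_in1=0
bash "$workdir/mapper.sh" < "$workdir/in1.txt" || status_in1=$?
status_in2=0
bash "$workdir/mapper.sh" < "$workdir/in2.txt" || status_in2=$?

echo "in1.txt -> $status_in1, in2.txt -> $status_in2"
rm -rf "$workdir"
```

 A zero status for in1.txt and a non-zero status for in2.txt is the combination needed to reproduce the issue: one map task keeps running while every attempt of the other fails.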
 Running the sample streaming MR job.
 {noformat}
 $ hadoop fs -rmr -skipTrash xyz
 $ hadoop jar $HADOOP_INSTALL/hadoop-streaming.jar \
     -Dmapred.map.max.attempts=2 -Dmapred.min.split.size=7 -Dmapred.map.tasks=2 \
     -mapper mapper.sh -file mapper.sh -reducer NONE \
     -input /user/mitesh/input/in1.txt -input /user/mitesh/input/in2.txt \
     -output xyz
 {noformat}
 Job History web UI
 {noformat}
 Hadoop Job job_201201310454_542302 on History Viewer
 User: mitesh
 JobName: streamjob7439640883203077520.jar
 JobConf: hdfs://nn:port/user/mitesh/.staging/job_201201310454_542302/job.xml
 Job-ACLs:
 mapreduce.job.acl-view-job: No users are allowed
 mapreduce.job.acl-modify-job: No users are allowed
 Submitted At: 27-Feb-2012 12:56:02
 Launched At: 27-Feb-2012 12:56:11 (8sec)
 Finished At: 27-Feb-2012 12:56:31 (20sec)
 Status: FAILED
 Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201201310454_542302_m_01
 Analyse This Job
 Kind     Total Tasks(successful+failed+killed)  Successful tasks  Failed tasks  Killed tasks  Start Time            Finish Time
 Setup    1                                      1                 0             0             27-Feb-2012 12:56:12  27-Feb-2012 12:56:16 (4sec)
 Map      2                                      0                 2             0             27-Feb-2012 12:56:16  27-Feb-2012 12:56:26 (10sec)
 Reduce   0                                      0                 0             0
 Cleanup  1                                      1                 0             0             27-Feb-2012 12:56:26  27-Feb-2012 12:56:31 (4sec)
 {noformat}
 The page above shows only 2 failed tasks, both belonging to the second map task.
 The task tracker of the first map task can be found only from the JT logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3926) No information of unfinished map task in Job History, if all attempts of another map task fail.

2012-02-27 Thread Amar Kamat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217228#comment-13217228
 ] 

Amar Kamat commented on MAPREDUCE-3926:
---

Mitesh,
I guess adding this to 0.20.205 might involve a lot of changes. Also, the JT has 
no information about the state of the running tasks, i.e., they could in fact be 
RUNNING, KILLED, FAILED, PENDING, etc.

Note that this can happen for SUCCESSFUL jobs too. The job can still 
complete/finish while the speculative tasks are running. In such cases, there 
is no information about the speculative tasks logged in the job history.

This can surely be fixed in trunk.
