Are there no userlogs from the failed tasks? TaskTracker logs won't
carry user-code (task) logs. Could you paste those syslog lines (from
the task) to pastebin/etc. since the lists may not be accepting
attachments?
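
In case it helps with collecting them, here is a rough sketch of pulling a task attempt's logs off a TaskTracker node. It assumes the per-attempt logs (stdout, stderr, syslog) sit in a directory named after the attempt ID somewhere under the userlogs directory in the Hadoop log directory. The exact layout varies between Hadoop versions, so the search goes a few levels deep rather than assuming one path. The class and method names (TaskLogFinder, findTaskLogs) are made up for illustration and are not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: finds the log files of one task attempt.
public class TaskLogFinder {

    // Returns every regular file whose parent directory is named exactly attemptId,
    // searching at most three levels below userlogsRoot (an assumed layout).
    static List<Path> findTaskLogs(Path userlogsRoot, String attemptId) throws IOException {
        try (Stream<Path> paths = Files.walk(userlogsRoot, 3)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getParent() != null
                        && p.getParent().getFileName() != null
                        && p.getParent().getFileName().toString().equals(attemptId))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TaskLogFinder <userlogs-dir> <attempt-id>");
            System.exit(1);
        }
        // Print each log file with a header so the output can be pasted as-is.
        for (Path log : findTaskLogs(Paths.get(args[0]), args[1])) {
            System.out.println("==> " + log + " <==");
            Files.readAllLines(log).forEach(System.out::println);
        }
    }
}
```

Run it on the node that hosted the attempt, passing that node's userlogs directory and an attempt ID such as attempt_201107281508_0028_m_000027_0. The syslog file is the one of interest here.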

On Tue, Aug 2, 2011 at 12:51 AM, Stevens, Keith D. <steven...@llnl.gov> wrote:
> Hi all,
>
> I'm running a simple mapreduce job that connects to an hbase table, reads 
> each row, counts some co-occurrence frequencies, and writes everything out to 
> hdfs at the end.  Everything seems to be going smoothly until the last 5, out 
> of 108, tasks run.  The last 5 tasks seem to be stuck initializing.  As far 
> as I can tell, setup is never called, and eventually, after 600 seconds, the 
> task is killed.  The task jumps around different nodes to try and run but 
> regardless of the node, it fails to initialize and is killed.
>
> My first guess is that it's trying to connect to an hbase region server and 
> failing, but I don't see anything like this in the task tracker nodes.  Here 
> are the log lines related to one of the failed tasks from the task tracker's 
> logs:
>
> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: 
> LaunchTaskAction (registerTask): attempt_201107281508_0028_m_000027_0 task's 
> state:UNASSIGNED
> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
> launch : attempt_201107281508_0028_m_000027_0 which needs 1 slots
> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: In 
> TaskLauncher, current free slots : 1 and trying to launch 
> attempt_201107281508_0028_m_000027_0 which needs 1 slots
> 2011-08-01 12:01:12,243 INFO org.apache.hadoop.mapred.TaskTracker: JVM with 
> ID: jvm_201107281508_0028_m_-1189914759 given task: 
> attempt_201107281508_0028_m_000027_0
> 2011-08-01 12:11:09,462 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201107281508_0028_m_000027_0: Task 
> attempt_201107281508_0028_m_000027_0 failed to report status for 600 seconds. 
> Killing!
> 2011-08-01 12:11:09,467 INFO org.apache.hadoop.mapred.TaskTracker: About to 
> purge task: attempt_201107281508_0028_m_000027_0
> 2011-08-01 12:11:14,488 INFO org.apache.hadoop.mapred.TaskRunner: 
> attempt_201107281508_0028_m_000027_0 done; removing files.
> 2011-08-01 12:11:14,489 INFO org.apache.hadoop.mapred.IndexCache: Map ID 
> attempt_201107281508_0028_m_000027_0 not found in cache
> 2011-08-01 12:11:14,495 INFO org.apache.hadoop.mapred.TaskTracker: 
> LaunchTaskAction (registerTask): attempt_201107281508_0028_m_000027_0 task's 
> state:FAILED_UNCLEAN
> 2011-08-01 12:11:14,496 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
> launch : attempt_201107281508_0028_m_000027_0 which needs 1 slots
> 2011-08-01 12:11:14,496 INFO org.apache.hadoop.mapred.TaskTracker: In 
> TaskLauncher, current free slots : 1 and trying to launch 
> attempt_201107281508_0028_m_000027_0 which needs 1 slots
> 2011-08-01 12:11:15,045 INFO org.apache.hadoop.mapred.TaskTracker: JVM with 
> ID: jvm_201107281508_0028_m_-1869983962 given task: 
> attempt_201107281508_0028_m_000027_0
> 2011-08-01 12:11:15,346 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201107281508_0028_m_000027_0 0.0%
> 2011-08-01 12:11:15,348 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201107281508_0028_m_000027_0 0.0% cleanup
> 2011-08-01 12:11:15,349 INFO org.apache.hadoop.mapred.TaskTracker: Task 
> attempt_201107281508_0028_m_000027_0 is done.
> 2011-08-01 12:11:15,349 INFO org.apache.hadoop.mapred.TaskTracker: reported 
> output size for attempt_201107281508_0028_m_000027_0  was -1
> 2011-08-01 12:11:15,354 INFO org.apache.hadoop.mapred.TaskRunner: 
> attempt_201107281508_0028_m_000027_0 done; removing files.
> 2011-08-01 12:11:17,495 INFO org.apache.hadoop.mapred.TaskRunner: 
> attempt_201107281508_0028_m_000027_0 done; removing files.
>
> And here are the syslog lines:
> In my job, I set the stats when I enter and exit setup, and I set counters in 
> map.  None of these are triggered for this task.  Nothing is written to 
> stderr or stdout, and the syslogs for the task have nothing beyond the 
> zookeeper client connection lines.
>
> Any thoughts as to what might be causing this issue?  Is there another log 
> that indicates which region server this task is trying to connect to?
>
> Thanks!
> --Keith Stevens



-- 
Harsh J
