In short, there are no userlogs.  stderr and stdout are both empty.  I copied 
the output from syslog to the following pastebin: http://pastebin.com/0XXE9Jze. 
 The first 22 lines look to be exactly the same as the syslogs for other, 
non-dying, tasks.   The main departure is on line 23 where the loader can't 
seem to load native-hadoop libraries, and this happens about 10 minutes after 
starting up.

--Keith

On Aug 1, 2011, at 1:00 PM, Harsh J wrote:

> Are there no userlogs from the failed tasks? TaskTracker logs won't
> carry user-code (task) logs. Could you paste those syslog lines (from
> the task) to pastebin/etc. since the lists may not be accepting
> attachments?
> 
> On Tue, Aug 2, 2011 at 12:51 AM, Stevens, Keith D. <steven...@llnl.gov> wrote:
>> Hi all,
>> 
>> I'm running a simple mapreduce job that connects to an hbase table, reads 
>> each row, counts some co-occurrence frequencies, and writes everything out 
>> to hdfs at the end.  Everything seems to be going smoothly until the last 5, 
>> out of 108, tasks run.  The last 5 tasks seem to be stuck initializing.  As 
>> far as I can tell, setup is never called, and eventually, after 600 seconds, 
>> the task is killed.  The task jumps around different nodes to try and run 
>> but regardless of the node, it fails to initialize and is killed.
>> 
>> My first guess is that it's trying to connect to an hbase region server and 
>> failing, but I don't see anything like this in the task tracker nodes.  Here 
>> are the log lines related to one of the failed tasks from the task trackers 
>> logs:
>> 
>> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: 
>> LaunchTaskAction (registerTask): attempt_201107281508_0028_m_000027_0 task's 
>> state:UNASSIGNED
>> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
>> launch : attempt_201107281508_0028_m_000027_0 which needs 1 slots
>> 2011-08-01 12:01:08,889 INFO org.apache.hadoop.mapred.TaskTracker: In 
>> TaskLauncher, current free slots : 1 and trying to launch 
>> attempt_201107281508_0028_m_000027_0 which needs 1 slots
>> 2011-08-01 12:01:12,243 INFO org.apache.hadoop.mapred.TaskTracker: JVM with 
>> ID: jvm_201107281508_0028_m_-1189914759 given task: 
>> attempt_201107281508_0028_m_000027_0
>> 2011-08-01 12:11:09,462 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201107281508_0028_m_000027_0: Task 
>> attempt_201107281508_0028_m_000027_0 failed to report status for 600 
>> seconds. Killing!
>> 2011-08-01 12:11:09,467 INFO org.apache.hadoop.mapred.TaskTracker: About to 
>> purge task: attempt_201107281508_0028_m_000027_0
>> 2011-08-01 12:11:14,488 INFO org.apache.hadoop.mapred.TaskRunner: 
>> attempt_201107281508_0028_m_000027_0 done; removing files.
>> 2011-08-01 12:11:14,489 INFO org.apache.hadoop.mapred.IndexCache: Map ID 
>> attempt_201107281508_0028_m_000027_0 not found in cache
>> 2011-08-01 12:11:14,495 INFO org.apache.hadoop.mapred.TaskTracker: 
>> LaunchTaskAction (registerTask): attempt_201107281508_0028_m_000027_0 task's 
>> state:FAILED_UNCLEAN
>> 2011-08-01 12:11:14,496 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
>> launch : attempt_201107281508_0028_m_000027_0 which needs 1 slots
>> 2011-08-01 12:11:14,496 INFO org.apache.hadoop.mapred.TaskTracker: In 
>> TaskLauncher, current free slots : 1 and trying to launch 
>> attempt_201107281508_0028_m_000027_0 which needs 1 slots
>> 2011-08-01 12:11:15,045 INFO org.apache.hadoop.mapred.TaskTracker: JVM with 
>> ID: jvm_201107281508_0028_m_-1869983962 given task: 
>> attempt_201107281508_0028_m_000027_0
>> 2011-08-01 12:11:15,346 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201107281508_0028_m_000027_0 0.0%
>> 2011-08-01 12:11:15,348 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201107281508_0028_m_000027_0 0.0% cleanup
>> 2011-08-01 12:11:15,349 INFO org.apache.hadoop.mapred.TaskTracker: Task 
>> attempt_201107281508_0028_m_000027_0 is done.
>> 2011-08-01 12:11:15,349 INFO org.apache.hadoop.mapred.TaskTracker: reported 
>> output size for attempt_201107281508_0028_m_000027_0  was -1
>> 2011-08-01 12:11:15,354 INFO org.apache.hadoop.mapred.TaskRunner: 
>> attempt_201107281508_0028_m_000027_0 done; removing files.
>> 2011-08-01 12:11:17,495 INFO org.apache.hadoop.mapred.TaskRunner: 
>> attempt_201107281508_0028_m_000027_0 done; removing files.
>> 
>> And here are the syslog lines:
>> In my job, I set the stats when i enter and exit setup, and I set counters 
>> in map.  None of these are triggered for this task.  Nothing is written to 
>> stderr or stdout, and the syslogs for the task have nothing beyond the 
>> zookeeper client connection lines.
>> 
>> Any thoughts as to what might be causing this issue?  Is there another log 
>> that indicates which region server this task is trying to connect to?
>> 
>> Thanks!
>> --Keith Stevens
> 
> 
> 
> -- 
> Harsh J

Reply via email to