[ 
https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577509#action_12577509
 ] 

Devaraj Das commented on HADOOP-2119:
-------------------------------------

bq. The only problem is the scheduling of reducers from the JT. The maps 
finish so fast that the map load is always low and the reducers always start 
after the maps are done. Simple tricks such as increasing the number of task 
completion events, Jetty threads, etc. might help but won't provide a scalable 
solution. So it seems that tweaking the load logic in the JT, i.e. 
getNewTaskForTaskTracker(), is the only way. 

The load logic seems to be there by design and is present in the existing 
codebase as well. Since the maps are really small and complete really fast (even 
before the tasktracker heartbeat interval elapses), each tasktracker always 
reports countMapTasks() = 0 and is therefore always handed another map task. 
Increasing the number of task completion events or Jetty threads will not help 
here, since the reducers are not even launched. If we decide to tweak the load 
logic, it should be done in a separate Jira, IMO. 
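
To make the effect concrete, here is a rough, self-contained sketch of the behaviour described above. It is not the actual getNewTaskForTaskTracker() code: the class LoadLogicSketch, the methods pickTask() and simulate(), and the slot limits are all hypothetical, and the sketch assumes that maps are considered before reduces and that at most one task is handed out per heartbeat.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and limits are not from the Hadoop codebase.
public class LoadLogicSketch {
    enum TaskType { MAP, REDUCE, NONE }

    // Assumed shape of the per-heartbeat decision: maps are checked first,
    // and at most one task is returned.
    static TaskType pickTask(int runningMaps, int runningReduces,
                             int maxMapSlots, int maxReduceSlots,
                             int pendingMaps, int pendingReduces) {
        if (pendingMaps > 0 && runningMaps < maxMapSlots) {
            return TaskType.MAP;
        }
        if (pendingReduces > 0 && runningReduces < maxReduceSlots) {
            return TaskType.REDUCE;
        }
        return TaskType.NONE;
    }

    // Simulates heartbeats from one tracker whose maps always finish before
    // the next heartbeat, so it reports zero running maps every time.
    static List<TaskType> simulate(int pendingMaps, int pendingReduces) {
        List<TaskType> assigned = new ArrayList<>();
        int runningReduces = 0;
        while (pendingMaps > 0 || pendingReduces > 0) {
            int reportedRunningMaps = 0; // fast maps: already done at heartbeat time
            TaskType t = pickTask(reportedRunningMaps, runningReduces,
                                  2, 2, pendingMaps, pendingReduces);
            if (t == TaskType.MAP) {
                pendingMaps--;
            } else if (t == TaskType.REDUCE) {
                pendingReduces--;
                runningReduces++; // reduces are assumed to keep running
            } else {
                break; // no free slot for the remaining work
            }
            assigned.add(t);
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Assignment order for 5 pending maps and 1 pending reduce.
        System.out.println(simulate(5, 1));
    }
}
```

Under these assumptions the reduce is handed out only after every map has been assigned, which matches the observation that the reducers always start after the maps are done.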

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The job tracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept getting bigger and bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker was running on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01g (reaching the heap 
> space limit).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
