[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anupam Seth updated MAPREDUCE-2177:
-----------------------------------

    Summary: Lacking progress update in combiner phase can cause map task 
timeouts  (was: The wait for spill completion should call 
Condition.awaitNanos(long nanosTimeout))

> Lacking progress update in combiner phase can cause map task timeouts
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2177
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2177
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.2
>            Reporter: Ted Yu
>            Assignee: Anupam Seth
>
> We sometimes saw maptask timeout in cdh3b2. Here is log from one of the 
> maptasks:
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map 
> output: buffer full= true
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = 
> 119534169; bufend = 59763857; bufvoid = 298844160
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = 
> 438913; kvend = 585320; length = 983040
> 2010-11-04 10:34:41,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill 
> 3
> 2010-11-04 10:35:45,352 INFO org.apache.hadoop.mapred.MapTask: Spilling map 
> output: buffer full= true
> 2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: bufstart = 
> 59763857; bufend = 298837899; bufvoid = 298844160
> 2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: kvstart = 
> 585320; kvend = 731585; length = 983040
> 2010-11-04 10:45:41,289 INFO org.apache.hadoop.mapred.MapTask: Finished spill 
> 4
> Note how long the last spill took.
> In MapTask.java, the following code waits for spill to finish:
> while (kvstart != kvend) { reporter.progress(); spillDone.await(); }
> In trunk code, code is similar.
> There is no timeout mechanism for Condition.await(). In case the SpillThread 
> takes long before calling spillDone.signal(), we would see timeout.
> Condition.awaitNanos(long nanosTimeout) should be called.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to