[jira] Resolved: (HADOOP-1087) Reducer hangs pulling from incorrect file.out.index path. (when one of the mapred.local.dir is not accessible but becomes available later at reduce time)

Devaraj Das (JIRA) Mon, 07 May 2007 05:27:36 -0700

     [ 
https://issues.apache.org/jira/browse/HADOOP-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Devaraj Das resolved HADOOP-1087.
---------------------------------

    Resolution: Won't Fix

Resolving this for now. The comments made earlier explain the reason. Also, 
HADOOP-1252 should take care of this situation if it ever happens. If the 
problem appears even with the fix for HADOOP-1252, we can reopen this.

> Reducer hangs pulling from incorrect file.out.index path. (when one of the 
> mapred.local.dir is not accessible but becomes available later at reduce time)
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1087
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1087
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.10.1
>            Reporter: Koji Noguchi
>
> 2007-03-07 23:14:23,431 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: Server returned HTTP response code: 500 for URL: http://____:____/mapOutput?map=task_7810_m_000897_0&reduce=397
>   at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1149)
>   at org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:121)
>   at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:236)
>   at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:199)
> 2007-03-07 23:14:23,431 WARN org.apache.hadoop.mapred.TaskRunner: task_7810_r_000397_0 adding host ____.com to penalty box, next contact in 279 seconds
> This happened when one of the drives was full and not accessible at map time.
> One mapper, in
>     public void mergeParts() throws IOException {
>       ...
>       Path finalIndexFile = mapOutputFile.getOutputIndexFile(getTaskId());
> failed on the first hash entry in mapred.local.dir and used the second entry.
> Afterwards, the first dir entry became available, and when the reducer tried 
> to pull the map output through
>     public static class MapOutputServlet extends HttpServlet {
>       ...
>       Path indexFileName = conf.getLocalPath(mapId+"/file.out.index");
> it used the first entry.
> As a result, the directory was empty, and the reducer kept trying to pull 
> from the incorrect path and hung.
> (I wasn't sure if this is a duplicate of HADOOP-895, since it is not 
> reproducible unless I get a disk failure.)
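
The mismatch described above can be illustrated with a minimal sketch. It is 
not Hadoop's actual code: the class LocalDirMismatchSketch, the methods 
hashedIndex and pickDir, and the /disk1 and /disk2 paths are all hypothetical. 
It assumes that a local directory is chosen by hashing the relative path (as 
"first hash entry" in the description suggests) and that a writer falls back 
to the next entry when the hashed one is unusable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Hadoop's implementation.
class LocalDirMismatchSketch {

    // Index of the directory that hashing alone selects for a relative path.
    static int hashedIndex(String relPath, int numDirs) {
        return Math.floorMod(relPath.hashCode(), numDirs);
    }

    // Starting at the hashed entry, return the first directory that is
    // not currently marked unavailable (assumed fallback behaviour).
    static String pickDir(List<String> dirs, Set<String> unavailable, String relPath) {
        int start = hashedIndex(relPath, dirs.size());
        for (int i = 0; i < dirs.size(); i++) {
            String dir = dirs.get((start + i) % dirs.size());
            if (!unavailable.contains(dir)) {
                return dir;
            }
        }
        throw new IllegalStateException("no usable local directory");
    }

    public static void main(String[] args) {
        // Made-up stand-ins for two mapred.local.dir entries.
        List<String> dirs = Arrays.asList("/disk1/mapred/local", "/disk2/mapred/local");
        String relPath = "task_7810_m_000897_0/file.out.index";
        String hashed = dirs.get(hashedIndex(relPath, dirs.size()));

        // Map time: the hashed directory is full, so the writer falls back.
        Set<String> down = new HashSet<>();
        down.add(hashed);
        String writtenTo = pickDir(dirs, down, relPath);

        // Reduce time: every directory is usable again, so the lookup
        // resolves to the hashed entry, where the file was never written.
        String readFrom = pickDir(dirs, new HashSet<>(), relPath);

        System.out.println("index written under: " + writtenTo);
        System.out.println("index looked up in:  " + readFrom);
        System.out.println("same directory: " + writtenTo.equals(readFrom));
    }
}
```

Under these assumptions the writer and the reader resolve the same relative 
path to different directories, which matches the empty directory the reducer 
kept polling. Searching all configured directories at read time, rather than 
recomputing the location, would avoid depending on the disk state at write 
time.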

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
