Re: Maps running after reducers complete successfully?
On Oct 3, 2008, at 12:20 PM, Billy Pearson wrote:
> Do we not have an option to store the map results in HDFS?

It might be possible eventually, but not soon. The performance would be lower and it would substantially stress the NameNode.

-- Owen
Re: Maps running after reducers complete successfully?
Do we not have an option to store the map results in HDFS?

Billy

"Owen O'Malley" wrote in message:
> It isn't optimal, but it is the expected behavior. In general, when we lose a TaskTracker, we want the map outputs regenerated so that any reduces that need to re-run (including speculative executions) can fetch them. We could handle it as a special case if:
> 1. We didn't lose any running reduces.
> 2. All of the reduces (including speculative tasks) are done with shuffling.
> 3. We don't plan on launching any more speculative reduces.
> If all 3 hold, we don't need to re-run the map tasks. Actually doing so would be a pretty involved patch to the JobTracker/Schedulers.
> -- Owen
Re: Maps running after reducers complete successfully?
Thanks Owen. So this may be an enhancement?

- Prasad.

On Thursday 02 October 2008 09:58:03 pm Owen O'Malley wrote:
> It isn't optimal, but it is the expected behavior. In general, when we lose a TaskTracker, we want the map outputs regenerated so that any reduces that need to re-run (including speculative executions) can fetch them. We could handle it as a special case if:
> 1. We didn't lose any running reduces.
> 2. All of the reduces (including speculative tasks) are done with shuffling.
> 3. We don't plan on launching any more speculative reduces.
> If all 3 hold, we don't need to re-run the map tasks. Actually doing so would be a pretty involved patch to the JobTracker/Schedulers.
>
> -- Owen
Re: Maps running after reducers complete successfully?
It isn't optimal, but it is the expected behavior. In general, when we lose a TaskTracker, we want the map outputs regenerated so that any reduces that need to re-run (including speculative executions) can fetch them. We could handle it as a special case if:

1. We didn't lose any running reduces.
2. All of the reduces (including speculative tasks) are done with shuffling.
3. We don't plan on launching any more speculative reduces.

If all 3 hold, we don't need to re-run the map tasks. Actually doing so would be a pretty involved patch to the JobTracker/Schedulers.

-- Owen
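The three conditions above can be sketched as a single decision function. This is only an illustration of the logic Owen describes, not actual JobTracker code; the function name and parameters are hypothetical.

```python
# Hypothetical sketch: after a TaskTracker is lost, its completed map
# outputs only need regenerating if some reduce might still fetch them.
# All names here are illustrative, not real Hadoop APIs.

def must_rerun_maps(lost_running_reduces: bool,
                    reduces_done_shuffling: bool,
                    more_speculative_reduces_planned: bool) -> bool:
    """Return True if the lost tracker's map tasks must be re-executed."""
    # 1. A reduce running on the lost node must itself be re-run, and its
    #    replacement will need to shuffle the map outputs again.
    if lost_running_reduces:
        return True
    # 2. Any reduce (including speculative ones) still shuffling may yet
    #    request the lost map outputs.
    if not reduces_done_shuffling:
        return True
    # 3. A speculative reduce launched later would also have to shuffle.
    if more_speculative_reduces_planned:
        return True
    # Otherwise no remaining consumer needs the map outputs.
    return False
```

Only when all three conditions hold does the function return False, i.e. the maps can be skipped; any single failing condition forces re-execution.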
Maps running after reducers complete successfully?
Hello,

The following is the scenario that caused this weird behavior. Is this expected?

- All map tasks first completed successfully.
- Then all the reducers except one completed.
- While the last reduce task was running, one of the TaskTrackers died.
- This caused all the map tasks executed on that node to be marked as failed.
- The JobTracker re-assigned these map tasks to other nodes, and the map status is "running".
- The last reduce task finished execution, so I have reduce at 100% but maps still running.

Is it correct to have maps running after the reducers are completed?

- Prasad Pingali.
IIIT Hyderabad.