Re: Maps running after reducers complete successfully?

2008-10-03 Thread Owen O'Malley


On Oct 3, 2008, at 12:20 PM, Billy Pearson wrote:


> Do we not have an option to store the map results in HDFS?


It might be possible eventually, but not soon. Performance would be
lower, and it would substantially stress the NameNode.


-- Owen


Re: Maps running after reducers complete successfully?

2008-10-03 Thread Billy Pearson

Do we not have an option to store the map results in HDFS?

Billy

"Owen O'Malley" <[EMAIL PROTECTED]> wrote in 
message news:[EMAIL PROTECTED]
> It isn't optimal, but it is the expected behavior. In general, when we
> lose a TaskTracker, we want the map outputs regenerated so that any
> reduces that need to re-run (including speculative executions) can
> fetch them. We could handle it as a special case if:
>
>   1. We didn't lose any running reduces.
>   2. All of the reduces (including speculative tasks) are done with
>      shuffling.
>   3. We don't plan on launching any more speculative reduces.
>
> If all 3 hold, we don't need to re-run the map tasks. Actually doing
> so would be a pretty involved patch to the JobTracker/Schedulers.
>
> -- Owen






Re: Maps running after reducers complete successfully?

2008-10-03 Thread pvvpr
Thanks, Owen.
  So could this be considered an enhancement?

- Prasad.

On Thursday 02 October 2008 09:58:03 pm Owen O'Malley wrote:
> It isn't optimal, but it is the expected behavior. In general, when we
> lose a TaskTracker, we want the map outputs regenerated so that any
> reduces that need to re-run (including speculative executions) can
> fetch them. We could handle it as a special case if:
>
>   1. We didn't lose any running reduces.
>   2. All of the reduces (including speculative tasks) are done with
>      shuffling.
>   3. We don't plan on launching any more speculative reduces.
>
> If all 3 hold, we don't need to re-run the map tasks. Actually doing
> so would be a pretty involved patch to the JobTracker/Schedulers.
>
> -- Owen







Re: Maps running after reducers complete successfully?

2008-10-02 Thread Owen O'Malley
It isn't optimal, but it is the expected behavior. In general, when we
lose a TaskTracker, we want the map outputs regenerated so that any
reduces that need to re-run (including speculative executions) can
fetch them. We could handle it as a special case if:

  1. We didn't lose any running reduces.
  2. All of the reduces (including speculative tasks) are done with
     shuffling.
  3. We don't plan on launching any more speculative reduces.

If all 3 hold, we don't need to re-run the map tasks. Actually doing
so would be a pretty involved patch to the JobTracker/Schedulers.
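
As a rough illustration only, the three conditions above can be sketched
as a single check. This is Python, not JobTracker code: ReduceAttempt,
can_skip_map_rerun, and every field name are hypothetical and exist only
for this sketch. It assumes the scheduler can see, for each unfinished
reduce attempt, whether it was lost with the TaskTracker and whether it
has finished shuffling.

```python
from dataclasses import dataclass


@dataclass
class ReduceAttempt:
    # Hypothetical record of one reduce attempt (regular or speculative).
    task_id: str
    shuffle_done: bool   # True once the attempt has fetched all map outputs
    lost: bool = False   # True if it was running on the lost TaskTracker


def can_skip_map_rerun(attempts, more_speculative_planned):
    """Return True only if all three conditions from the list above hold."""
    no_reduce_lost = not any(a.lost for a in attempts)     # condition 1
    all_shuffled = all(a.shuffle_done for a in attempts)   # condition 2
    no_more_speculation = not more_speculative_planned     # condition 3
    return no_reduce_lost and all_shuffled and no_more_speculation
```

For example, with one remaining reduce that survived the failure and has
already finished shuffling, and no further speculative reduces planned,
the check returns True and the lost map outputs would not be needed.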


-- Owen


Maps running after reducers complete successfully?

2008-10-01 Thread pvvpr
Hello,
  The following scenario caused this odd behavior. Is it expected?

  - All map tasks first completed successfully.
  - Then all the reducers except one completed.
  - While the last reduce task was running, one of the TaskTrackers died.
  - This caused all the map tasks that had run on that node to be marked
    as failed.
  - The JobTracker re-assigned these map tasks to other nodes, and their
    status showed as running.
  - The last reduce task finished, so I had reduces at 100% but maps
    still running.

Is it correct to have maps running after the reducers have completed?

- Prasad Pingali.
IIIT Hyderabad.