OK, that explains a lot! Thanks guys! :)
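For anyone finding this thread later: the "preconfigured amount of time" discussed below is the JobTracker's TaskTracker expiry window. A minimal sketch for `mapred-site.xml`, assuming Hadoop 0.20/1.x property names (check `mapred-default.xml` for your version before relying on this):

```
<!-- mapred-site.xml (sketch; property name and default from Hadoop 0.20/1.x) -->
<property>
  <!-- Milliseconds the JobTracker waits without a heartbeat before
       declaring a TaskTracker lost; the shipped default is 600000 (10 min). -->
  <name>mapred.tasktracker.expiry.interval</name>
  <value>600000</value>
</property>
```

Raising this value makes the JobTracker more tolerant of slow heartbeats, at the cost of reacting more slowly to genuinely dead nodes.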

2011/9/29 Joey Echeverria <j...@cloudera.com>

> > The question is: the intermediate (before any reducer) results of
> > completed individual tasks are recorded in HDFS, right? So why are
> > these results discarded, since the loss of the tasktracker is not the
> > loss of already-processed data?
>
> Intermediate results are stored on the local disks and served up via
> an embedded Jetty HTTP server. If the tasktracker goes down, so does
> the embedded HTTP server.
>
> -Joey
>
> On Thu, Sep 29, 2011 at 12:59 PM, Leonardo Gamas
> <leoga...@jusbrasil.com.br> wrote:
> > No, the reducers are fine, or at least I didn't observe any problem.
> >
> > The question is: the intermediate (before any reducer) results of
> > completed individual tasks are recorded in HDFS, right? So why are
> > these results discarded, since the loss of the tasktracker is not the
> > loss of already-processed data?
> >
> > --Leonardo Gamas
> >
> > 2011/9/29 Robert Evans <ev...@yahoo-inc.com>
> >>
> >> If a TaskTracker is lost, then it cannot serve up any Map results to
> >> Reducers that will need them, so the Map tasks have to be rerun. I am
> >> not sure if this is the behavior you are seeing or not. Are completed
> >> Reducers being rerun as well?
> >>
> >> --Bobby Evans
> >>
> >> On 9/29/11 11:15 AM, "Leonardo Gamas" <leoga...@jusbrasil.com.br>
> >> wrote:
> >>
> >> Hi,
> >>
> >> I have a very large MapReduce job, and sometimes a TaskTracker doesn't
> >> send a heartbeat within the preconfigured amount of time, so it's
> >> considered dead. That's OK, but all tasks already finished by this
> >> TaskTracker are lost too, or better explained, are rescheduled and
> >> re-executed by another TaskTracker.
> >>
> >> Is this the default behavior, or am I experiencing some bug or
> >> misconfiguration?
> >>
> >> My regards,
> >>
> >> Leonardo Gamas
> >>
> >>
> >
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
