RE: Job Tracker/Name Node redundancy

2009-01-09 Thread Amar Kamat
Ryan,
From the MR (JobTracker) side, we have failover support.
If a large job is submitted and the JobTracker fails midway, you can restart
the JobTracker on the same host and resume the job. See
https://issues.apache.org/jira/browse/HADOOP-3245 for more details.
Hope that helps.

Amar


-----Original Message-----
From: Ryan LeCompte [mailto:lecom...@gmail.com]
Sent: Fri 1/9/2009 12:09 PM
To: core-user@hadoop.apache.org
Subject: Job Tracker/Name Node redundancy
 
Are there any plans to build redundancy/failover support for the Job
Tracker and Name Node components in Hadoop? Take the following
scenario:

1) A data/CPU-intensive job is submitted to a Hadoop cluster of 10 machines.
2) Halfway through the job execution, the Job Tracker or Name Node fails.
3) We bring up a new Job Tracker or Name Node manually.

-- Will the individual task trackers / data nodes "reconnect" to the
new masters? Or will the job have to be resubmitted? If we had
failover support, we could set up, say, 3 Job Tracker masters and
3 NameNode masters so that if one dies, another would gracefully take
over and start handling results from the child nodes.

Thanks!

Ryan



Re: Job Tracker/Name Node redundancy

2009-01-09 Thread Jeff Hammerbacher
Hey Ryan,

Some specific JIRA tickets that will help narrow your search:

JT: https://issues.apache.org/jira/browse/HADOOP-4586
NN: https://issues.apache.org/jira/browse/HADOOP-4539

Would love to hear your thoughts there!

Regards,
Jeff

On Fri, Jan 9, 2009 at 12:36 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Yes, there is a JIRA issue for a redundant JobTracker already.
> The NN redundancy scenario is mentioned on the Wiki (look for
> SecondaryNameNode).
>
> Otis


Re: Job Tracker/Name Node redundancy

2009-01-09 Thread Otis Gospodnetic
Yes, there is a JIRA issue for a redundant JobTracker already.
The NN redundancy scenario is mentioned on the Wiki (look for 
SecondaryNameNode).

Otis

 --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


