Yes, there is already a JIRA issue for a redundant JobTracker. The NameNode (NN) redundancy scenario is mentioned on the Wiki (look for SecondaryNameNode).

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Ryan LeCompte <lecom...@gmail.com>
> To: "core-user@hadoop.apache.org" <core-user@hadoop.apache.org>
> Sent: Friday, January 9, 2009 3:09:13 PM
> Subject: Job Tracker/Name Node redundancy
>
> Are there any plans to build redundancy/failover support for the Job
> Tracker and Name Node components in Hadoop? Let's take the current
> scenario:
>
> 1) A data/cpu intensive job is submitted to a Hadoop cluster of 10 machines.
> 2) Half-way through the job execution, the Job Tracker or Name Node fails.
> 3) We bring up a new Job Tracker or Name Node manually.
>
> -- Will the individual task trackers / data nodes "reconnect" to the
> new masters? Or will the job have to be resubmitted? If we had
> failover support, we could setup essentially 3 Job Tracker masters and
> 3 NameNode masters so that if one dies the other would gracefully take
> over and start handling results from the children nodes.
>
> Thanks!
>
> Ryan