I found the terminology of primary and secondary a bit confusing when describing operation after a failure. It may help to think of it this way: the Hadoop instance is directed to select one node as primary for normal operation. If that node fails, the backup becomes the new primary. From analyzing traffic, it appears that the restored node does not become primary again until the whole instance restarts. I would welcome clarification on this observed behavior.
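To make the behavior described above concrete, here is a toy sketch in Python. It is not Hadoop code and says nothing about how Hadoop implements this. The class `FailoverPair`, its methods, and the node names `nn1` and `nn2` are all hypothetical. It only models the observation: the backup takes over on failure, and the restored node stays in standby until the whole instance restarts.

```python
class FailoverPair:
    """Toy model of sticky (non-preemptive) failover. Hypothetical, not Hadoop code."""

    def __init__(self, preferred, backup):
        self.preferred = preferred                       # node selected as primary at startup
        self.healthy = {preferred: True, backup: True}   # health flag per node
        self.active = preferred                          # node currently acting as primary

    def fail(self, node):
        """Mark a node as failed; promote a healthy peer if the primary went down."""
        self.healthy[node] = False
        if node == self.active:
            self.active = next((n for n, ok in self.healthy.items() if ok), None)

    def restore(self, node):
        """Bring a node back. It rejoins as standby unless no node is active."""
        self.healthy[node] = True
        if self.active is None:
            self.active = node

    def restart(self):
        """Restart the whole instance: the preferred node is selected again if healthy."""
        if self.healthy[self.preferred]:
            self.active = self.preferred


pair = FailoverPair("nn1", "nn2")
pair.fail("nn1")      # nn2 becomes the new primary
pair.restore("nn1")   # nn1 is back, but nn2 stays primary
pair.restart()        # only now does nn1 become primary again
```

In this model the roles are "sticky": recovery alone never triggers a switch back, which matches what the traffic suggested.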
*.......*
*“Life should not be a journey to the grave with the intention of arriving safely in a pretty and well preserved body, but rather to skid in broadside in a cloud of smoke, thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a Ride!” - Hunter Thompson
Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Fri, Dec 12, 2014 at 7:56 AM, Rich Haase <rha...@pandora.com> wrote:

> The remaining cluster services will continue to run. That way when the
> namenode (or other failed processes) is restored the cluster will resume
> healthy operation. This is part of hadoop’s ability to handle network
> partition events.
>
> *Rich Haase* | Sr. Software Engineer | Pandora
> m 303.887.1146 | rha...@pandora.com
>
> From: Chandrashekhar Kotekar <shekhar.kote...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Friday, December 12, 2014 at 3:57 AM
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: What happens to data nodes when name node has failed for long
> time?
>
> Hi,
>
> What happens if name node has crashed for more than one hour but
> secondary name node, all the data nodes, job tracker, task trackers are
> running fine? Do those daemon services also automatically shutdown after
> some time? Or those services keep running hoping for namenode to come back?
>
> Regards,
> Chandrash3khar Kotekar
> Mobile - +91 8600011455