Hi Amit,

1. Shared storage is used instead of writing directly to the standby so that the cluster stays functional even when the standby is not available. The shared storage is distributed, so it keeps working even if one of the nodes (the standby) fails. This gives the user uninterrupted service.

2. HDFS uses shared storage or JournalNodes to avoid the “split-brain” syndrome, where multiple NameNodes think they’re in charge of the cluster. The JournalNodes allow only one active NameNode to write the edit logs.

For more information, see the HDFS documentation:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

Regards,
Surendra

From: Amit Kabra [mailto:amitkabrai...@gmail.com]
Sent: 27 February 2017 10:29
To: user@hadoop.apache.org
Subject: Journal nodes , QJM requirement

Hi Hadoop Users,

I have one question that I couldn't find an answer to on the internet.

Why does Hadoop need a journaling system? To keep the Active and Standby NN in sync, instead of using JournalNodes or any shared system, can't it do master-slave or multi-master replication, where for any write the master also writes to the other master/slave and commits/accepts the write only once replication has completed at the other sites?

One reason I can think of is that a JournalNode writes data from the NN in append-only mode, which might make it faster than writing to a slave or another master for replication, but I am not sure.

Any pointers?

Thanks,
Amit Kabra.
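
To illustrate the two points in the reply above, here is a minimal Python sketch of quorum writes with epoch-based fencing. It is a simplified model, not Hadoop code: the class names `JournalNode` and `NameNodeWriter`, the `up` flag, and the fixed epoch numbers are assumptions made for illustration. In the real QJM the epoch is negotiated with the JournalNodes and edits are written in log segments with a recovery step, none of which is modelled here.

```python
class JournalNode:
    """Toy stand-in for one JournalNode (hypothetical, not the Hadoop API)."""

    def __init__(self):
        self.up = True            # False simulates a crashed or unreachable node
        self.promised_epoch = 0   # highest writer epoch this node has accepted
        self.edits = []           # append-only list of accepted edits

    def new_epoch(self, epoch):
        """Accept a new writer only if its epoch is higher than any seen so far."""
        if not self.up or epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, edit):
        """Append an edit unless it comes from a writer with a stale epoch."""
        if not self.up or epoch < self.promised_epoch:
            return False
        self.edits.append(edit)
        return True


class NameNodeWriter:
    """Toy NameNode that needs acknowledgements from a majority of nodes."""

    def __init__(self, nodes, epoch):
        self.nodes = nodes
        self.epoch = epoch        # assumed to be assigned externally in this sketch
        self.majority = len(nodes) // 2 + 1

    def become_active(self):
        # Claim the writer role; succeeds only if a majority accepts the epoch.
        acks = sum(node.new_epoch(self.epoch) for node in self.nodes)
        return acks >= self.majority

    def write(self, edit):
        # An edit counts as committed only when a majority has stored it.
        acks = sum(node.journal(self.epoch, edit) for node in self.nodes)
        return acks >= self.majority


def demo():
    nodes = [JournalNode() for _ in range(3)]
    nn1 = NameNodeWriter(nodes, epoch=1)
    nn2 = NameNodeWriter(nodes, epoch=2)
    outcome = {}
    outcome["nn1_active"] = nn1.become_active()
    outcome["write_all_up"] = nn1.write("mkdir /a")
    nodes[2].up = False                               # one node fails
    outcome["write_one_down"] = nn1.write("mkdir /b")
    outcome["nn2_active"] = nn2.become_active()       # failover to the second NameNode
    outcome["stale_nn1_write"] = nn1.write("mkdir /c")
    outcome["nn2_write"] = nn2.write("mkdir /d")
    outcome["node0_log"] = list(nodes[0].edits)
    return outcome
```

In this model, writes keep succeeding with one of three nodes down (point 1), and once the second writer has claimed a higher epoch, the first writer's edits are rejected by the majority, so only one NameNode can extend the log (point 2).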
