Hi Amit,

1. Shared storage is used instead of writing directly to the standby so that 
the cluster stays functional even when the standby is not available. The shared 
storage is distributed, so it keeps working even if one node (such as the 
standby) fails, which gives users uninterrupted service.

2. HDFS uses shared storage or JournalNodes to avoid the “split-brain” 
syndrome, where multiple NameNodes think they’re in charge of the cluster. 
The JournalNodes allow only one active NameNode at a time to write the edit log.
For more info you can check the HDFS documentation:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
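
To make both points concrete, here is a rough toy model in Python, not the actual Hadoop code. It assumes a writer counts an edit as committed once a majority of JournalNodes acknowledge it, and that each JournalNode rejects a writer whose epoch number is lower than the highest one it has promised. The class and method names (JournalNode, Writer, new_epoch, journal) are made up for illustration, and the epoch is passed in by hand here, whereas a real writer would negotiate it with the JournalNodes. Log recovery is left out entirely.

```python
class JournalNode:
    """Toy stand-in for a JournalNode (hypothetical, not the Hadoop class)."""

    def __init__(self):
        self.promised_epoch = 0  # highest writer epoch accepted so far
        self.edits = []          # append-only list of accepted edits
        self.up = True           # set to False to simulate a failed node

    def new_epoch(self, epoch):
        # Accept a new writer only if its epoch beats every epoch seen before.
        if not self.up or epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, edit):
        # Reject edits from a writer whose epoch has been superseded.
        if not self.up or epoch < self.promised_epoch:
            return False
        self.edits.append(edit)
        return True


class Writer:
    """Toy stand-in for a NameNode writing edits through a quorum."""

    def __init__(self, nodes, epoch):
        self.nodes = nodes
        self.epoch = epoch

    def _quorum(self, acks):
        # A strict majority of JournalNodes must acknowledge.
        return acks > len(self.nodes) // 2

    def become_active(self):
        return self._quorum(sum(n.new_epoch(self.epoch) for n in self.nodes))

    def write(self, edit):
        return self._quorum(sum(n.journal(self.epoch, edit) for n in self.nodes))


nodes = [JournalNode() for _ in range(3)]
old_active = Writer(nodes, epoch=1)
old_active.become_active()
nodes[2].up = False                  # one JournalNode fails
old_active.write("mkdir /a")         # still succeeds: 2 of 3 acknowledge

new_active = Writer(nodes, epoch=2)  # failover: the other NameNode takes over
new_active.become_active()
old_active.write("mkdir /b")         # rejected: the old writer is fenced off
```

With three JournalNodes the write path survives one node failure, and once a writer with a higher epoch has been accepted by a majority, the old writer can no longer reach a quorum, so only one NameNode can commit edits at a time.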

Regards
Surendra


From: Amit Kabra [mailto:amitkabrai...@gmail.com]
Sent: 27 February 2017 10:29
To: user@hadoop.apache.org
Subject: Journal nodes , QJM requirement

Hi Hadoop Users,

I have one question that I couldn't find an answer to on the internet.

Why does Hadoop need a journaling system? To sync the Active / Standby NN, 
instead of using JournalNodes or any shared system, can't it do master-slave or 
multi-master replication, where for any write the master also writes to the 
other master/slave and commits / accepts the write only once replication is 
done at the other sites?

One reason I could think of is that the JournalNode writes data from the NN in 
append-only mode, which might make it faster than writing to a slave / another 
master for replication, but I am not sure.

Any pointers?

Thanks,
Amit Kabra.
