Harsh-

Here are all of the other values that I have configured.
hdfs-site.xml
-----------------
dfs.webhdfs.enabled  true
dfs.client.failover.proxy.provider.mycluster  org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.automatic-falover.enabled  true
ha.zookeeper.quorum  nn.domain:2181,snn.domain:2181,jt.domain:2181
dfs.journalnode.edits.dir  /opt/hdfs/data1/dfs/jn
dfs.namenode.shared.edits.dir  qjournal://nn.domain:8485;snn.domain:8485;jt.domain:8485/mycluster
dfs.nameservices  mycluster
dfs.ha.namenodes.mycluster  nn.domain,snn.domain
dfs.namenode.rpc-address.mycluster.nn1  nn.domain:8020
dfs.namenode.rpc-address.mycluster.nn2  snn.domain:8020
dfs.namenode.http-address.mycluster.nn1  nn.domain:50070
dfs.namenode.http-address.mycluster.nn2  snn.domain:50070
dfs.name.dir  /var/lib/hadoop-hdfs/cache/hdfs/dfs/name

core-site.xml
----------------
fs.trash.interval  1440
fs.trash.checkpoint.interval  1440
fs.defaultFS  hdfs://mycluster
dfs.datanode.data.dir  /hdfs/data1,/hdfs/data2,/hdfs/data3,/hdfs/data4,/hdfs/data5,/hdfs/data6,/hdfs/data7

mapred-site.xml
----------------------
mapreduce.framework.name  yarn
mapreduce.jobhistory.address  jt.domain:10020
mapreduce.jobhistory.webapp.address  jt.domain:19888

yarn-site.xml
-------------------
yarn.nodemanager.aux-service  mapreduce.shuffle
yarn.nodemanager.aux-services.mapreduce.shuffle.class  org.apache.hadoop.mapred.ShuffleHandler
yarn.log-aggregation-enable  true
yarn.nodemanager.remote-app-log-dir  /var/log/hadoop-yarn/apps
yarn.application.classpath  $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$YARN_HOME/*,$YARN_HOME/lib/*
yarn.resourcemanager.resource-tracker.address  jt.domain:8031
yarn.resourcemanager.address  jt.domain:8032
yarn.resourcemanager.scheduler.address  jt.domain:8030
yarn.resourcemanager.admin.address  jt.domain:8033
yarn.reesourcemanager.webapp.address  jt.domain:8088

These are the only interesting entries in my HDFS log file when I try to start the
NameNode with "service hadoop-hdfs-namenode start":

WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Configured NNs:
((there's a blank line here implying no configured NameNodes!))
ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.

I don't like the blank line for Configured NNs. I'm not sure why it's not finding them.

If I try the command "hdfs zkfc -formatZK" I get the following:

Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.

-----Original Message-----
From: Smith, Joshua D. [mailto:joshua.sm...@gd-ais.com]
Sent: Tuesday, August 27, 2013 7:17 AM
To: user@hadoop.apache.org
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

Harsh-

Yes, I intend to use HA; that's what I'm trying to configure right now. Unfortunately I cannot share my complete configuration files, because they're on a disconnected network. Are there any configuration items that you'd like me to post my settings for?

The deployment is CDH 4.3 on a brand-new cluster. There are 3 master nodes (NameNode, StandbyNameNode, JobTracker/ResourceManager) and 7 slave nodes.
Each of the master nodes is configured to be a ZooKeeper node as well as a JournalNode. The HA configuration that I'm striving toward is automatic failover with ZooKeeper.

Does that help?

Josh

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Monday, August 26, 2013 6:11 PM
To: <user@hadoop.apache.org>
Subject: Re: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

It is not quite clear from your post, so a Q: do you intend to use HA or not? Can you share your complete core-site.xml and hdfs-site.xml along with a brief note on the deployment?

On Tue, Aug 27, 2013 at 12:48 AM, Smith, Joshua D. <joshua.sm...@gd-ais.com> wrote:
> When I try to start HDFS I get an error in the log that says...
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> java.io.IOException: Invalid configuration: a shared edits dir must
> not be specified if HA is not enabled.
>
> I have the following properties configured as per page 12 of the CDH4
> High Availability Guide...
> http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/PDF/CDH4-High-Availability-Guide.pdf
>
> <property>
>   <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>   <value>nn.domain:8020</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>   <value>snn.domain:8020</value>
> </property>
>
> When I look at the Hadoop source code that generates the error message,
> I can see that it's failing because it's looking for
> dfs.namenode.rpc-address without the suffix. I'm assuming that the
> suffix gets lopped off at some point before it gets pulled up and the
> property is checked for, so maybe I have the suffix wrong?
>
> In any case, I can't get HDFS to start because it's looking for a
> property that I don't have in the truncated form, and it doesn't seem to
> be finding the form of it with the suffix.
> Any assistance would be most appreciated.
>
> Thanks,
> Josh

--
Harsh J
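A likely fix, pieced together from the symptoms above (this is a sketch, not a confirmed resolution): dfs.ha.namenodes.mycluster is set to hostnames (nn.domain,snn.domain), but it should list the logical NameNode IDs that appear as suffixes on the dfs.namenode.rpc-address.mycluster.* keys (nn1, nn2). With hostnames there, the NameNode looks up dfs.namenode.rpc-address.mycluster.nn.domain, finds nothing, reports the empty "Configured NNs:" line, and concludes HA is disabled, which then trips the shared-edits check. The automatic-failover key is also misspelled ("falover"), and the first WARN asks for dfs.namenode.name.dir as a URI. Assuming the nn1/nn2 IDs from the quoted guide, the relevant hdfs-site.xml entries would look roughly like:

```xml
<!-- Logical IDs: must match the suffixes used in
     dfs.namenode.rpc-address.mycluster.nn1 / .nn2, not hostnames -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>

<!-- Note the spelling: "failover", not "falover" -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- Replace the deprecated dfs.name.dir key, expressed as a file:// URI
     to satisfy the Util WARN in the log -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
</property>
```

If this diagnosis is right, the startup log should then show "HA Enabled: true" with both IDs listed under "Configured NNs:", and "hdfs zkfc -formatZK" should no longer refuse to run.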