Settings in core-site.xml are overridden by the same settings in
hdfs-site.xml, mapred-site.xml, and yarn-site.xml. That is how it used to
work, but I don't know whether it has changed since.
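
To make the duplicated-setting point concrete, here is a minimal sketch of a
core-site.xml that agrees with the hdfs-site.xml quoted below. It assumes the
logical nameservice "mycluster" is meant to be the default filesystem; that is
my reading of the quoted HA settings, not something you have confirmed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Assumption: clients should address the HA nameservice "mycluster"
       rather than a single NameNode host such as nn1:8020. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
</configuration>
```

With this value in core-site.xml, the fs.defaultFS entry in hdfs-site.xml is
redundant and could be dropped, so the setting lives in one place only.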

Also look at your dfs.namenode.shared.edits.dir configuration. It is not set
correctly across your NameNodes.
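
As a sketch of what "consistent" would look like: the hdfs-site.xml on every
NameNode would carry the same shared edits URI. The value below is copied from
the config quoted further down and assumes the JournalNodes really run on nn1,
nn2 and nn3 on port 8485.

```xml
<!-- Assumption: the same property, with the same value, appears in
     hdfs-site.xml on every NameNode host. -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://nn1:8485;nn2:8485;nn3:8485/mycluster</value>
</property>
```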

Regards


On Tue, 3 Oct 2023, 1:59 pm Harry Jamison, <harryjamiso...@yahoo.com.invalid>
wrote:

> Liming
>
> After looking at my config, I think my problem may be that fs.defaultFS
> is inconsistent between hdfs-site.xml and core-site.xml.
> What is the difference between hdfs-site.xml and core-site.xml, and why is
> the same setting in two different places?
> Or do I just have it there by mistake?
>
> This is what I have in hdfs-site.xml:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>   <property>
>       <name>fs.defaultFS</name>
>       <value>hdfs://mycluster</value>
>    </property>
>   <property>
>     <name>ha.zookeeper.quorum</name>
>     <value>nn1:2181,nn2:2181,nn3:2181</value>
>   </property>
>
>   <property>
>     <name>dfs.nameservices</name>
>     <value>mycluster</value>
>   </property>
>
>   <property>
>     <name>dfs.ha.namenodes.mycluster</name>
>     <value>nn1,nn2,nn3</value>
>   </property>
>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>     <value>nn1:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>     <value>nn2:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn3</name>
>     <value>nn3:8020</value>
>   </property>
>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>     <value>nn1:9870</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>     <value>nn2:9870</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn3</name>
>     <value>nn3:9870</value>
>   </property>
>
>   <property>
>     <name>dfs.namenode.shared.edits.dir</name>
>     <value>qjournal://nn1:8485;nn2:8485;nn3:8485/mycluster</value>
>   </property>
>   <property>
>     <name>dfs.client.failover.proxy.provider.mycluster</name>
>     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>   </property>
>
>   <property>
>     <name>dfs.ha.fencing.methods</name>
>     <value>sshfence</value>
>   </property>
>
>   <property>
>     <name>dfs.ha.fencing.ssh.private-key-files</name>
>     <value>/home/harry/.ssh/id_rsa</value>
>   </property>
>
>   <property>
>     <name>dfs.namenode.name.dir</name>
>     <value>file:/hadoop/data/hdfs/namenode</value>
>   </property>
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>file:/hadoop/data/hdfs/datanode</value>
>   </property>
>   <property>
>     <name>dfs.journalnode.edits.dir</name>
>     <value>/hadoop/data/hdfs/journalnode</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address</name>
>     <value>nn1:8020</value>
>   </property>
>
>   <property>
>     <name>dfs.ha.nn.not-become-active-in-safemode</name>
>     <value>true</value>
>   </property>
>
> </configuration>
>
>
>
> In core-site.xml I have this
>
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <!--
>   Licensed under the Apache License, Version 2.0 (the "License");
>   you may not use this file except in compliance with the License.
>   You may obtain a copy of the License at
>
>     http://www.apache.org/licenses/LICENSE-2.0
>
>   Unless required by applicable law or agreed to in writing, software
>   distributed under the License is distributed on an "AS IS" BASIS,
>   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>   See the License for the specific language governing permissions and
>   limitations under the License. See accompanying LICENSE file.
> -->
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://nn1:8020</value>
>   </property>
> </configuration>
>
>
> On Tuesday, October 3, 2023 at 12:54:26 AM PDT, Liming Cui <
> anyone.cui...@gmail.com> wrote:
>
>
> Can you show us the configuration files?
> Maybe I can help you with some suggestions.
>
>
> On Tue, Oct 3, 2023 at 9:05 AM Harry Jamison
> <harryjamiso...@yahoo.com.invalid> wrote:
>
> I am trying to set up an HA HDFS cluster, and I am running into a problem.
>
> I am not sure what I am doing wrong. I thought I followed the HA NameNode
> guide, but it is not working.
>
>
> Apache Hadoop 3.3.6 – HDFS High Availability
> <https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html>
>
>
>
> I have 2 namenodes and 3 journal nodes, and 3 zookeeper nodes.
>
> After some period of time I see the following and my namenode and journal
> node die.
> I am not sure where the problem is, or how to diagnose what I am doing
> wrong here.  And the logging here does not make sense to me.
>
> Namenode
> Serving checkpoints at http://nn1:9870
> (org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer)
>
> real-time non-blocking time  (microseconds, -R) unlimited
> core file size              (blocks, -c) 0
> data seg size               (kbytes, -d) unlimited
> scheduling priority                 (-e) 0
> file size                   (blocks, -f) unlimited
> pending signals                     (-i) 15187
> max locked memory           (kbytes, -l) 8192
> max memory size             (kbytes, -m) unlimited
> open files                          (-n) 1024
> pipe size                (512 bytes, -p) 8
> POSIX message queues         (bytes, -q) 819200
> real-time priority                  (-r) 0
> stack size                  (kbytes, -s) 8192
> cpu time                   (seconds, -t) unlimited
> max user processes                  (-u) 15187
> virtual memory              (kbytes, -v) unlimited
> file locks                          (-x) unlimited
>
> [2023-10-02 23:53:46,693] ERROR RECEIVED SIGNAL 15: SIGTERM
> (org.apache.hadoop.hdfs.server.namenode.NameNode)
> [2023-10-02 23:53:46,701] INFO SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.1.159
> ************************************************************/
> (org.apache.hadoop.hdfs.server.namenode.NameNode)
>
> JournalNode
> [2023-10-02 23:54:19,162] WARN Journal at nn1/192.168.1.159:8485 has no
> edit logs (org.apache.hadoop.hdfs.qjournal.server.JournalNodeSyncer)
>
> real-time non-blocking time  (microseconds, -R) unlimited
> core file size              (blocks, -c) 0
> data seg size               (kbytes, -d) unlimited
> scheduling priority                 (-e) 0
> file size                   (blocks, -f) unlimited
> pending signals                     (-i) 15187
> max locked memory           (kbytes, -l) 8192
> max memory size             (kbytes, -m) unlimited
> open files                          (-n) 1024
> pipe size                (512 bytes, -p) 8
> POSIX message queues         (bytes, -q) 819200
> real-time priority                  (-r) 0
> stack size                  (kbytes, -s) 8192
> cpu time                   (seconds, -t) unlimited
> max user processes                  (-u) 15187
> virtual memory              (kbytes, -v) unlimited
> file locks                          (-x) unlimited
>
>
>
>
> --
> *Best*
>
> Liming
>
