Hi all,
I am currently experimenting with running a HA HDFS cluster in docker
containers. I have successfully created an HA cluster using Ubuntu as my
base image for running the namenode, datanode and journalnodes. The
zookeeper instance runs on an image built using Alpine linux as the base
and works pretty well.
I attempted to get the namenode, datanode and journalnodes running using
Alpine linux as the base image. The datanode and journalnodes seem to
work fine. However, while the namenodes start correctly, they seem to
disconnect from Zookeeper quite often and will transition into neutral
mode. This results in the namenodes being in "startup mode" forever.
These are the logs from the active namenode:
16/08/25 22:40:28 INFO blockmanagement.CacheReplicationMonitor: Starting
CacheReplicationMonitor with interval 30000 milliseconds
16/08/25 22:40:28 INFO fs.TrashPolicyDefault: Namenode trash
configuration: Deletion interval = 1440 minutes, Emptier interval = 0
minutes.
16/08/25 22:40:28 INFO fs.TrashPolicyDefault: The configured checkpoint
interval is 0 minutes. Using an interval of 1440 minutes that is used
for deletion instead
16/08/25 22:40:28 INFO blockmanagement.BlockManager: Total number of
blocks = 0
16/08/25 22:40:28 INFO blockmanagement.BlockManager: Number of invalid
blocks = 0
16/08/25 22:40:28 INFO blockmanagement.BlockManager: Number of
under-replicated blocks = 0
16/08/25 22:40:28 INFO blockmanagement.BlockManager: Number of
over-replicated blocks = 0
16/08/25 22:40:28 INFO blockmanagement.BlockManager: Number of blocks
being written = 0
16/08/25 22:40:28 INFO hdfs.StateChange: STATE* Replication Queue
initialization scan for invalid, over- and under-replicated blocks
completed in 687 msec
16/08/25 22:40:28 INFO ha.ZKFailoverController: Successfully
transitioned NameNode at m9edd51-nn1.m9edd51/172.18.0.7:8020 to active state
16/08/25 22:40:30 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
mkdir: `/tmp': File exists
16/08/25 22:40:32 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 0
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command mkdirs is: 0
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command listStatus is: 0
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command * is: 0
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command setPermission is: 0
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command mkdirs is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command listStatus is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command * is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command setPermission is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command mkdirs is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command listStatus is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command * is: 1
16/08/25 22:41:36 INFO window.RollingWindowManager: topN size for
command setPermission is: 1
16/08/25 22:41:40 INFO zookeeper.ClientCnxn: Unable to read additional
data from server sessionid 0x156c3dc679a0003, likely server has closed
socket, closing socket connection and attempting reconnect
16/08/25 22:41:40 INFO ha.ActiveStandbyElector: Session disconnected.
Entering neutral mode...
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Opening socket connection
to server m9edd51-zookeeper.m9edd51/172.18.0.2:2181. Will not attempt to
authenticate using SASL (unknown error)
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Socket connection
established to m9edd51-zookeeper.m9edd51/172.18.0.2:2181, initiating session
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Session establishment
complete on server m9edd51-zookeeper.m9edd51/172.18.0.2:2181, sessionid
= 0x156c3dc679a0003, negotiated timeout = 5000
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: Session connected.
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: Checking for any old
active which needs to be fenced...
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: Old node exists:
0a036e733112036e6e311a136d3965646435312d6e6e312e6d39656464353120d43e28d33e
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: But old node has our own
data, so don't need to fence it.
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: Writing znode
/mycluster/ns1/ActiveBreadCrumb to indicate that the local node is the
most recent active...
16/08/25 22:41:41 INFO ha.ZKFailoverController: Trying to make NameNode
at m9edd51-nn1.m9edd51/172.18.0.7:8020 active...
16/08/25 22:41:41 INFO ha.ZKFailoverController: Successfully
transitioned NameNode at m9edd51-nn1.m9edd51/172.18.0.7:8020 to active state
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 0
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command mkdirs is: 0
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command listStatus is: 0
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command * is: 0
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command setPermission is: 0
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command mkdirs is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command listStatus is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command * is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command setPermission is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command getfileinfo is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command mkdirs is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command listStatus is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command * is: 1
16/08/25 22:41:57 INFO window.RollingWindowManager: topN size for
command setPermission is: 1
16/08/25 22:42:28 INFO namenode.FSNamesystem: Roll Edit Log from 172.18.0.8
16/08/25 22:42:28 INFO namenode.FSEditLog: Rolling edit logs
16/08/25 22:42:28 INFO namenode.FSEditLog: Ending log segment 1
16/08/25 22:42:28 INFO namenode.FSEditLog: Number of transactions: 4
Total time for transactions(ms): 16 Number of transactions batched in
Syncs: 0 Number of syncs: 3 SyncTimes(ms): 40 17
16/08/25 22:42:28 INFO namenode.FSEditLog: Number of transactions: 4
Total time for transactions(ms): 16 Number of transactions batched in
Syncs: 0 Number of syncs: 4 SyncTimes(ms): 50 18
16/08/25 22:42:28 INFO namenode.FileJournalManager: Finalizing edits
file /var/lib/hadoop/name/current/edits_inprogress_0000000000000000001
->
/var/lib/hadoop/name/current/edits_0000000000000000001-0000000000000000004
16/08/25 22:42:28 INFO namenode.FSEditLog: Starting log segment at 5
From the standby namenode:
16/08/25 22:40:33 INFO ha.HealthMonitor: Entering state SERVICE_HEALTHY
16/08/25 22:40:33 INFO ha.ZKFailoverController: Local service NameNode
at m9edd51-nn2.m9edd51/172.18.0.8:8020 entered state: SERVICE_HEALTHY
16/08/25 22:40:33 INFO ha.ZKFailoverController: ZK Election indicated
that NameNode at m9edd51-nn2.m9edd51/172.18.0.8:8020 should become standby
16/08/25 22:40:33 INFO ha.ZKFailoverController: Successfully
transitioned NameNode at m9edd51-nn2.m9edd51/172.18.0.8:8020 to standby
state
16/08/25 22:41:40 INFO zookeeper.ClientCnxn: Unable to read additional
data from server sessionid 0x156c3dc679a0004, likely server has closed
socket, closing socket connection and attempting reconnect
16/08/25 22:41:40 INFO ha.ActiveStandbyElector: Session disconnected.
Entering neutral mode...
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Opening socket connection
to server m9edd51-zookeeper.m9edd51/172.18.0.2:2181. Will not attempt to
authenticate using SASL (unknown error)
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Socket connection
established to m9edd51-zookeeper.m9edd51/172.18.0.2:2181, initiating session
16/08/25 22:41:41 INFO zookeeper.ClientCnxn: Session establishment
complete on server m9edd51-zookeeper.m9edd51/172.18.0.2:2181, sessionid
= 0x156c3dc679a0004, negotiated timeout = 5000
16/08/25 22:41:41 INFO ha.ActiveStandbyElector: Session connected.
16/08/25 22:41:41 INFO ha.ZKFailoverController: ZK Election indicated
that NameNode at m9edd51-nn2.m9edd51/172.18.0.8:8020 should become standby
16/08/25 22:41:41 INFO ha.ZKFailoverController: Successfully
transitioned NameNode at m9edd51-nn2.m9edd51/172.18.0.8:8020 to standby
state
16/08/25 22:42:28 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:42:28 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@3b1f16f0
expecting start txid #1
16/08/25 22:42:28 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:42:28 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 1
16/08/25 22:42:28 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 1
16/08/25 22:42:29 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=1&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 133 edits # 4 loaded in 0 seconds
16/08/25 22:42:29 INFO ha.EditLogTailer: Loaded 4 edits starting from txid 0
16/08/25 22:44:29 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:44:29 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c92162
expecting start txid #5
16/08/25 22:44:29 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:44:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 5
16/08/25 22:44:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 5
16/08/25 22:44:29 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=5&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds
16/08/25 22:44:29 INFO ha.EditLogTailer: Loaded 2 edits starting from txid 4
16/08/25 22:46:29 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:46:29 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@1f890896
expecting start txid #7
16/08/25 22:46:29 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:46:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 7
16/08/25 22:46:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 7
16/08/25 22:46:29 INFO namenode.FSImage: Edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=7&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds
16/08/25 22:46:29 INFO ha.EditLogTailer: Loaded 2 edits starting from txid 6
16/08/25 22:48:29 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:48:29 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@4da07bde
expecting start txid #9
16/08/25 22:48:29 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:48:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 9
16/08/25 22:48:29 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 9
16/08/25 22:48:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=9&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds
16/08/25 22:48:30 INFO ha.EditLogTailer: Loaded 2 edits starting from txid 8
16/08/25 22:50:30 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:50:30 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@350cea58
expecting start txid #11
16/08/25 22:50:30 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:50:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 11
16/08/25 22:50:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 11
16/08/25 22:50:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=11&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds
16/08/25 22:50:30 INFO ha.EditLogTailer: Loaded 2 edits starting from
txid 10
16/08/25 22:52:30 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:52:30 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6a6c8f51
expecting start txid #13
16/08/25 22:52:30 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:52:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 13
16/08/25 22:52:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 13
16/08/25 22:52:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=13&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds 16/08/25 22:52:30 INFO
ha.EditLogTailer: Loaded 2 edits starting from txid 12
16/08/25 22:54:30 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:54:30 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@8212238
expecting start txid #15
16/08/25 22:54:30 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:54:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 15
16/08/25 22:54:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 15
16/08/25 22:54:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn3.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=15&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds
16/08/25 22:54:30 INFO ha.EditLogTailer: Loaded 2 edits starting from
txid 14
16/08/25 22:56:30 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:56:30 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@58d1e7d8
expecting start txid #17
16/08/25 22:56:30 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:56:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 17
16/08/25 22:56:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 17
16/08/25 22:56:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=17&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds 16/08/25 22:56:30 INFO
ha.EditLogTailer: Loaded 2 edits starting from txid 16
16/08/25 22:58:30 INFO ha.EditLogTailer: Triggering log roll on remote
NameNode m9edd51-nn1.m9edd51/172.18.0.7:8020
16/08/25 22:58:30 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@791c80ff
expecting start txid #19
16/08/25 22:58:30 INFO namenode.FSImage: Start loading edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster
16/08/25 22:58:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 19
16/08/25 22:58:30 INFO namenode.EditLogInputStream: Fast-forwarding
stream
'http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster'
to transaction ID 19
16/08/25 22:58:30 INFO namenode.FSImage: Edits file
http://m9edd51-jn1.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster,
http://m9edd51-jn2.m9edd51:8480/getJournal?jid=ns1&segmentTxId=19&storageInfo=-63%3A2072890455%3A0%3Amycluster
of size 42 edits # 2 loaded in 0 seconds 16/08/25 22:58:30 INFO
ha.EditLogTailer: Loaded 2 edits starting from txid 18
Any ideas where I can start to debug this? This seems to be a problem
specific to the zookeeper failover controller. I tried this docker image
that starts a standalone cluster with 2 namenodes, datanodes and journal
nodes, but no zookeeper failover controller and it works correctly on
Alpline linux: https://github.com/gustavonalle/yarn-docker
Cheers,
Francis
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org