[
https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408418#comment-13408418
]
Marshall McMullen commented on ZOOKEEPER-1453:
----------------------------------------------
OK, reproduced this again.... This time I made sure to capture a tar file of
the data directory on each node and there is now a zookeeper_trace.log file in
each tar file. In this scenario, I had a 3-node ensemble {10.10.5.42,
10.10.5.44, 10.10.5.123} and I rebooted 10.10.5.42 via issuing "reboot -f" on
that node. When the node comes back up, it spews the following log message as
before. Around 16:35 it is rebooted, and comes back around 16:38.
2012-07-06 16:35:51,512 [myid:3] - DEBUG
[CommitProcessor:3:FinalRequestProcessor@158][] - sessionid:0x3385e6145f2000b
type:setData cxid:0x4ff77654 zxid:0x100000d4f txntype:5 reqpath:n/a
...[REBOOTED]...
2012-07-06 16:38:38,026 [myid:] - INFO [main:QuorumPeerConfig@99][] - Reading
configuration from: /sf/data/zookeeper/10.10.5.42/10.10.5.42_2181.cfg
2012-07-06 16:38:38,038 [myid:2] - INFO [main:DatadirCleanupManager@78][] -
autopurge.snapRetainCount set to 5
2012-07-06 16:38:38,038 [myid:2] - INFO [main:DatadirCleanupManager@79][] -
autopurge.purgeInterval set to 1
2012-07-06 16:38:38,039 [myid:2] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@138][] - Purge task started.
2012-07-06 16:38:38,045 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:hiearchy=default
2012-07-06 16:38:38,045 [myid:2] - DEBUG [main:HierarchyDynamicMBean@260][] -
postRegister is called.
2012-07-06 16:38:38,046 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=root
2012-07-06 16:38:38,047 [myid:2] - DEBUG [main:LoggerDynamicMBean@258][] -
Adding AppenderMBean for appender named CONSOLE
2012-07-06 16:38:38,048 [myid:2] - DEBUG [main:AppenderDynamicMBean@158][] -
getMBeanInfo called.
2012-07-06 16:38:38,048 [myid:2] - DEBUG [main:AppenderDynamicMBean@337][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:appender=CONSOLE
2012-07-06 16:38:38,048 [myid:2] - DEBUG [main:AppenderDynamicMBean@197][] -
Adding LayoutMBean:CONSOLE,layout=org.apache.log4j.PatternLayout
2012-07-06 16:38:38,049 [myid:2] - DEBUG [main:LayoutDynamicMBean@140][] -
getMBeanInfo called.
2012-07-06 16:38:38,049 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:appender=CONSOLE,layout=org.apache.log4j.PatternLayout
2012-07-06 16:38:38,049 [myid:2] - DEBUG [main:LoggerDynamicMBean@258][] -
Adding AppenderMBean for appender named TRACEFILE
2012-07-06 16:38:38,050 [myid:2] - DEBUG [main:AppenderDynamicMBean@158][] -
getMBeanInfo called.
2012-07-06 16:38:38,050 [myid:2] - DEBUG [main:AppenderDynamicMBean@337][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:appender=TRACEFILE
2012-07-06 16:38:38,051 [myid:2] - DEBUG [main:AppenderDynamicMBean@197][] -
Adding LayoutMBean:TRACEFILE,layout=org.apache.log4j.PatternLayout
2012-07-06 16:38:38,052 [myid:2] - DEBUG [main:LayoutDynamicMBean@140][] -
getMBeanInfo called.
2012-07-06 16:38:38,052 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:appender=TRACEFILE,layout=org.apache.log4j.PatternLayout
2012-07-06 16:38:38,052 [myid:2] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@144][] - Purge task completed.
2012-07-06 16:38:38,052 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [root] as listener.
2012-07-06 16:38:38,052 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.persistence.FileSnap
2012-07-06 16:38:38,053 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.persistence.FileSnap] as listener.
2012-07-06 16:38:38,053 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.log4j.jmx.LoggerDynamicMBean
2012-07-06 16:38:38,053 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.log4j.jmx.LoggerDynamicMBean] as listener.
2012-07-06 16:38:38,053 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.jmx.MBeanRegistry
2012-07-06 16:38:38,053 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.jmx.MBeanRegistry] as listener.
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.persistence.FileTxnSnapLog
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.persistence.FileTxnSnapLog] as
listener.
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.quorum.QuorumPeerConfig
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.quorum.QuorumPeerConfig] as
listener.
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.log4j.jmx.AppenderDynamicMBean
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.log4j.jmx.AppenderDynamicMBean] as listener.
2012-07-06 16:38:38,054 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.persistence.Util
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.persistence.Util] as listener.
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.quorum.QuorumPeerMain
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.quorum.QuorumPeerMain] as
listener.
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.log4j.jmx.HierarchyDynamicMBean
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.log4j.jmx.HierarchyDynamicMBean] as listener.
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.DatadirCleanupManager
2012-07-06 16:38:38,055 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.DatadirCleanupManager] as
listener.
2012-07-06 16:38:38,056 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.log4j.jmx.LayoutDynamicMBean
2012-07-06 16:38:38,056 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.log4j.jmx.LayoutDynamicMBean] as listener.
2012-07-06 16:38:38,056 [myid:2] - DEBUG [main:AbstractDynamicMBean@148][] -
preRegister called. Server=com.sun.jmx.mbeanserver.JmxMBeanServer@49b979c0,
name=log4j:logger=org.apache.zookeeper.server.persistence.FileTxnLog
2012-07-06 16:38:38,056 [myid:2] - DEBUG [main:HierarchyDynamicMBean@128][] -
---Adding logger [org.apache.zookeeper.server.persistence.FileTxnLog] as
listener.
2012-07-06 16:38:38,056 [myid:2] - INFO [main:QuorumPeerMain@131][] - Starting
quorum peer
2012-07-06 16:38:38,072 [myid:2] - INFO [main:NIOServerCnxnFactory@108][] -
binding to port /10.10.5.42:2181
2012-07-06 16:38:38,080 [myid:2] - INFO [main:QuorumPeer@1107][] - tickTime
set to 2000
2012-07-06 16:38:38,080 [myid:2] - INFO [main:QuorumPeer@1127][] -
minSessionTimeout set to -1
2012-07-06 16:38:38,080 [myid:2] - INFO [main:QuorumPeer@1138][] -
maxSessionTimeout set to -1
2012-07-06 16:38:38,081 [myid:2] - INFO [main:QuorumPeer@1153][] - initLimit
set to 10
2012-07-06 16:38:38,093 [myid:2] - INFO [main:FileSnap@83][] - Reading
snapshot /sf/data/zookeeper/10.10.5.42/version-2/snapshot.0
2012-07-06 16:38:38,097 [myid:2] - DEBUG
[main:FileTxnLog$FileTxnIterator@575][] - Created new input stream
/sf/data/zookeeper/10.10.5.42/version-2/log.100000001
2012-07-06 16:38:38,097 [myid:2] - DEBUG
[main:FileTxnLog$FileTxnIterator@578][] - Created new input archive
/sf/data/zookeeper/10.10.5.42/version-2/log.100000001
2012-07-06 16:38:38,124 [myid:2] - DEBUG [main:DataTree@951][] - Ignoring
processTxn failure hdr: -1 : error: -110
2012-07-06 16:38:38,124 [myid:2] - DEBUG [main:FileTxnSnapLog@241][] - Ignoring
processTxn failure hdr: -1 : error: -110
2012-07-06 16:38:38,124 [myid:2] - DEBUG [main:DataTree@951][] - Ignoring
processTxn failure hdr: -1 : error: -110
2012-07-06 16:38:38,124 [myid:2] - DEBUG [main:FileTxnSnapLog@241][] - Ignoring
processTxn failure hdr: -1 : error: -110
2012-07-06 16:38:38,126 [myid:2] - DEBUG [main:DataTree@951][] - Ignoring
processTxn failure hdr: -1 : error: -110
2012-07-06 16:38:38,126 [myid:2] - DEBUG [main:FileTxnSnapLog@241][] - Ignoring
processTxn failure hdr: -1 : error: -110
... [ REPEATS FOR A LONG TIME ] ...
2012-07-06 16:38:38,432 [myid:2] - DEBUG
[main:FileTxnLog$FileTxnIterator@618][] - EOF excepton java.io.EOFException:
Failed to read /sf/data/zookeeper/10.10.5.42/version-2/log.100000001
2012-07-06 16:38:38,437 [myid:2] - INFO
[QuorumPeerListener:QuorumCnxManager$Listener@530][] - My election bind port:
/10.10.5.42:2183
2012-07-06 16:38:38,443 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumPeer@825][] - Starting quorum peer
2012-07-06 16:38:38,447 [myid:2] - INFO
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumPeer@860][] - LOOKING
2012-07-06 16:38:38,447 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumPeer@789][] - Initializing leader
election protocol...
2012-07-06 16:38:38,448 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@732][] - Updating
proposal: 2 (newleader), 0x100000d54 (newzxid), -1 (oldleader),
0xffffffffffffffff (oldzxid)
2012-07-06 16:38:38,449 [myid:2] - INFO
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@831][] - New election.
My id = 2, proposed zxid=0x100000d54
2012-07-06 16:38:38,449 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 1 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,450 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 2 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,450 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 3 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,450 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@375][] - Opening channel to server 1
2012-07-06 16:38:38,453 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@381][] - Connected to server 1
2012-07-06 16:38:38,454 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager$SendWorker@616][] - Address of remote
peer: 1
2012-07-06 16:38:38,456 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@375][] - Opening channel to server 3
2012-07-06 16:38:38,456 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@381][] - Connected to server 3
2012-07-06 16:38:38,456 [myid:2] - INFO
[WorkerSender[myid=2]:QuorumCnxManager@191][] - Have smaller server identifier,
so dropping the connection: (3, 2)
2012-07-06 16:38:38,457 [myid:2] - DEBUG
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@295][] -
Receive new notification message. My id = 2
2012-07-06 16:38:38,457 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@635][] - Notification: 2 (n.leader),
0x100000d54 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1
(n.peerEPoch), LOOKING (my state)0 (n.config version)
2012-07-06 16:38:38,458 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@652][] - id: 2, proposed
id: 2, zxid: 0x100000d54, proposed zxid: 0x100000d54
2012-07-06 16:38:38,458 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@901][] - Adding vote:
from=2, proposed leader=2, proposed zxid=0x100000d54, proposed election
epoch=0x1
2012-07-06 16:38:38,659 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumCnxManager@444][] - Queue size: 0
2012-07-06 16:38:38,659 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 1 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,659 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 2 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,659 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@408][] - There is a connection already
for server 1
2012-07-06 16:38:38,659 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 3 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:38,659 [myid:2] - INFO
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@865][] - Notification
time out: 400
2012-07-06 16:38:38,660 [myid:2] - DEBUG
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@295][] -
Receive new notification message. My id = 2
2012-07-06 16:38:38,660 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@375][] - Opening channel to server 3
2012-07-06 16:38:38,660 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@381][] - Connected to server 3
2012-07-06 16:38:38,661 [myid:2] - INFO
[WorkerSender[myid=2]:QuorumCnxManager@191][] - Have smaller server identifier,
so dropping the connection: (3, 2)
2012-07-06 16:38:38,661 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@635][] - Notification: 2 (n.leader),
0x100000d54 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1
(n.peerEPoch), LOOKING (my state)0 (n.config version)
2012-07-06 16:38:38,661 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@652][] - id: 2, proposed
id: 2, zxid: 0x100000d54, proposed zxid: 0x100000d54
2012-07-06 16:38:38,661 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@901][] - Adding vote:
from=2, proposed leader=2, proposed zxid=0x100000d54, proposed election
epoch=0x1
2012-07-06 16:38:39,061 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumCnxManager@444][] - Queue size: 0
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 1 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@408][] - There is a connection already
for server 1
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 2 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 3 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,062 [myid:2] - INFO
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@865][] - Notification
time out: 800
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@295][] -
Receive new notification message. My id = 2
2012-07-06 16:38:39,062 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@375][] - Opening channel to server 3
2012-07-06 16:38:39,063 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@381][] - Connected to server 3
2012-07-06 16:38:39,063 [myid:2] - INFO
[WorkerSender[myid=2]:QuorumCnxManager@191][] - Have smaller server identifier,
so dropping the connection: (3, 2)
2012-07-06 16:38:39,063 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@635][] - Notification: 2 (n.leader),
0x100000d54 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1
(n.peerEPoch), LOOKING (my state)0 (n.config version)
2012-07-06 16:38:39,064 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@652][] - id: 2, proposed
id: 2, zxid: 0x100000d54, proposed zxid: 0x100000d54
2012-07-06 16:38:39,064 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@901][] - Adding vote:
from=2, proposed leader=2, proposed zxid=0x100000d54, proposed election
epoch=0x1
2012-07-06 16:38:39,864 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:QuorumCnxManager@444][] - Queue size: 0
2012-07-06 16:38:39,864 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 1 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,864 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@408][] - There is a connection already
for server 1
2012-07-06 16:38:39,864 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 2 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,864 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@583][] - Sending
Notification: 2 (n.leader), 0x100000d54 (n.zxid), 0x1 (n.round), 3 (recipient),
2 (myid), 0x1 (n.peerEpoch)
2012-07-06 16:38:39,864 [myid:2] - INFO
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@865][] - Notification
time out: 1600
2012-07-06 16:38:39,865 [myid:2] - DEBUG
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@295][] -
Receive new notification message. My id = 2
2012-07-06 16:38:39,865 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@375][] - Opening channel to server 3
2012-07-06 16:38:39,865 [myid:2] - DEBUG
[WorkerSender[myid=2]:QuorumCnxManager@381][] - Connected to server 3
2012-07-06 16:38:39,865 [myid:2] - INFO
[WorkerSender[myid=2]:QuorumCnxManager@191][] - Have smaller server identifier,
so dropping the connection: (3, 2)
2012-07-06 16:38:39,866 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@635][] - Notification: 2 (n.leader),
0x100000d54 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1
(n.peerEPoch), LOOKING (my state)0 (n.config version)
2012-07-06 16:38:39,866 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@652][] - id: 2, proposed
id: 2, zxid: 0x100000d54, proposed zxid: 0x100000d54
2012-07-06 16:38:39,866 [myid:2] - DEBUG
[QuorumPeer[myid=2]/10.10.5.42:2181:FastLeaderElection@901][] - Adding vote:
from=2, proposed leader=2, proposed zxid=0x100000d54, proposed election
epoch=0x1
2012-07-06 16:38:41,259 [myid:2] - INFO
[NIOServerCxn.Factory:/10.10.5.42:2181:NIOServerCnxnFactory@227][] - Accepted
socket connection from /10.10.5.44:50944
2012-07-06 16:38:41,263 [myid:2] - WARN
[NIOServerCxn.Factory:/10.10.5.42:2181:NIOServerCnxn@354][] - Exception causing
close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2012-07-06 16:38:41,264 [myid:2] - DEBUG
[NIOServerCxn.Factory:/10.10.5.42:2181:NIOServerCnxn@358][] - IOException stack
trace
java.io.IOException: ZooKeeperServer not running
at
org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:926)
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:229)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:238)
at java.lang.Thread.run(Thread.java:722)
Thereafter I see " Exception causing close of session 0x0 due to
java.io.IOException: ZooKeeperServer not running" over and over. See full trace
files on all three nodes in the attachments.
I really need help on this, so anything anyone can offer will be immensely
appreciated.
> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
> Key: ZOOKEEPER-1453
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.3
> Reporter: Patrick Hunt
> Priority: Critical
> Attachments: 10.10.5.123.tar.gz, 10.10.5.42.tar.gz, 10.10.5.44.tar.gz
>
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that
> during server recovery
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next()
> does not indicate if the available logs are valid or not. In some cases (say
> a truncated record and a single txnlog in the datadir) we will not detect
> that the file is corrupt, vs reaching the end of the file.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira