[
https://issues.apache.org/jira/browse/ZOOKEEPER-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ramnatthan Alagappan updated ZOOKEEPER-2553:
--------------------------------------------
Description:
I am running a three node ZooKeeper cluster.
When a new log file is created by ZooKeeper, I see the following sequence of
system calls:
1. creat(new_log)
2. write(new_log, count=16) // This is a log header I believe/
3. truncate(new_log, from 16 bytes to 16 KBytes) // I have configured the log
size to be 16K.
When the above sequence of operations complete, it is reasonable to expect the
newly created log file to contain the header(16 bytes) and then filled with
zeros till the end of the log.
But when a crash occurs (due to a power failure), while the truncate system
call is in progress, it is possible for the log to contain garbage data when
the system restarts from the crash. Note that if the crash occurs just after
the truncate system call completes, then there is no problem. Basically, the
truncate needs to be atomically persisted for ZooKeeper to recover from crashes
correctly or (more realistically) the recovery code needs to deal with the
case of expecting garbage in a newly created log.
As mentioned, if a crash occurs during the truncate system call, then ZooKeeper
will fail to start with the following exception. Here is the stack trace:
java.io.IOException: Unreasonable length = -295704495
at
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
at
org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
at
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
[myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting
abnormally
java.lang.RuntimeException: Unable to run quorum server
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
at
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.io.IOException: Unreasonable length = -295704495
at
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
at
org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
... 4 more
Next, it is possible for two nodes of a 3-node ZooKeeper cluster to reach the
same state. In that case, they both will fail to startup, rendering the entire
cluster unavailable.
was:
I am running a three node ZooKeeper cluster.
When a new log file is created by ZooKeeper, I see the following sequence of
system calls:
1. creat(new_log)
2. write(new_log, count=16) // This is a log header I believe/
3. truncate(new_log, from 16 bytes to 16 KBytes) // I have configured the log
size to be 16K.
When the above sequence of operations complete, it is reasonable to expect the
newly created log file to contain the header(16 bytes) and then filled with
zeros till the end of the log.
But when a crash occurs (due to a power failure), while the truncate system
call is in progress, it is possible for the log to contain garbage data when
the system restarts from a crash. Note that if the crash occurs just after the
truncate system call completes, then there is no problem. Basically, the
truncate needs to be atomically persisted for ZooKeeper to recover from crashes
correctly.
As mentioned, if a crash occurs during the truncate system call, then ZooKeeper
will fail to start with the following exception. Here is the stack trace:
java.io.IOException: Unreasonable length = -295704495
at
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
at
org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
at
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
[myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting
abnormally
java.lang.RuntimeException: Unable to run quorum server
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
at
org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.io.IOException: Unreasonable length = -295704495
at
org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
at
org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
at
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
at
org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
... 4 more
Next, it is possible for two nodes of a 3-node ZooKeeper cluster to reach the
same state. In that case, they both will fail to startup, rendering the entire
cluster unavailable.
> ZooKeeper cluster unavailable due to corrupted log file during power failures
> -- java.io.IOException: Unreasonable length
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2553
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2553
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.8
> Environment: Normal ZooKeeper cluster with 3 nodes running Linux
> Reporter: Ramnatthan Alagappan
>
> I am running a three node ZooKeeper cluster.
> When a new log file is created by ZooKeeper, I see the following sequence of
> system calls:
> 1. creat(new_log)
> 2. write(new_log, count=16) // This is a log header I believe/
> 3. truncate(new_log, from 16 bytes to 16 KBytes) // I have configured the log
> size to be 16K.
> When the above sequence of operations complete, it is reasonable to expect
> the newly created log file to contain the header(16 bytes) and then filled
> with zeros till the end of the log.
> But when a crash occurs (due to a power failure), while the truncate system
> call is in progress, it is possible for the log to contain garbage data when
> the system restarts from the crash. Note that if the crash occurs just after
> the truncate system call completes, then there is no problem. Basically, the
> truncate needs to be atomically persisted for ZooKeeper to recover from
> crashes correctly or (more realistically) the recovery code needs to deal
> with the case of expecting garbage in a newly created log.
> As mentioned, if a crash occurs during the truncate system call, then
> ZooKeeper will fail to start with the following exception. Here is the stack
> trace:
> java.io.IOException: Unreasonable length = -295704495
> at
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
> at
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
> at
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
> at
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting
> abnormally
> java.lang.RuntimeException: Unable to run quorum server
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> Caused by: java.io.IOException: Unreasonable length = -295704495
> at
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
> at
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
> at
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
> at
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
> at
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
> ... 4 more
> Next, it is possible for two nodes of a 3-node ZooKeeper cluster to reach
> the same state. In that case, they both will fail to startup, rendering the
> entire cluster unavailable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)