On Wed, Jul 28, 2021 at 5:20 AM Damien Diederen <ddiede...@apache.org>
wrote:

>
> Hi Li, all,
>
> > When load testing the write operation against Zookeeper 3.7.0, I observed a
> > couple of times that the server crashed because the txn log was too large
> > and it was not able to load it.
>
> Difficult to say without more details, but I suspect ZOOKEEPER-4306 to
> be the culprit:
>
>     https://issues.apache.org/jira/browse/ZOOKEEPER-4306


Yes, ZOOKEEPER-4306 could be the culprit. In my write operation test, all
the nodes were created as ephemeral.

>
> Would it be possible for you to share the transaction log ZooKeeper
> fails to load?
>

I observed the issue a couple of times about a month ago. I recently tried
to investigate it further, but was not able to reproduce it. I remember
reading the txn log file with zkTxnLogToolkit.sh, and its contents were
similar to what is described in
https://issues.apache.org/jira/browse/ZOOKEEPER-4306.
Unfortunately, I didn't save the txn log.

I will save the txn log if I can reproduce it or it happens again.
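
For context, the "Unreasonable length = 3175014" error in the stack trace
below is raised by a length check on each length-prefixed record as it is
read back. The following is a minimal sketch of that kind of check, not
ZooKeeper's actual code: the class and method names are hypothetical, and
the real check may allow some extra headroom on top of jute.maxbuffer. The
only value taken from the documentation is the default of 0xfffff bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical illustration of a length-prefixed read guarded by a size limit.
public class LengthCheckSketch {
    // Documented default for jute.maxbuffer: 0xfffff bytes (just under 1M).
    static final int DEFAULT_MAX_BUFFER = 0xfffff;

    // Assumption: a plain upper bound; the real check may add extra headroom.
    static boolean isReasonable(int len, int maxBuffer) {
        return len >= 0 && len <= maxBuffer;
    }

    // Reads a 4-byte length prefix, validates it, then reads that many bytes.
    static byte[] readBuffer(DataInputStream in, int maxBuffer) throws IOException {
        int len = in.readInt();
        if (!isReasonable(len, maxBuffer)) {
            throw new IOException("Unreasonable length = " + len);
        }
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }

    public static void main(String[] args) {
        // Honors -Djute.maxbuffer=... if set, otherwise uses the default.
        int maxBuffer = Integer.getInteger("jute.maxbuffer", DEFAULT_MAX_BUFFER);

        // A record header claiming the length reported in the stack trace.
        byte[] header = ByteBuffer.allocate(4).putInt(3175014).array();
        try {
            readBuffer(new DataInputStream(new ByteArrayInputStream(header)), maxBuffer);
            System.out.println("record accepted");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this sketch, a 3175014-byte length is rejected at the default limit
while a 4-byte payload passes, which is consistent with the oversized record
being something other than a single client write.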

Thanks,

Li


> HTH, -D
>
>
>
> --8<---------------original message------------->8---
>
> Li Wang <li4w...@gmail.com> writes:
> > Hi,
> >
> >
> >
> > When load testing the write operation against Zookeeper 3.7.0, I observed a
> > couple of times that the server crashed because the txn log was too large
> > and it was not able to load it. However, each write in the load test
> > carries only 4 bytes of data, and *jute.maxbuffer* was left at its default
> > (i.e. 1M). The error doesn't always happen.
> >
> >
> > I wonder if anyone has also seen this error or has any idea on what may
> > cause the issue?
> >
> >
> > StackTrace
> > ==========
> >
> > 2021-07-01 16:02:00,837 [myid:3] - ERROR [main:QuorumPeerMain@114] - Unexpected exception, exiting abnormally
> > java.lang.RuntimeException: Unable to run quorum server
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1200)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1131)
> >     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229)
> >     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137)
> >     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91)
> > Caused by: java.io.IOException: Unreasonable length = 3175014
> >     at org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:166)
> >     at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:127)
> >     at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:159)
> >     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:749)
> >     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:361)
> >     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:267)
> >     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:312)
> >     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:287)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1145)
> >
> >
> > Thanks,
> >
> >
> > Li
>
