[ https://issues.apache.org/jira/browse/ZOOKEEPER-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283481#comment-14283481 ]
Hadoop QA commented on ZOOKEEPER-2101:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12693209/ZOOKEEPER-2101-v2.diff
  against trunk revision 1646992.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2483//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2483//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2483//console

This message is automatically generated.

> Transaction larger than max buffer of jute makes zookeeper unavailable
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2101
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2101
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: jute
>    Affects Versions: 3.4.4
>            Reporter: Liu Shaohui
>         Attachments: ZOOKEEPER-2101-v1.diff, ZOOKEEPER-2101-v2.diff, test.diff
>
>
> *Problem*
> For multi operation, PrepRequestProcessor may produce a large transaction
> whose size may be larger than the max buffer size of jute.
> There is a check of the buffer size in the readBuffer method of
> BinaryInputArchive, but no check in the writeBuffer method of
> BinaryOutputArchive, which causes the following:
> 1. The leader can sync the transaction to the txn log and send the large
> transaction to the followers, but the followers fail to read the transaction
> and can't sync with the leader.
> {code}
> 2015-01-04,12:42:26,474 WARN org.apache.zookeeper.server.quorum.Learner: [myid:2] Exception when following the leader
> java.io.IOException: Unreasonable length = 2054758
>         at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> 2015-01-04,12:42:26,475 INFO org.apache.zookeeper.server.quorum.Learner: [myid:2] shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> {code}
> 2. The leader loses all followers, which triggers a leader election. The old
> leader becomes leader again because it has the most up-to-date data.
> {code}
> 2015-01-04,12:42:28,502 INFO org.apache.zookeeper.server.quorum.Leader: [myid:3] Shutting down
> 2015-01-04,12:42:28,502 INFO org.apache.zookeeper.server.quorum.Leader: [myid:3] Shutdown called
> java.lang.Exception: shutdown Leader! reason: Only 1 followers, need 2
>         at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:496)
>         at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:471)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:753)
> {code}
> 3. The leader cannot load the transaction from the txn log because the length
> of the data is larger than the max buffer of jute.
> {code}
> 2015-01-04,12:42:31,282 ERROR org.apache.zookeeper.server.quorum.QuorumPeer: [myid:3] Unable to load database on disk
> java.io.IOException: Unreasonable length = 2054758
>         at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
>         at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
>         at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:602)
>         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:157)
>         at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
>         at org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
>         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
> {code}
> The zookeeper service stays unavailable until we enlarge jute.maxbuffer and
> restart the zookeeper hbase cluster.
> *Solution*
> Add a buffer size check in BinaryOutputArchive to prevent large transactions
> from being written to the log and sent to followers.
> But I am not sure whether there are side effects of throwing an IOException
> in BinaryOutputArchive and RequestProcessors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
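
The write-side check proposed under *Solution* can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: BoundedBufferWriter is a hypothetical stand-in for BinaryOutputArchive, the length-prefixed layout is a simplification, and the 0xfffff fallback for jute.maxbuffer is an assumption about the default limit.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/**
 * Hypothetical stand-in for a jute-style output archive.
 * It is NOT the ZooKeeper class and NOT the ZOOKEEPER-2101 patch.
 */
class BoundedBufferWriter {
    // Assumed fallback; the effective limit is whatever jute.maxbuffer is set to.
    static final int DEFAULT_MAX_BUFFER = Integer.getInteger("jute.maxbuffer", 0xfffff);

    private final DataOutputStream out;
    private final int maxBuffer;

    BoundedBufferWriter(DataOutputStream out, int maxBuffer) {
        this.out = out;
        this.maxBuffer = maxBuffer;
    }

    /** Writes a 4-byte length prefix followed by the bytes, or rejects the buffer. */
    void writeBuffer(byte[] buf) throws IOException {
        if (buf.length > maxBuffer) {
            // Fail before the length prefix is written, so nothing oversized
            // reaches the log or the wire and the stream is not left half-written.
            throw new IOException("Unreasonable length = " + buf.length);
        }
        out.writeInt(buf.length);
        out.write(buf);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // A tiny limit of 8 bytes keeps the demonstration short.
        BoundedBufferWriter writer = new BoundedBufferWriter(new DataOutputStream(sink), 8);

        writer.writeBuffer(new byte[] {1, 2, 3});
        System.out.println("bytes written: " + sink.size()); // 4-byte prefix + 3 bytes = 7

        try {
            writer.writeBuffer(new byte[9]); // one byte over the limit
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("bytes written: " + sink.size()); // still 7
    }
}
```

Rejecting the buffer on the write side mirrors the limit that readBuffer already enforces, so a reader with the same limit would never see a record it has to refuse. Callers would still need to handle the IOException, which is the open question about side effects raised in the description.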