[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/402 I think there is a better way to do this. Picking a new port for each test seems to be a workaround for failing to clean up the resources utilized by a failing test. It would be better if we cleaned up properly after a test execution. Since this is the method we used to fix the test in another branch I think it is up to the discretion of the committers, but it would be nice to not leak any resources from our tests. ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/402 @anmolnar my understanding was that tests for branch-3.4 do not run in parallel. Let me know if I'm incorrect. ---
[GitHub] zookeeper issue #403: Zookeeper 2923
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/403 The change looks good to me, although it would be nice if you could clean up the commit log associated with the PR. ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/402 @anmolnar You are correct that the command is executed with argument that would allow the tests to run in parallel but if you take a look at the build.xml on branch-3.4 it looks like the code to use "test.junit.threads" is not there. I believe the JIRA to add this was https://issues.apache.org/jira/browse/ZOOKEEPER-2183 which shows the change only touched 3.5 and master. ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/402 @anmolnar It would be great to have a complete understanding of the root problem. Can you try changing the patch to cleanup after each test to see what happens? ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/409#discussion_r147500264 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogNoServerTest.java --- @@ -0,0 +1,175 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.test; + +import org.apache.jute.BinaryInputArchive; +import org.apache.jute.BinaryOutputArchive; +import org.apache.jute.Record; +import org.apache.zookeeper.ZKTestCase; +import org.apache.zookeeper.ZooDefs.OpCode; +import org.apache.zookeeper.common.Time; +import org.apache.zookeeper.server.DataNode; +import org.apache.zookeeper.server.DataTree; +import org.apache.zookeeper.server.persistence.FileHeader; +import org.apache.zookeeper.server.persistence.FileTxnLog; +import org.apache.zookeeper.server.persistence.FileTxnSnapLog; +import org.apache.zookeeper.txn.*; +import org.junit.Assert; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayOutputStream; +import java.io.File; +import java.io.FileInputStream; +import java.nio.ByteBuffer; +import java.util.ArrayList; +import java.util.List; + +/** + * These tests has been extracted from LoadFromTest, because they don't need test ZK server to run. --- End diff -- "have been" ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/409#discussion_r147500211 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogNoServerTest.java --- @@ -0,0 +1,175 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.test; + +import org.apache.jute.BinaryInputArchive; +import org.apache.jute.BinaryOutputArchive; +import org.apache.jute.Record; +import org.apache.zookeeper.ZKTestCase; +import org.apache.zookeeper.ZooDefs.OpCode; +import org.apache.zookeeper.common.Time; +import org.apache.zookeeper.server.DataNode; +import org.apache.zookeeper.server.DataTree; +import org.apache.zookeeper.server.persistence.FileHeader; +import org.apache.zookeeper.server.persistence.FileTxnLog; +import org.apache.zookeeper.server.persistence.FileTxnSnapLog; +import org.apache.zookeeper.txn.*; --- End diff -- We generally avoid * imports ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/409#discussion_r147499687 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -70,24 +52,37 @@ private static final int TOTAL_TRANSACTIONS = NUM_MESSAGES + TRANSACTION_OVERHEAD; private volatile boolean connected; -/** - * test that all transactions from the Log are loaded, and only once - * @throws Exception an exception might be thrown here - */ -@Test -public void testLoad() throws Exception { +private ZooKeeper zk; +private ServerCnxnFactory f; --- End diff -- i know you didn't pick this variable name but I think it wouldn't be the worst thing to slip a change to this under the radar. ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/409#discussion_r147499282 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -18,49 +18,31 @@ package org.apache.zookeeper.test; -import java.io.ByteArrayOutputStream; -import java.io.File; -import java.io.FileInputStream; -import java.io.IOException; -import java.nio.ByteBuffer; -import java.util.ArrayList; -import java.util.List; - -import org.apache.zookeeper.common.Time; -import org.apache.jute.BinaryInputArchive; -import org.apache.jute.BinaryOutputArchive; -import org.apache.jute.Record; -import org.apache.zookeeper.CreateMode; +import org.apache.zookeeper.*; --- End diff -- We generally avoid * imports ---
[GitHub] zookeeper issue #410: [ZOOKEEPER-2338] - set SOCK_CLOEXEC on socket if defin...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/410 Hi @fr0stbyte I see in the JIRA that this ticked arises from an issue related to running ZooKeeper Mesos. Would you be able to explain exactly why this is needed? ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147818798 --- Diff: src/java/main/org/apache/jute/BinaryOutputArchive.java --- @@ -115,6 +115,11 @@ public void writeBuffer(byte barr[], String tag) out.writeInt(-1); return; } + if (barr.length >= BinaryInputArchive.maxBuffer) { --- End diff -- maybe we could also use UNREASONABLE_LENGTH from BinaryInputArchive? ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147811470 --- Diff: src/java/main/org/apache/jute/BinaryOutputArchive.java --- @@ -115,6 +115,11 @@ public void writeBuffer(byte barr[], String tag) out.writeInt(-1); return; } + if (barr.length >= BinaryInputArchive.maxBuffer) { --- End diff -- I'm not sure how valid this check is. The output buffer could contain other data that makes it larger than the maxBuffer even if barr is small right? Wouldn't it make more sense to track how much data has been written to "out"? ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147798914 --- Diff: src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java --- @@ -905,10 +907,22 @@ protected void pRequest(Request request) throws RequestProcessorException { request.setTxn(new ErrorTxn(Code.MARSHALLINGERROR.intValue())); } } +checkProposalSize(request); request.zxid = zks.getZxid(); nextProcessor.processRequest(request); } +private void checkProposalSize(Request request) { +if (request.getHdr() == null) return; +byte[] data = SerializeUtils.serializeRequest(request); --- End diff -- I'm concerned about possible performance implications here. Are we serializing this request an extra time here? ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147816281 --- Diff: src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java --- @@ -905,10 +907,22 @@ protected void pRequest(Request request) throws RequestProcessorException { request.setTxn(new ErrorTxn(Code.MARSHALLINGERROR.intValue())); } } +checkProposalSize(request); request.zxid = zks.getZxid(); nextProcessor.processRequest(request); } +private void checkProposalSize(Request request) { +if (request.getHdr() == null) return; +byte[] data = SerializeUtils.serializeRequest(request); +if (data.length > BinaryInputArchive.maxBuffer) { +LOG.error("Len error {}, larger than max buffer: {} set by jute.maxbuffer", + data.length, BinaryInputArchive.maxBuffer); --- End diff -- maybe there is a more appropriate place for maxBuffer since it applies to input and output archive ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147803340 --- Diff: src/java/main/org/apache/zookeeper/server/util/SerializeUtils.java --- @@ -148,4 +152,20 @@ public static void serializeSnapshot(DataTree dt,OutputArchive oa, dt.serialize(oa, "tree"); } +public static byte[] serializeRequest(Request request) { +if (request == null || request.getHdr() == null) return null; --- End diff -- are there valid situations where this could return null? ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/412#discussion_r147815941 --- Diff: src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java --- @@ -905,10 +907,22 @@ protected void pRequest(Request request) throws RequestProcessorException { request.setTxn(new ErrorTxn(Code.MARSHALLINGERROR.intValue())); } } +checkProposalSize(request); request.zxid = zks.getZxid(); nextProcessor.processRequest(request); } +private void checkProposalSize(Request request) { +if (request.getHdr() == null) return; +byte[] data = SerializeUtils.serializeRequest(request); +if (data.length > BinaryInputArchive.maxBuffer) { --- End diff -- why do we need this if we are also checking how much data has been written when writing to the binary output archive? ---
[GitHub] zookeeper pull request #411: ZOOKEEPER-2684 Fix a crashing bug in the mixed ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/411#discussion_r147854977 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/CommitProcessorConcurrencyTest.java --- @@ -376,4 +377,123 @@ public void noStarvationOfReadRequestsTest() throws Exception { !processedRequests.contains(r)); } } + +/** + * In the following test, we verify that we can handle the case that we got a commit + * of a request we never seen since the session that we just established. This can happen + * when a session is just established and there is request waiting to be committed in the + * in the session queue but it sees a commit for a request that belongs to the previous connection. --- End diff -- typo: "in the in the" ---
[GitHub] zookeeper pull request #411: ZOOKEEPER-2684 Fix a crashing bug in the mixed ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/411#discussion_r147883484 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -255,24 +255,23 @@ public void run() { // If session queue != null, then it is also not empty. Request topPending = sessionQueue.poll(); if (request.cxid != topPending.cxid) { -LOG.error( -"Got cxid 0x" -+ Long.toHexString(request.cxid) -+ " expected 0x" + Long.toHexString( -topPending.cxid) -+ " for client session id " -+ Long.toHexString(request.sessionId)); -throw new IOException("Error: unexpected cxid for" -+ "client session"); +// we can get commit requests that are not at the queue head after +// a session moved (see ZOOKEEPER-2684). We will just pass the +// commit to the next processor and put the pending back with +// a warning, we should not see this often under normal load +LOG.warn("Got request " + request + +" but we are expecting request " + topPending); +sessionQueue.addFirst(topPending); +} else { +/* + * We want to send our version of the request. the + * pointer to the connection in the request + */ +topPending.setHdr(request.getHdr()); --- End diff -- Would you mind explaining why we normally want to send our version of the request and why it is ok not to in this case? ---
[GitHub] zookeeper pull request #411: ZOOKEEPER-2684 Fix a crashing bug in the mixed ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/411#discussion_r148115810 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -246,33 +246,51 @@ public void run() { } /* - * Check if request is pending, if so, update it with the - * committed info + * Check if request is pending, if so, update it with the committed info */ LinkedList sessionQueue = pendingRequests .get(request.sessionId); if (sessionQueue != null) { // If session queue != null, then it is also not empty. Request topPending = sessionQueue.poll(); if (request.cxid != topPending.cxid) { -LOG.error( -"Got cxid 0x" -+ Long.toHexString(request.cxid) -+ " expected 0x" + Long.toHexString( -topPending.cxid) -+ " for client session id " -+ Long.toHexString(request.sessionId)); -throw new IOException("Error: unexpected cxid for" -+ "client session"); +// TL;DR - we should not encounter this scenario often under normal load. +// We pass the commit to the next processor and put the pending back with a warning. + +// Generally, we can get commit requests that are not at the queue head after +// a session moved (see ZOOKEEPER-2684). Let's denote the previous server of the session +// with A, and the server that the session moved to with B (keep in mind that it is +// possible that the session already moved from B to a new server C, and maybe C=A). +// 1. If request.cxid < topPending.cxid : this means that the session requested this update +// from A, then moved to B (i.e., which is us), and now B receives the commit +// for the update after the session already performed several operations in B +// (and therefore its cxid is higher than that old request). +// 2. If request.cxid > topPending.cxid : this means that the session requested an updated +// from B with cxid that is bigger than the one we know therefore in this case we +// are A, and we lost the connection to the session. Given that we are waiting for a commit +// for that update, it means that we already sent the request to the leader and it will +// be committed at some point (in this case the order of cxid won't follow zxid, since zxid +// is an increasing order). It is not safe for us to delete the session's queue at this +// point, since it is possible that the session has newer requests in it after it moved +// back to us. We just leave the queue as it is, and once the commit arrives (for the old +// request), the finalRequestProcessor will see a closed cnxn handle, and just won't send a +// response. +// Also note that we don't have a local session, therefore we treat the request +// like any other commit for a remote request, i.e., we perform the update without sending +// a response. + +LOG.warn("Got request " + request + +" but we are expecting request " + topPending); +sessionQueue.addFirst(topPending); +} else { +// We want to send to the next processor our version of the request, +// since it contains the session information that is needed +// for post update processing (e.g., using request.cnxn we send a response to the client). +topPending.setHdr(request.getHdr()); --- End d
[GitHub] zookeeper pull request #411: ZOOKEEPER-2684 Fix a crashing bug in the mixed ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/411#discussion_r148113899 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -246,33 +246,51 @@ public void run() { } /* - * Check if request is pending, if so, update it with the - * committed info + * Check if request is pending, if so, update it with the committed info */ LinkedList sessionQueue = pendingRequests .get(request.sessionId); if (sessionQueue != null) { // If session queue != null, then it is also not empty. Request topPending = sessionQueue.poll(); if (request.cxid != topPending.cxid) { -LOG.error( -"Got cxid 0x" -+ Long.toHexString(request.cxid) -+ " expected 0x" + Long.toHexString( -topPending.cxid) -+ " for client session id " -+ Long.toHexString(request.sessionId)); -throw new IOException("Error: unexpected cxid for" -+ "client session"); +// TL;DR - we should not encounter this scenario often under normal load. --- End diff -- I hate to be "that guy" but we generally use "/*" comments for long blocks like this. ---
[GitHub] zookeeper issue #410: [ZOOKEEPER-2338] - set SOCK_CLOEXEC on socket if defin...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/410 @fr0stbyte Thanks for linking to the Mesos bug. Unfortunately, I am not familiar with the way Mesos handles processes and I think it is possible that other members of the community share my ignorance. Would you mind describing the conditions that are necessary to reproduce this bug? ---
[GitHub] zookeeper pull request #414: ZOOKEEPER-2932 Performance enhancement about pu...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/414#discussion_r149206744 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java --- @@ -255,30 +255,19 @@ private void padFile(FileOutputStream out) throws IOException { */ public static File[] getLogFiles(File[] logDirList,long snapshotZxid) { List files = Util.sortDataDir(logDirList, "log", true); --- End diff -- couldn't we make the last param false and then we can use a simpler `for (File f : files)` loop? ---
[GitHub] zookeeper issue #414: ZOOKEEPER-2932 Performance enhancement about purging t...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/414 I got curious and took a look, it seems like this is related to `sortDataDir` and the way it handles files that have a different "prefix" than what is passed in. ``` * @param prefix files not matching this prefix are assumed to have a * version = -1) ``` Not sure what the best way to handle this would be. I think throwing in a check while iterating getLogFiles may be the simplest way to go. Maybe you could think of something cleaner. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149830970 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderStatsTest.java --- @@ -0,0 +1,39 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.junit.Test; + +import static org.junit.Assert.assertEquals; + +public class LeaderStatsTest { +@Test +public void testSetProposalSize_SetMinMax() { +LeaderStats stats = new LeaderStats(); --- End diff -- Maybe we should check what the min and max are before a proposal is received to make sure we set that behavior in stone. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149832062 --- Diff: src/java/test/org/apache/zookeeper/server/util/SerializeUtilsTest.java --- @@ -0,0 +1,73 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.util; + +import org.apache.jute.OutputArchive; +import org.apache.jute.Record; +import org.apache.zookeeper.server.Request; +import org.apache.zookeeper.txn.TxnHeader; +import org.junit.Test; +import org.mockito.InOrder; + +import java.io.IOException; + +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertNull; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.inOrder; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.verify; + +public class SerializeUtilsTest { + +@Test +public void testSerializeRequest_RequestIsNull() { +byte[] data = SerializeUtils.serializeRequest(null); +assertNull(data); +} + +@Test +public void testSerializeRequest_RequestHeaderIsNull() { +Request request = new Request(0, 0, 0, null, null, 0); +byte[] data = SerializeUtils.serializeRequest(request); +assertNull(data); +} + +@Test +public void testSerializeRequest_WithoutTxn() throws IOException { +TxnHeader header = mock(TxnHeader.class); +Request request = new Request(1, 2, 3, header, null, 4); +byte[] data = SerializeUtils.serializeRequest(request); +assertNotNull(data); +verify(header).serialize(any(OutputArchive.class), eq("hdr")); +} + +@Test +public void testSerializeRequest_WithTxn() throws IOException { +Record txn = mock(Record.class); +TxnHeader header = mock(TxnHeader.class); +Request request = new Request(1, 2, 3, header, txn, 4); +byte[] data = SerializeUtils.serializeRequest(request); +assertNotNull(data); +InOrder inOrder = inOrder(header, txn); --- End diff -- I'm concerned that we do not actually check that the result of serializing the header and transaction ever make it into `data`. Would it be possible to write this test to mock out the serialization of the `header` and `txn` then make sure we get the `data` we expect? ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149830882 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderStatsTest.java --- @@ -0,0 +1,39 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.junit.Test; + +import static org.junit.Assert.assertEquals; + +public class LeaderStatsTest { +@Test +public void testSetProposalSize_SetMinMax() { --- End diff -- nit: the underscore in the test name is a little unconventional I think ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149833628 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/LeaderStats.java --- @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +/** + * Provides live statistics about a running Leader. + */ +class LeaderStats { +/** + * Size of the last generated proposal. This should fit into server's jute.maxbuffer setting. + */ +private int lastProposalSize = 0; + +/** + * Size of the smallest proposal which has been generated since the server was started. + */ +private int minProposalSize = -1; + +/** + * Size of the largest proposal which has been generated since the server was started. + */ +private int maxProposalSize = 0; --- End diff -- nit: can we set all of these to -1 for consistency ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149832324 --- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java --- @@ -167,4 +167,9 @@ public String getSecureClientAddress() { public long getTxnLogElapsedSyncTime() { return zks.getTxnLogElapsedSyncTime(); } + +@Override +public int getJuteMaxBufferSize() { +return Integer.getInteger("jute.maxbuffer", 0xf); --- End diff -- I think the magic number 0xf is defined elsewhere in the code, it would be great to only have it in one place. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149832748 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/LeaderStats.java --- @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +/** + * Provides live statistics about a running Leader. + */ +class LeaderStats { --- End diff -- nit: Maybe we can call this `ProposalStats` at least it until it includes something not related to proposals ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149829950 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Leader.java --- @@ -984,8 +990,6 @@ public void commitAndActivate(long zxid, long designatedLeader) { /** * Create an inform packet and send it to all observers. - * @param zxid - * @param proposal --- End diff -- Not that it adds much value, but why does this line need to be removed? ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149829634 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Leader.java --- @@ -34,12 +33,12 @@ import java.util.List; import java.util.Map; import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; --- End diff -- unused import ---
[GitHub] zookeeper pull request #417: ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/417 ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 implementation from branch-3.5 to trunk You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2935 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/417.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #417 commit f211a512566b43375d76443392157b231e6ba9a1 Author: Abraham Fine Date: 2017-11-06T22:48:36Z init ---
[GitHub] zookeeper pull request #419: ZOOKEEPER-2849: Exponential back-off retry for ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/419#discussion_r151532589 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java --- @@ -946,7 +946,8 @@ protected Election createElectionAlgorithm(int electionAlgorithm){ le = new AuthFastLeaderElection(this, true); break; case 3: -qcm = new QuorumCnxManager(this); +qcm = new QuorumCnxManager(this, +ExponentialBackoffStrategy.builder().build()); --- End diff -- Perhaps we should find a way to make this pluggable? Perhaps we can read which backoff strategy to use from a java system property? ---
[GitHub] zookeeper pull request #419: ZOOKEEPER-2849: Exponential back-off retry for ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/419#discussion_r151560463 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java --- @@ -946,7 +946,8 @@ protected Election createElectionAlgorithm(int electionAlgorithm){ le = new AuthFastLeaderElection(this, true); break; case 3: -qcm = new QuorumCnxManager(this); +qcm = new QuorumCnxManager(this, +ExponentialBackoffStrategy.builder().build()); --- End diff -- For instance, it may be nice to have the default strategy perform the existing behavior. ---
[GitHub] zookeeper pull request #419: ZOOKEEPER-2849: Exponential back-off retry for ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/419#discussion_r151560170 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/ExponentialBackoffStrategyTest.java --- @@ -0,0 +1,180 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.zookeeper.server.quorum; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.assertTrue; + +import org.junit.Test; + +/** + * Unit tests for {@link ExponentialBackoffStrategy}. + */ +public class ExponentialBackoffStrategyTest { --- End diff -- This should extend ZKTestCase ---
[GitHub] zookeeper pull request #419: ZOOKEEPER-2849: Exponential back-off retry for ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/419#discussion_r151553425 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/ExponentialBackoffStrategy.java --- @@ -0,0 +1,192 @@ +package org.apache.zookeeper.server.quorum; + +/** + * A {@link BackoffStrategy} that increases the wait time between each + * interval up to the configured maximum wait time. + */ +public class ExponentialBackoffStrategy implements BackoffStrategy { + +// Sensible default values to use if not set by the user +private static final long DEFAULT_INITIAL_BACKOFF_MILLIS = 500L; // 0.5s +private static final long DEFAULT_MAX_BACKOFF_MILLIS = 30_000L; // 30s +private static final long DEFAULT_MAX_ELAPSED_MILLIS = 5 * 60_000L; // 10m +private static final double DEFAULT_BACKOFF_MULTIPLE = 1.5; + +// internal values per instance +private final long initialBackoffMillis; +private final long maxBackoffMillis; +private final long maxElapsedMillis; +private final double backoffMultiple; + +// internal state +private long nextWait; +private long totalElapsed; +private final boolean limitBackoffMillis; +private final boolean checkElapsedTime; + +/** + * Construct a new instance. + * @param builder the Builder to use for configuring this BackoffStrategy + */ +private ExponentialBackoffStrategy(Builder builder) { +this.initialBackoffMillis = builder.initialBackoffMillis; +this.maxBackoffMillis = builder.maxBackoffMillis; +this.maxElapsedMillis = builder.maxElapsedMillis; +this.backoffMultiple = builder.backoffMultiple; + +if(maxBackoffMillis == -1) { +limitBackoffMillis = false; +} else { +limitBackoffMillis = true; +} + +if(maxElapsedMillis == -1) { +checkElapsedTime = false; +} else { +checkElapsedTime = true; +} + +reset(); +} + + +@Override +public long nextWaitMillis() throws IllegalStateException { +// check if we have exceeded the allowed maximum elapsed time +if(checkElapsedTime && totalElapsed > maxElapsedMillis) { +return BackoffStrategy.STOP; +} + +long waitMillis = nextWait; + +// calculate the next wait milliseconds +nextWait = Math.round(nextWait * backoffMultiple); + +// don't exceed the allowed maximum wait milliseconds +// if a maximum was configured +if(limitBackoffMillis && nextWait > maxBackoffMillis) { +nextWait = maxBackoffMillis; +} + +// track total elapsed time, even if we don't wait we have to assume +// that some amount of time passed outside of the wait or we'll never +// hit the elapsed time limit +totalElapsed += waitMillis != 0 ? waitMillis : 1L; +return waitMillis; +} + +@Override +public void reset() { +nextWait = this.initialBackoffMillis; +totalElapsed = 0; +} + +/** + * + * @return a new {@link Builder} instance. + */ +public static Builder builder() { +return new Builder(); +} + +/** + * Builder for instances of {@link ExponentialBackoffStrategy}. + */ +public static final class Builder { --- End diff -- I'm not sure a builder is the best way to handle this. Since I think it would be nice to have much of this be user configurable, perhaps we can just pull the config from system properties? ---
[GitHub] zookeeper pull request #418: [zookeeper-2937] disallow client requests witho...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/418#discussion_r151568061 --- Diff: src/java/main/org/apache/zookeeper/server/auth/ProviderRegistry.java --- @@ -87,4 +113,10 @@ public static String listProviders() { } return sb.toString(); } + +public static SortedSet getSchemesRequiringAuth() { --- End diff -- what if the user decided which schemes were needed rather than letting each individual scheme choose if it is essential? ---
[GitHub] zookeeper pull request #418: [zookeeper-2937] disallow client requests witho...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/418#discussion_r151563002 --- Diff: src/java/main/org/apache/zookeeper/server/ServerCnxn.java --- @@ -85,6 +86,14 @@ public boolean removeAuthInfo(Id id) { return authInfo.remove(id); } +public boolean isAuthCheckComplete() { +return authCheckCompleted; +}; --- End diff -- nit: please remove the semicolon after the close bracket ---
[GitHub] zookeeper issue #273: ZOOKEEPER-2795 Change log level for "ZKShutdownHandler...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/273 @phunt I'm trying to understand the problem you are describing. Please let me know what I am missing. Since `QuorumZooKeeperServer` reimplements `setState` without `ZooKeeperServerShutdownHandler` I think this is only an issue when dealing with standalone ZooKeeper servers. Standalone servers are only instantiated (outside of tests) in `ZooKeeperServerMain#runFromConfig` (which also does the work of setting the `ZooKeeperServerShutdownHandler`). In other words, I think we are dealing with an unfortunate usage of inheritance. It would be ideal to add this `ZooKeeperServerShutdownHandler` in a subclass of `ZooKeeperServer` used only for standalone purposes but it looks like we did the opposite, putting in the `ZooKeeperServerShutdownHandler` in the default case and reimplementing `setState` for the purpose of ignoring the `ZooKeeperServerShutdownHandler` it in `QuorumZooKeeperServer`. When the `ZooKeeperServerShutdownHandler` is needed it is always set (in production code). So I guess my question is, what problem would an operator be debugging with this log message? ---
[GitHub] zookeeper pull request #419: ZOOKEEPER-2849: Exponential back-off retry for ...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/419#discussion_r151812103 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/ExponentialBackoffStrategy.java --- @@ -0,0 +1,192 @@ +package org.apache.zookeeper.server.quorum; + +/** + * A {@link BackoffStrategy} that increases the wait time between each + * interval up to the configured maximum wait time. + */ +public class ExponentialBackoffStrategy implements BackoffStrategy { + +// Sensible default values to use if not set by the user +private static final long DEFAULT_INITIAL_BACKOFF_MILLIS = 500L; // 0.5s +private static final long DEFAULT_MAX_BACKOFF_MILLIS = 30_000L; // 30s +private static final long DEFAULT_MAX_ELAPSED_MILLIS = 5 * 60_000L; // 10m +private static final double DEFAULT_BACKOFF_MULTIPLE = 1.5; + +// internal values per instance +private final long initialBackoffMillis; +private final long maxBackoffMillis; +private final long maxElapsedMillis; +private final double backoffMultiple; + +// internal state +private long nextWait; +private long totalElapsed; +private final boolean limitBackoffMillis; +private final boolean checkElapsedTime; + +/** + * Construct a new instance. + * @param builder the Builder to use for configuring this BackoffStrategy + */ +private ExponentialBackoffStrategy(Builder builder) { +this.initialBackoffMillis = builder.initialBackoffMillis; +this.maxBackoffMillis = builder.maxBackoffMillis; +this.maxElapsedMillis = builder.maxElapsedMillis; +this.backoffMultiple = builder.backoffMultiple; + +if(maxBackoffMillis == -1) { +limitBackoffMillis = false; +} else { +limitBackoffMillis = true; +} + +if(maxElapsedMillis == -1) { +checkElapsedTime = false; +} else { +checkElapsedTime = true; +} + +reset(); +} + + +@Override +public long nextWaitMillis() throws IllegalStateException { +// check if we have exceeded the allowed maximum elapsed time +if(checkElapsedTime && totalElapsed > maxElapsedMillis) { +return BackoffStrategy.STOP; +} + +long waitMillis = nextWait; + +// calculate the next wait milliseconds +nextWait = Math.round(nextWait * backoffMultiple); + +// don't exceed the allowed maximum wait milliseconds +// if a maximum was configured +if(limitBackoffMillis && nextWait > maxBackoffMillis) { +nextWait = maxBackoffMillis; +} + +// track total elapsed time, even if we don't wait we have to assume +// that some amount of time passed outside of the wait or we'll never +// hit the elapsed time limit +totalElapsed += waitMillis != 0 ? waitMillis : 1L; +return waitMillis; +} + +@Override +public void reset() { +nextWait = this.initialBackoffMillis; +totalElapsed = 0; +} + +/** + * + * @return a new {@link Builder} instance. + */ +public static Builder builder() { +return new Builder(); +} + +/** + * Builder for instances of {@link ExponentialBackoffStrategy}. + */ +public static final class Builder { --- End diff -- So I probably wasn't clear here. It would be great if it was also possible, since backoff strategy is such a nicely defined interface, if the user could include jar with their own backoff strategy in the classpath and tell zookeeper to use that. What do you think? ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2939: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r152126049 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/StatCommandTest.java --- @@ -0,0 +1,106 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.zookeeper.server.ServerCnxn; +import org.apache.zookeeper.server.ServerCnxnFactory; +import org.apache.zookeeper.server.ServerStats; +import org.apache.zookeeper.server.ZKDatabase; +import org.apache.zookeeper.server.command.FourLetterCommands; +import org.apache.zookeeper.server.command.StatCommand; +import org.junit.Before; +import org.junit.Test; + +import java.io.IOException; +import java.io.PrintWriter; +import java.io.StringWriter; +import java.util.ArrayList; +import java.util.List; + +import static org.hamcrest.core.StringContains.containsString; --- End diff -- According to: https://github.com/junit-team/junit4/wiki/matchers-and-assertthat we should probably use `import static org.hamcrest.CoreMatchers.containsString;` In addition, this will not resolve in intellij until we allow transitive test dependencies to be imported by changing build.xml:1820 ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2939: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r152110769 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/StatResetCommandTest.java --- @@ -0,0 +1,112 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.zookeeper.server.ServerCnxn; +import org.apache.zookeeper.server.ServerStats; +import org.apache.zookeeper.server.ZooKeeperServer; +import org.apache.zookeeper.server.command.StatResetCommand; +import org.junit.Before; +import org.junit.Test; + +import java.io.PrintWriter; +import java.io.StringWriter; + +import static org.junit.Assert.assertEquals; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +public class StatResetCommandTest { +private static final String ZK_NOT_SERVING = "This ZooKeeper instance is not currently serving requests\n"; --- End diff -- nit: maybe we should make the version of this in AbstractFourLetterCommand visible to this class for the purpose of testing. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2939: Added last/min/max proposal siz...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r152349015 --- Diff: build.xml --- @@ -1817,7 +1817,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> description="Create eclipse project files"> - + --- End diff -- nit: we can just remove the transitive property to be consistent ---
[GitHub] zookeeper issue #417: ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 i...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/417 @phunt fixed! ---
[GitHub] zookeeper issue #421: ZOOKEEPER-2944: Specify correct overflow value
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/421 I created ZOOKEEPER-2947 to track additional work in this area. ---
[GitHub] zookeeper pull request #280: ZOOKEEPER-2806: Flaky test: org.apache.zookeepe...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/280 ---
[GitHub] zookeeper pull request #281: ZOOKEEPER-2806: Flaky test: org.apache.zookeepe...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/281 ---
[GitHub] zookeeper pull request #422: [DO NOT COMMIT] trying to fix c unit tests
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/422 [DO NOT COMMIT] trying to fix c unit tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2948 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/422.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #422 commit eb52528c2a5868d8d266fe2d8a1ec34b908b8f84 Author: Abraham Fine Date: 2017-11-21T20:46:09Z [DO NOT COMMIT] trying to fix c unit tests ---
[GitHub] zookeeper issue #417: ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 i...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/417 @phunt i believe all of your concerns are resolved. ---
[GitHub] zookeeper pull request #417: ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/417#discussion_r153334347 --- Diff: ivy.xml --- @@ -90,6 +90,69 @@ + + + + --- End diff -- I reduced the number of excludes to the minimum that I think is necessary. Not sure why the ivy "stict" resolution change is not in 3.5. I'll investigate ---
[GitHub] zookeeper pull request #417: ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/417#discussion_r153334506 --- Diff: ivy.xml --- @@ -90,6 +90,69 @@ + + + + --- End diff -- And create a jira to make the branches the same if necessary ---
[GitHub] zookeeper pull request #424: ZOOKEEPER-236: SSL Support for Atomic Broadcast...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/424 ZOOKEEPER-236: SSL Support for Atomic Broadcast protocol This is the same thing as https://github.com/apache/zookeeper/pull/184 just targeting master You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-236_master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/424.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #424 commit f4032913dd90628ec852cb531c7d50ee80e4bcb0 Author: Abraham Fine Date: 2017-03-06T23:12:59Z ZOOKEEPER-236: SSL Support for Atomic Broadcast protocol ---
[GitHub] zookeeper pull request #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/420#discussion_r153636108 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -18,77 +18,58 @@ package org.apache.zookeeper.test; -import java.io.ByteArrayOutputStream; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.nio.ByteBuffer; -import java.util.ArrayList; -import java.util.List; - -import org.apache.zookeeper.common.Time; -import org.apache.jute.BinaryInputArchive; -import org.apache.jute.BinaryOutputArchive; -import org.apache.jute.Record; + import org.apache.zookeeper.CreateMode; import org.apache.zookeeper.KeeperException.NoNodeException; -import org.apache.zookeeper.PortAssignment; -import org.apache.zookeeper.ZKTestCase; import org.apache.zookeeper.ZooDefs.Ids; -import org.apache.zookeeper.ZooDefs.OpCode; import org.apache.zookeeper.ZooKeeper; import org.apache.zookeeper.data.Stat; -import org.apache.zookeeper.server.DataNode; -import org.apache.zookeeper.server.DataTree; import org.apache.zookeeper.server.ServerCnxnFactory; import org.apache.zookeeper.server.SyncRequestProcessor; import org.apache.zookeeper.server.ZooKeeperServer; -import org.apache.zookeeper.server.persistence.FileHeader; import org.apache.zookeeper.server.persistence.FileTxnLog; import org.apache.zookeeper.server.persistence.FileTxnSnapLog; import org.apache.zookeeper.server.persistence.Util; import org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator; import org.apache.zookeeper.server.persistence.TxnLog.TxnIterator; -import org.apache.zookeeper.txn.CreateTxn; -import org.apache.zookeeper.txn.DeleteTxn; -import org.apache.zookeeper.txn.MultiTxn; -import org.apache.zookeeper.txn.Txn; import org.apache.zookeeper.txn.TxnHeader; +import org.eclipse.jetty.util.SocketAddressResolver; --- End diff -- nit: not sure this import is needed ---
[GitHub] zookeeper pull request #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/420#discussion_r153652218 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -18,77 +18,58 @@ package org.apache.zookeeper.test; -import java.io.ByteArrayOutputStream; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.nio.ByteBuffer; -import java.util.ArrayList; -import java.util.List; - -import org.apache.zookeeper.common.Time; -import org.apache.jute.BinaryInputArchive; -import org.apache.jute.BinaryOutputArchive; -import org.apache.jute.Record; + import org.apache.zookeeper.CreateMode; import org.apache.zookeeper.KeeperException.NoNodeException; -import org.apache.zookeeper.PortAssignment; -import org.apache.zookeeper.ZKTestCase; import org.apache.zookeeper.ZooDefs.Ids; -import org.apache.zookeeper.ZooDefs.OpCode; import org.apache.zookeeper.ZooKeeper; import org.apache.zookeeper.data.Stat; -import org.apache.zookeeper.server.DataNode; -import org.apache.zookeeper.server.DataTree; import org.apache.zookeeper.server.ServerCnxnFactory; import org.apache.zookeeper.server.SyncRequestProcessor; import org.apache.zookeeper.server.ZooKeeperServer; -import org.apache.zookeeper.server.persistence.FileHeader; import org.apache.zookeeper.server.persistence.FileTxnLog; import org.apache.zookeeper.server.persistence.FileTxnSnapLog; import org.apache.zookeeper.server.persistence.Util; import org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator; import org.apache.zookeeper.server.persistence.TxnLog.TxnIterator; -import org.apache.zookeeper.txn.CreateTxn; -import org.apache.zookeeper.txn.DeleteTxn; -import org.apache.zookeeper.txn.MultiTxn; -import org.apache.zookeeper.txn.Txn; import org.apache.zookeeper.txn.TxnHeader; +import org.eclipse.jetty.util.SocketAddressResolver; +import org.junit.After; import org.junit.Assert; +import org.junit.Before; import org.junit.Test; import org.slf4j.Logger; import org.slf4j.LoggerFactory; -public class LoadFromLogTest extends ZKTestCase { -private static final String HOST = "127.0.0.1:"; -private static final int CONNECTION_TIMEOUT = 3000; +public class LoadFromLogTest extends ClientBase { private static final int NUM_MESSAGES = 300; protected static final Logger LOG = LoggerFactory.getLogger(LoadFromLogTest.class); // setting up the quorum has a transaction overhead for creating and closing the session private static final int TRANSACTION_OVERHEAD = 2; private static final int TOTAL_TRANSACTIONS = NUM_MESSAGES + TRANSACTION_OVERHEAD; +@Before +public void setUp() throws Exception { +SyncRequestProcessor.setSnapCount(100); --- End diff -- I think @phunt is mostly correct. There are a couple issues with `snapCount`. First, if want to change it while the syncProcessor thread has started, shouldn't it be volatile? Second, I disagree that it has "no effect" on a server that has already been started but the effect is not the desired behavior. Taking the issue of the `snapCount` field not being volatile out of the picture, the code in `SyncProcessor` looks like: ```java public void run() { try { int randRoll = r.nextInt(snapCount/2); while (true) { Request si = null; //get a request if (si != null) { // track the number of records written to the log if (zks.getZKDatabase().append(si)) { logCount++; if (logCount > (snapCount / 2 + randRoll)) { randRoll = r.nextInt(snapCount/2); ``` So my reading is that if `snapCount` is changed from `previousSnapCount` while the `SyncProcessor` is already running to `desiredSnapCount` it will take at least `desiredSnapCount` and at worst `desiredSnapCount + previousSnapCount/2` transactions for the "effective" `snapCount` to become `desiredSnapCount` (again ignoring the volatility issue). I agree with @phunt's solution. ---
[GitHub] zookeeper pull request #423: ZOOKEEPER-2949: using hostname and port to crea...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/423#discussion_r153658262 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java --- @@ -340,13 +340,24 @@ public static Packet getInstance() { return instance; } } + /** * ZKClientPipelineFactory is the netty pipeline factory for this netty * connection implementation. */ private class ZKClientPipelineFactory implements ChannelPipelineFactory { private SSLContext sslContext = null; private SSLEngine sslEngine = null; +private String host; +private int port; + +public ZKClientPipelineFactory() { --- End diff -- do we still need this constructor? ---
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/420 @phunt I think it's fine to always modify the `snapCount`. My concern was primarily modifying it once the zookeeper server was running created nondeterministic behavior. In addition, I don't think most of these tests make any assertions that could be impacted by the `snapCount`. So I think the current iteration of the tests is fine. Also, +1 on removing the lines instead of commenting them out and using a smaller snapCount. Otherwise lgtm. ---
[GitHub] zookeeper pull request #426: ZOOKEEPER-2915: Use "strict" conflict managemen...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/426 ZOOKEEPER-2915: Use "strict" conflict management in ivy Same as https://github.com/apache/zookeeper/pull/397 just targeting https://github.com/apache/zookeeper/pull/397 You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2915_3.5 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/426.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #426 commit d6d79561c9658a007f8e57222fefbfe0645a29e0 Author: Abraham Fine Date: 2017-11-30T21:30:30Z ZOOKEEPER-2915: Use "strict" conflict management in ivy ---
[GitHub] zookeeper pull request #425: Add keys for the Zxid from the stat command
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/425#discussion_r154221468 --- Diff: src/contrib/monitoring/check_zookeeper.py --- @@ -251,6 +256,13 @@ def _parse_stat(self, data): result['zk_znode_count'] = int(m.group(1)) continue +m = re.match('Zxid: (0x[0-9a-fA-F]+)', line) +if m is not None: +result['zk_zxid'] = int(m.group(1), 16) # convert from hex --- End diff -- Since the zxid is being split up below, it may be valuable to leave this in hex. ---
[GitHub] zookeeper pull request #425: Add keys for the Zxid from the stat command
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/425#discussion_r154221658 --- Diff: src/contrib/monitoring/check_zookeeper.py --- @@ -251,6 +256,13 @@ def _parse_stat(self, data): result['zk_znode_count'] = int(m.group(1)) continue +m = re.match('Zxid: (0x[0-9a-fA-F]+)', line) +if m is not None: +result['zk_zxid'] = int(m.group(1), 16) # convert from hex --- End diff -- Since the zxid is being split up below, it may be valuable to leave this in hex. ---
[GitHub] zookeeper pull request #425: Add keys for the Zxid from the stat command
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/425#discussion_r154221672 --- Diff: src/contrib/monitoring/check_zookeeper.py --- @@ -169,11 +169,16 @@ def __init__(self, host='localhost', port='2181', timeout=1): def get_stats(self): """ Get ZooKeeper server stats as a map """ data = self._send_cmd('mntr') +stat = self._parse_stat(self._send_cmd('stat')) if data: -return self._parse(data) +mntr = self._parse(data) +missing = ['zk_zxid', 'zk_zxid_counter', 'zk_zxid_epoch'] --- End diff -- why not just include all of the output from stat? ---
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/300 Thanks for taking a look @anmolnar! I'm confused by what you mean in item 3. Unless I am missing something, I don't understand how `FollowerZooKeeperServer` is "already running and keep receiving commits from the Leader including non-syncing ones" before `syncWithLeader` is completed. ---
[GitHub] zookeeper pull request #423: ZOOKEEPER-2949: using hostname and port to crea...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/423#discussion_r154445237 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java --- @@ -340,13 +340,24 @@ public static Packet getInstance() { return instance; } } + /** * ZKClientPipelineFactory is the netty pipeline factory for this netty * connection implementation. */ private class ZKClientPipelineFactory implements ChannelPipelineFactory { private SSLContext sslContext = null; private SSLEngine sslEngine = null; +private String host; +private int port; + +public ZKClientPipelineFactory() { --- End diff -- I don't think we need it here. ---
[GitHub] zookeeper pull request #426: ZOOKEEPER-2915: Use "strict" conflict managemen...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/426 ---
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/300 @anmolnar With respect to the code path above, shouldn't that be blocked on `syncWithLeader`? > Even if you drain the committedRequests, I'm not sure that guarantees that there are no more that will arrive. I'm not sure I understand how we don't have this guarantee. My understanding is that `syncWithLeader` loops until an `UPTODATE` message is received by the follower. Incoming packets from the leader are read by: ```java syncWithLeader(newEpochZxid); QuorumPacket qp = new QuorumPacket(); while (this.isRunning()) { readPacket(qp); processPacket(qp); } ``` In addition, my understanding is that requests are only added to `CommitProcessor`'s `committedRequests` in `processPacket`. What am I missing? ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r154786233 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -327,6 +257,95 @@ public void run() { LOG.info("CommitProcessor exited loop!"); } +private void processCommittedRequest() throws IOException, InterruptedException { +// In case of a spurious wakeup in waitForCommittedRequests we should not +// remove the request from the queue until it has been processed +Request request = committedRequests.peek(); + +if (request == null) { +committedRequests.poll(); --- End diff -- we don't ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r155102382 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -240,84 +240,14 @@ public void run() { } // Process committed head -if ((request = committedRequests.poll()) == null) { -throw new IOException("Error: committed head is null"); -} - -/* - * Check if request is pending, if so, update it with the committed info - */ -LinkedList sessionQueue = pendingRequests -.get(request.sessionId); -if (sessionQueue != null) { -// If session queue != null, then it is also not empty. -Request topPending = sessionQueue.poll(); -if (request.cxid != topPending.cxid) { -/* - * TL;DR - we should not encounter this scenario often under normal load. - * We pass the commit to the next processor and put the pending back with a warning. - * - * Generally, we can get commit requests that are not at the queue head after - * a session moved (see ZOOKEEPER-2684). Let's denote the previous server of the session - * with A, and the server that the session moved to with B (keep in mind that it is - * possible that the session already moved from B to a new server C, and maybe C=A). - * 1. If request.cxid < topPending.cxid : this means that the session requested this update - * from A, then moved to B (i.e., which is us), and now B receives the commit - * for the update after the session already performed several operations in B - * (and therefore its cxid is higher than that old request). - * 2. If request.cxid > topPending.cxid : this means that the session requested an updated - * from B with cxid that is bigger than the one we know therefore in this case we - * are A, and we lost the connection to the session. Given that we are waiting for a commit - * for that update, it means that we already sent the request to the leader and it will - * be committed at some point (in this case the order of cxid won't follow zxid, since zxid - * is an increasing order). It is not safe for us to delete the session's queue at this - * point, since it is possible that the session has newer requests in it after it moved - * back to us. We just leave the queue as it is, and once the commit arrives (for the old - * request), the finalRequestProcessor will see a closed cnxn handle, and just won't send a - * response. - * Also note that we don't have a local session, therefore we treat the request - * like any other commit for a remote request, i.e., we perform the update without sending - * a response. - */ -LOG.warn("Got request " + request + -" but we are expecting request " + topPending); -sessionQueue.addFirst(topPending); -} else { -/* - * Generally, we want to send to the next processor our version of the request, - * since it contains the session information that is needed for post update processing. - * In more details, when a request is in the local queue, there is (or could be) a client - * attached to this server waiting for a response, and there is other bookkeeping of - * requests that are outstanding and have originated from this server - * (e.g., for setting the max outstanding requests) - we need to update this info when an - * outstanding request completes. Note that in the other case (above), the operation
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/432 [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/432.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #432 commit 68a0382093da1d64583211746ac672ed12f5da5c Author: Abraham Fine Date: 2017-12-13T00:35:50Z ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/433 ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953_3.5 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/433.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #433 commit 36abecd0d4389b24e22852cccac50259b2abdb1a Author: Abraham Fine Date: 2017-12-13T00:51:52Z ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment ---
[GitHub] zookeeper pull request #434: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/434 [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953_3.4 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/434.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #434 commit 4834ba25b9b0a9f0448675f96ca402120833f54e Author: Abraham Fine Date: 2017-12-13T01:10:21Z ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/432 ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
GitHub user afine reopened a pull request: https://github.com/apache/zookeeper/pull/432 [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/432.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #432 commit 68a0382093da1d64583211746ac672ed12f5da5c Author: Abraham Fine Date: 2017-12-13T00:35:50Z ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment ---
[GitHub] zookeeper pull request #433: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/433 ---
[GitHub] zookeeper pull request #433: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
GitHub user afine reopened a pull request: https://github.com/apache/zookeeper/pull/433 [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953_3.5 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/433.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #433 commit 36abecd0d4389b24e22852cccac50259b2abdb1a Author: Abraham Fine Date: 2017-12-13T00:51:52Z ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment commit 5e4da2dcf3067008a33ace787c3d78c0bdbee823 Author: Abraham Fine Date: 2017-12-13T08:20:56Z fix whitespace ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156780239 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); --- End diff -- Fixed, forgot to put that in this branch. ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156781040 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); + + // find the leader + int trueLeader = -1; + for (int i = 0; i < numServers; i++) { +if (servers.mt[i].main.quorumPeer.leader != null) { + trueLeader = i; +} + } + Assert.assertTrue("There should be a leader", trueLeader >= 0); + + // find a follower + int falseLeader = (trueLeader + 1) % numServers; + Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null); + + // to keep the quorum peer running and force it to go into the looking state, we kill leader election + // and close the connection to the leader + servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown(); + servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close(); + + // wait for the falseLeader to disconnect + waitForOne(servers.zk[falseLeader], States.CONNECTING); + + // convince falseLeader that it is the leader + servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING); + + // provide time for the falseleader to realize no followers have connected + // (this is twice the timeout used in Leader#getEpochToPropose) + Thread.sleep(2 * servers.mt[falseLeader].main.quorumPeer.initLimit * servers.mt[falseLeader].main.quorumPeer.tickTime); + + // Restart leader election + servers.mt[falseLeader].main.quorumPeer.startLeaderElection(); --- End diff -- Stopping and starting leader election is necessary here to prevent a race condition. It is possible that after the server is disconnected from the leader it becomes a follower before the test hits `servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING);` and falseLeader will never try to `lead`, defeating the purpose of the test. ---
[GitHub] zookeeper issue #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeade...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/432 @anmolnar An explanation is coming, please excuse the cruft at this stage. I wanted to run the new test on apache hardware hence the [WIP] in the title. ---
[GitHub] zookeeper pull request #432: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156999341 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); + + // find the leader + int trueLeader = -1; + for (int i = 0; i < numServers; i++) { +if (servers.mt[i].main.quorumPeer.leader != null) { + trueLeader = i; +} + } + Assert.assertTrue("There should be a leader", trueLeader >= 0); + + // find a follower + int falseLeader = (trueLeader + 1) % numServers; + Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null); --- End diff -- fixed ---
[GitHub] zookeeper issue #432: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstab...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/432 [WIP] removed and I added a description of the issue to the JIRA. ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/433#discussion_r157004909 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); --- End diff -- fixed ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/433#discussion_r157004899 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; --- End diff -- fixed ---
[GitHub] zookeeper pull request #434: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/434 ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user afine closed the pull request at: https://github.com/apache/zookeeper/pull/433 ---
[GitHub] zookeeper pull request #436: [WIP] ZOOKEEPER-2249: CRC check failed when pre...
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/436 [WIP] ZOOKEEPER-2249: CRC check failed when preAllocSize smaller than node data You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2249 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #436 commit 99236e2ddb10de6d20e5afea7e79751404ae48c6 Author: Abraham Fine Date: 2017-12-19T02:33:11Z ZOOKEEPER-2249: CRC check failed when preAllocSize smaller than node data ---
[GitHub] zookeeper pull request #436: ZOOKEEPER-2249: CRC check failed when preAllocS...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/436#discussion_r158392075 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/Util.java --- @@ -211,7 +211,7 @@ public static long padLogFile(FileOutputStream f,long currentSize, long preAllocSize) throws IOException{ long position = f.getChannel().position(); if (position + 4096 >= currentSize) { -currentSize = currentSize + preAllocSize; +currentSize = position + preAllocSize; --- End diff -- Fixed ---
[GitHub] zookeeper pull request #437: ZOOKEEPER-2961: Fix testElectionFraud Flakyness
GitHub user afine opened a pull request: https://github.com/apache/zookeeper/pull/437 ZOOKEEPER-2961: Fix testElectionFraud Flakyness You can merge this pull request into a Git repository by running: $ git pull https://github.com/afine/zookeeper ZOOKEEPER-2961_master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/437.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #437 commit f0a64245102eda07da53bf2842620bde61146a49 Author: Abraham Fine Date: 2017-12-22T22:12:26Z ZOOKEEPER-2961: Fix testElectionFraud Flakyness ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r159985176 --- Diff: build.xml --- @@ -198,7 +198,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> - + --- End diff -- what is the reason for the downgrade? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160236050 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderBeanTest.java --- @@ -0,0 +1,119 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.jute.OutputArchive; --- End diff -- nit: there are a couple unused imports here ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160249680 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/ProposalStats.java --- @@ -0,0 +1,70 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import com.codahale.metrics.ExponentiallyDecayingReservoir; +import com.codahale.metrics.Histogram; +import com.codahale.metrics.JmxReporter; +import com.codahale.metrics.MetricRegistry; +import com.codahale.metrics.Reservoir; +import com.codahale.metrics.Snapshot; +import org.apache.zookeeper.jmx.CommonNames; +import org.apache.zookeeper.jmx.MBeanRegistry; + +import static com.codahale.metrics.MetricRegistry.name; + +/** + * Provides real-time metrics on Leader's proposal size. + * The class uses a histogram included in Dropwizard metrics library with ExponentiallyDecayingReservoir. + * It provides stats of proposal sizes from the last 5 minutes with acceptable cpu/memory footprint optimized for streaming data. + */ +public class ProposalStats { +private final Histogram proposalSizes; + +ProposalStats() { +final MetricRegistry metrics = new MetricRegistry(); +Reservoir reservoir = new ExponentiallyDecayingReservoir(); --- End diff -- Perhaps a user configurable SlidingTimeWindowReservoir may be more appropriate? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r159986909 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/ProposalStats.java --- @@ -0,0 +1,70 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import com.codahale.metrics.ExponentiallyDecayingReservoir; --- End diff -- I don't see much being pulled in from dropwizard, only codahale. Is codahale.metrics available independently? It doesn't look like it but I only checked briefly. ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160236436 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderBeanTest.java --- @@ -0,0 +1,119 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.jute.OutputArchive; +import org.apache.jute.Record; +import org.apache.zookeeper.server.Request; +import org.apache.zookeeper.server.ZKDatabase; +import org.apache.zookeeper.server.persistence.FileTxnSnapLog; +import org.apache.zookeeper.server.quorum.flexible.QuorumVerifier; +import org.apache.zookeeper.server.util.SerializeUtils; +import org.apache.zookeeper.test.ClientBase; +import org.apache.zookeeper.txn.TxnHeader; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.stubbing.Answer; + +import java.io.File; +import java.io.IOException; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyString; +import static org.mockito.Mockito.doAnswer; +import static org.mockito.Mockito.mock; + +public class LeaderBeanTest { +private Leader leader; +private LeaderBean leaderBean; +private FileTxnSnapLog fileTxnSnapLog; +private LeaderZooKeeperServer zks; +private QuorumPeer qp; + +@Before +public void setUp() throws IOException { +qp = new QuorumPeer(); +QuorumVerifier quorumVerifierMock = mock(QuorumVerifier.class); +qp.setQuorumVerifier(quorumVerifierMock, false); +File tmpDir = ClientBase.createEmptyTestDir(); +fileTxnSnapLog = new FileTxnSnapLog(new File(tmpDir, "data"), +new File(tmpDir, "data_txnlog")); +ZKDatabase zkDb = new ZKDatabase(fileTxnSnapLog); + +zks = new LeaderZooKeeperServer(fileTxnSnapLog, qp, zkDb); +leader = new Leader(qp, zks); +leaderBean = new LeaderBean(leader, zks); +} + +@After +public void tearDown() throws IOException { +fileTxnSnapLog.close(); +} + +@Test +public void testGetName() { +assertEquals("Leader", leaderBean.getName()); +} + +@Test +public void testGetCurrentZxid() { +// Arrange +zks.setZxid(1); + +// Assert +assertEquals("0x1", leaderBean.getCurrentZxid()); +} + +@Test +public void testGetElectionTimeTaken() { +// Arrange +qp.setElectionTimeTaken(1); + +// Assert +assertEquals(1, leaderBean.getElectionTimeTaken()); +} + +private Request createMockRequest() throws IOException { --- End diff -- is this ever used? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r159987502 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/ProposalStats.java --- @@ -0,0 +1,70 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import com.codahale.metrics.ExponentiallyDecayingReservoir; +import com.codahale.metrics.Histogram; +import com.codahale.metrics.JmxReporter; +import com.codahale.metrics.MetricRegistry; +import com.codahale.metrics.Reservoir; +import com.codahale.metrics.Snapshot; +import org.apache.zookeeper.jmx.CommonNames; +import org.apache.zookeeper.jmx.MBeanRegistry; + +import static com.codahale.metrics.MetricRegistry.name; + +/** + * Provides real-time metrics on Leader's proposal size. + * The class uses a histogram included in Dropwizard metrics library with ExponentiallyDecayingReservoir. + * It provides stats of proposal sizes from the last 5 minutes with acceptable cpu/memory footprint optimized for streaming data. + */ +public class ProposalStats { +private final Histogram proposalSizes; + +ProposalStats() { +final MetricRegistry metrics = new MetricRegistry(); +Reservoir reservoir = new ExponentiallyDecayingReservoir(); --- End diff -- I won't pretend to know much about exponentially decaying reservoirs. I'm curious what the behavior is with minimum and maximum values. Are these guaranteed to always be exact and for what time period? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160233203 --- Diff: ivy.xml --- @@ -133,6 +133,12 @@ + + +
[GitHub] zookeeper issue #439: ZOOKEEPER-1621: Delete and skip txn log with incomplet...
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/439 @abhishekrai Looking through the JIRA I found: > This has been a recurring problem for us in production because our app's operating environment occasionally causes a Zookeeper server's disk to become full. After that, the server invariably runs into this problem - perhaps because there's something else that deterministically triggers a log rotation when the previous txn log throws an IOException due to disk full? Do we have evidence that the log roll is being triggered "deterministically"? It would be great to know for sure that we are handling the disk filling up appropriately all the time rather than just a work around for special cases. ---
[GitHub] zookeeper pull request #438: ZOOKEEPER-2959: ignore accepted epoch and LEADE...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/438#discussion_r160286100 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Leader.java --- @@ -1183,8 +1183,10 @@ public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws Interrupt if (lastAcceptedEpoch >= epoch) { epoch = lastAcceptedEpoch+1; } -connectingFollowers.add(sid); QuorumVerifier verifier = self.getQuorumVerifier(); +if(verifier.getVotingMembers().containsKey(sid)) { --- End diff -- I'm wondering if this logic is best suited for the `QuorumVerifier`. In other words, the quorum verifier should be able to determine if a quorum is present from a set of ids while taking into account which sids represent voting members. ---
[GitHub] zookeeper issue #437: ZOOKEEPER-2961: Fix testElectionFraud Flakyness
Github user afine commented on the issue: https://github.com/apache/zookeeper/pull/437 Hi @anmolnar - I had a description in the JIRA that I just copied here. I hope it clears things up. ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160483835 --- Diff: build.xml --- @@ -198,7 +198,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> - + --- End diff -- So do we still need this lower version? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160889741 --- Diff: build.xml --- @@ -198,7 +198,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> - + --- End diff -- Can we just exclude org.slf4j#slf4j-api;1.7.22 ? ---
[GitHub] zookeeper pull request #443: ZOOKEEPER-2955: Enable Clover code coverage rep...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/443#discussion_r163372311 --- Diff: build.xml --- @@ -23,6 +23,48 @@ xmlns:artifact="antlib:org.apache.maven.artifact.ant" xmlns:maven="antlib:org.apache.maven.artifact.ant" xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> + --- End diff -- why was this stuff moved? ---
[GitHub] zookeeper pull request #443: ZOOKEEPER-2955: Enable Clover code coverage rep...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/443#discussion_r163373344 --- Diff: build.xml --- @@ -1406,50 +1410,53 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> + + + + + + + - + --- End diff -- we should update this description to show that we have an ant task that can be used instead of specifying a system property ---
[GitHub] zookeeper pull request #443: ZOOKEEPER-2955: Enable Clover code coverage rep...
Github user afine commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/443#discussion_r163377320 --- Diff: build.xml --- @@ -124,6 +160,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> + --- End diff -- My concern here is around classpath issues. Ideally I would like to make sure that clover and its dependencies (my understanding is that there are none currently but this could change) are only included when we are instrumenting coverage. The given setup may see us including clover in the test classpath even when not intended. For example, a developer wants to run the tests with clover and then immediately after without. If my understanding is correct, it there is no `clean` before those two executions clover will be included in the classpath of the second. ---