[jira] [Commented] (ZOOKEEPER-3129) Improve ZK Client resiliency by throwing a jute.maxbuffer size exception before sending a request to server

2018-09-06 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606731#comment-16606731
 ] 

Karan Mehta commented on ZOOKEEPER-3129:


Yes, a new property only for client side.

> Improve ZK Client resiliency by throwing a jute.maxbuffer size exception 
> before sending a request to server
> ---
>
> Key: ZOOKEEPER-3129
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3129
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Priority: Major
>
> Zookeeper is mostly operated in controlled environments and the client/server 
> properties are usually known. With this Jira, I would like to propose a new 
> property on client side that represents the max jute buffer size server is 
> going to accept.
> On the ZKClient, in case of multi Op, the request is serialized and hence we 
> know the size of complete packet that will be sent. We can use this new 
> property to determine if the we are exceeding the limit and throw some form 
> of KeeperException. This would be fail fast mechanism and the application can 
> potentially retry by chunking up the request or serializing.
> Since the same properties are now present in two locations, over time, two 
> possibilities can happen.
> -- Server jutebuffer accepts value is more than what is specified on client 
> side
> The application might end up serializing it or zkclient can be made 
> configurable to retry even when it gets this exception
> -- Server jutebuffer accepts value is lower than what is specified on client 
> side
> That would have failed previously as well, so there is no change in behavior
> This would help silent failures like HBASE-18549 getting avoided. 
> Thoughts [~apurtell] [~xucang] [~anmolnar] [~hanm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper pull request #605: [ZOOKEEPER-3125] Fixing pzxid consistent issue ...

2018-09-06 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/605#discussion_r215849976
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/FuzzySnapshotRelatedTest.java 
---
@@ -0,0 +1,254 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.quorum;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import javax.security.sasl.SaslException;
+
+import org.apache.jute.OutputArchive;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooKeeper.States;
+import org.apache.zookeeper.data.Stat;
+import org.apache.zookeeper.server.DataNode;
+import org.apache.zookeeper.server.DataTree;
+import org.apache.zookeeper.server.ZKDatabase;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.test.ClientBase;
+
+import org.junit.Assert;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Test cases used to catch corner cases due to fuzzy snapshot.
+ */
+public class FuzzySnapshotRelatedTest extends QuorumPeerTestBase {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(FuzzySnapshotRelatedTest.class);
+
+MainThread[] mt = null;
+ZooKeeper[] zk = null;
+int leaderId;
+int followerA;
+
+@Before
+public void setup() throws Exception {
--- End diff --

The main difference is that I need to pass in Customized QuorumPeer to hook 
events in ZK, there might be something in common, like how we generate the 
connection string, we may do some refactor later, but I prefer to do that 
separately since we also need to touch other tests who may have similar logics.


---


[GitHub] zookeeper issue #184: ZOOKEEPER-236: SSL Support for Atomic Broadcast protoc...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/184
  
@hanm Sure thing. If you look back the comment history, you'll see that he 
had a lot of comments, suggestions which I addressed and discussed. This PR is 
now basically a joint effort of @afine  @ivmaykov and myself.

We also talked about further new features and improvements which will be 
collected in a separate Jira, but this one needs to be merged first.


---


[jira] [Commented] (ZOOKEEPER-2977) Concurrency for addAuth corrupts quorum packets

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606682#comment-16606682
 ] 

Hadoop QA commented on ZOOKEEPER-2977:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12911964/2977.patch
  against trunk revision 27db6bd28e3297cdafe19c204b42c3db20c45ad6.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3704//console

This message is automatically generated.

> Concurrency for addAuth corrupts quorum packets
> ---
>
> Key: ZOOKEEPER-2977
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2977
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.9
> Environment: Affects all version in 3.4.x
>Reporter: sumit agrawal
>Assignee: sumit agrawal
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 2977.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When client performs multiple times addAuth with different credential at 
> follower concurrently, the communication between follower gets corrupt. This 
> causes shutdown of Follower due to the failure.
> Analysis:
> In org.apache.zookeeper.server.quorum.QuorumPacket.serialize method,
>  * call a_.startVector(authinfo,"authinfo"); which write the length of 
> authinfo to packet (suppose it writes length 1)
>  * get length of authinfo to write all details in loop (here gets length as 2)
> <-- Here in concurrency scenario, buffer gets corrupt having extra bytes in 
> channel for additional authinfo.
>  
> So When Leader reads next quorum packet, it reads previous extra bytes 
> (incorrect) and possibly identify greater size of message (as corrupt byte 
> pattern) causes exception...
> Coordination > Unexpected exception causing shutdown while sock still open 
> (LearnerHandler.java:633)
>  java.io.IOException: Unreasonable length = 1885430131
>  
>  
> ServerCnxn.getAuthInfo returns Unmodifiable list, but while addAuthInfo, 
> there is no check. So this causes concurrency issue.
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #474: ZOOKEEPER-2977: Concurrency for addAuth corrupts quoru...

2018-09-06 Thread maoling
Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/474
  
@sumitagrawl Are you still working on this?
or someone else can pick up this?


---


回复:Re: Re: ZooKeeper 3.5 blocker issues

2018-09-06 Thread 岭秀
waiting for the release of branch3.5 at end of this year!(grin)
- 原始邮件 -
发件人:Michael Han 
收件人:DevZooKeeper 
主题:Re: Re: ZooKeeper 3.5 blocker issues
日期:2018年09月07日 09点24分

I haven't went through the entire list, but looks like lots of the JIRA
issues listed in this thread, such as ZOOKEEPER-1549, 2846, also affects
3.4 releases. Should we scope these issues out?
I think historically the single outstanding blocking issue for a stable 3.5
release is the reconfig feature and security concerns around it (somehow
addressed in ZOOKEEPER-2014), and the alpha and beta releases were created
to stabilize that feature.
http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
So it looks like we are in good shape to release. Something might worth
doing to claim the quality of 3.5 is on par with 3.4
* Run Jepsen on 3.5 - 3.4 passed the test for the record
https://aphyr.com/posts/291-jepsen-zookeeper
* Fix all flaky tests on 3.5 - 3.4 has little or no flaky tests at all.
On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar 
wrote:
> Thanks Maoling! That would be huge help, I appreciate it.
>
> Andor
>


[jira] [Commented] (ZOOKEEPER-3131) org.apache.zookeeper.server.WatchManager resource leak

2018-09-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606636#comment-16606636
 ] 

Hudson commented on ZOOKEEPER-3131:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #180 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/180/])
ZOOKEEPER-3131: Remove watcher when session closed in NettyServerCnxn (hanm: 
rev 95557a30edbdfdf4479a1cb142e0d82a4ba6061d)
* (edit) src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java
* (edit) src/java/test/org/apache/zookeeper/server/NettyServerCnxnTest.java


> org.apache.zookeeper.server.WatchManager resource leak
> --
>
> Key: ZOOKEEPER-3131
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3131
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.5.4, 3.6.0
> Environment: -Xmx512m 
>Reporter: ChaoWang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In some cases, the variable _watch2Paths_ in _Class WatchManager_ does not 
> remove the entry, even if the associated value "HashSet" is empty already. 
> The type of key in Map _watch2Paths_ is Watcher, instance of 
> _NettyServerCnxn._ If it is not removed when the associated set of paths is 
> empty, it will cause the memory increases little by little, and 
> OutOfMemoryError triggered finally. 
>  
> {color:#FF}*Possible Solution:*{color}
> In the following function, the logic should be added to remove the entry.
> org.apache.zookeeper.server.WatchManager#removeWatcher(java.lang.String, 
> org.apache.zookeeper.Watcher)
> if (paths.isEmpty())
> { watch2Paths.remove(watcher); }
> For the following function as well:
> org.apache.zookeeper.server.WatchManager#triggerWatch(java.lang.String, 
> org.apache.zookeeper.Watcher.Event.EventType, 
> java.util.Set)
>  
> Please confirm this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3136) Reduce log in ClientBase in case of ConnectException

2018-09-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606637#comment-16606637
 ] 

Hudson commented on ZOOKEEPER-3136:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #180 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/180/])
ZOOKEEPER-3136: Reduce log in ClientBase in case of ConnectException (hanm: rev 
27db6bd28e3297cdafe19c204b42c3db20c45ad6)
* (edit) src/java/test/org/apache/zookeeper/test/ClientBase.java


> Reduce log in ClientBase in case of ConnectException
> 
>
> Key: ZOOKEEPER-3136
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3136
> Project: ZooKeeper
>  Issue Type: Task
>  Components: tests
>Reporter: Enrico Olivelli
>Assignee: Enrico Olivelli
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While running tests you will always see spammy log lines like the ones below.
> As we are expecting the server to be up, it is not useful to log such 
> stacktraces.
> The patch simply reduce the log in this specific case, because it adds no 
> value and it is very annoying.
>  
> {code:java}
>  [junit] 2018-08-31 23:31:49,173 [myid:] - INFO  [main:ClientBase@292] - 
> server 127.0.0.1:11222 not up
>     [junit] java.net.ConnectException: Connection refused (Connection refused)
>     [junit]     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>     [junit]     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     [junit]     at java.net.Socket.connect(Socket.java:589)
>     [junit]     at 
> org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:101)
>     [junit]     at 
> org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:71)
>     [junit]     at 
> org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:285)
>     [junit]     at 
> org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:276)
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3129) Improve ZK Client resiliency by throwing a jute.maxbuffer size exception before sending a request to server

2018-09-06 Thread Michael Han (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606608#comment-16606608
 ] 

Michael Han commented on ZOOKEEPER-3129:


I think it's a good ida. I am leaning towards using a new property for this. I 
think this will be a client only property, correct? 

> Improve ZK Client resiliency by throwing a jute.maxbuffer size exception 
> before sending a request to server
> ---
>
> Key: ZOOKEEPER-3129
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3129
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Priority: Major
>
> Zookeeper is mostly operated in controlled environments and the client/server 
> properties are usually known. With this Jira, I would like to propose a new 
> property on client side that represents the max jute buffer size server is 
> going to accept.
> On the ZKClient, in case of multi Op, the request is serialized and hence we 
> know the size of complete packet that will be sent. We can use this new 
> property to determine if the we are exceeding the limit and throw some form 
> of KeeperException. This would be fail fast mechanism and the application can 
> potentially retry by chunking up the request or serializing.
> Since the same properties are now present in two locations, over time, two 
> possibilities can happen.
> -- Server jutebuffer accepts value is more than what is specified on client 
> side
> The application might end up serializing it or zkclient can be made 
> configurable to retry even when it gets this exception
> -- Server jutebuffer accepts value is lower than what is specified on client 
> side
> That would have failed previously as well, so there is no change in behavior
> This would help silent failures like HBASE-18549 getting avoided. 
> Thoughts [~apurtell] [~xucang] [~anmolnar] [~hanm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Re: ZooKeeper 3.5 blocker issues

2018-09-06 Thread Michael Han
I haven't went through the entire list, but looks like lots of the JIRA
issues listed in this thread, such as ZOOKEEPER-1549, 2846, also affects
3.4 releases. Should we scope these issues out?

I think historically the single outstanding blocking issue for a stable 3.5
release is the reconfig feature and security concerns around it (somehow
addressed in ZOOKEEPER-2014), and the alpha and beta releases were created
to stabilize that feature.

http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html

So it looks like we are in good shape to release. Something might worth
doing to claim the quality of 3.5 is on par with 3.4

* Run Jepsen on 3.5 - 3.4 passed the test for the record
https://aphyr.com/posts/291-jepsen-zookeeper
* Fix all flaky tests on 3.5 - 3.4 has little or no flaky tests at all.


On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar 
wrote:

> Thanks Maoling! That would be huge help, I appreciate it.
>
> Andor
>


[GitHub] zookeeper issue #184: ZOOKEEPER-236: SSL Support for Atomic Broadcast protoc...

2018-09-06 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/184
  
Sorry for lag, I am resuming my review on this. 

On a side note - I remember @ivmaykov mentioned to me offline that he made 
a couple of security improvements (related to port unification IIRC). Would be 
good to get his feedback as well.


---


[GitHub] zookeeper issue #614: ZOOKEEPER-3136 Reduce log in ClientBase in case of Con...

2018-09-06 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/614
  
merged to master, thanks @eolivelli 


---


[jira] [Resolved] (ZOOKEEPER-3136) Reduce log in ClientBase in case of ConnectException

2018-09-06 Thread Michael Han (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han resolved ZOOKEEPER-3136.

Resolution: Fixed

Issue resolved by pull request 614
[https://github.com/apache/zookeeper/pull/614]

> Reduce log in ClientBase in case of ConnectException
> 
>
> Key: ZOOKEEPER-3136
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3136
> Project: ZooKeeper
>  Issue Type: Task
>  Components: tests
>Reporter: Enrico Olivelli
>Assignee: Enrico Olivelli
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> While running tests you will always see spammy log lines like the ones below.
> As we are expecting the server to be up, it is not useful to log such 
> stacktraces.
> The patch simply reduce the log in this specific case, because it adds no 
> value and it is very annoying.
>  
> {code:java}
>  [junit] 2018-08-31 23:31:49,173 [myid:] - INFO  [main:ClientBase@292] - 
> server 127.0.0.1:11222 not up
>     [junit] java.net.ConnectException: Connection refused (Connection refused)
>     [junit]     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>     [junit]     at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>     [junit]     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     [junit]     at java.net.Socket.connect(Socket.java:589)
>     [junit]     at 
> org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:101)
>     [junit]     at 
> org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:71)
>     [junit]     at 
> org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:285)
>     [junit]     at 
> org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:276)
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper pull request #614: ZOOKEEPER-3136 Reduce log in ClientBase in case...

2018-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/614


---


[GitHub] zookeeper issue #611: ZOOKEEPER-3131

2018-09-06 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/611
  
#612 is now merged. @wangchaod Do you know what was the root cause of your 
OOM issue? Were you using Netty instead of NIO? Would #612 fix your case? I am 
curious if #612 does not fix your case because that implies that you get a lot 
of connections with empty watchers and I am expecting you reach connection 
limit even before those watchers can cause OOM.


---


[jira] [Commented] (ZOOKEEPER-3124) Add the correct comment to show why we need the special logic to handle cversion and pzxid

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606573#comment-16606573
 ] 

Hadoop QA commented on ZOOKEEPER-3124:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+0 tests included.  The patch appears to be a documentation patch that 
doesn't require tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2131//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2131//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2131//console

This message is automatically generated.

> Add the correct comment to show why we need the special logic to handle 
> cversion and pzxid
> --
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The old comment about setCversionPzxid is not valid, the scenario it 
> mentioned won't trigger the issue, update it to show the exact reason.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ZOOKEEPER-3131) org.apache.zookeeper.server.WatchManager resource leak

2018-09-06 Thread Michael Han (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han resolved ZOOKEEPER-3131.

   Resolution: Fixed
Fix Version/s: 3.6.0

Issue resolved by pull request 612
[https://github.com/apache/zookeeper/pull/612]

> org.apache.zookeeper.server.WatchManager resource leak
> --
>
> Key: ZOOKEEPER-3131
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3131
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3, 3.5.4, 3.6.0
> Environment: -Xmx512m 
>Reporter: ChaoWang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In some cases, the variable _watch2Paths_ in _Class WatchManager_ does not 
> remove the entry, even if the associated value "HashSet" is empty already. 
> The type of key in Map _watch2Paths_ is Watcher, instance of 
> _NettyServerCnxn._ If it is not removed when the associated set of paths is 
> empty, it will cause the memory increases little by little, and 
> OutOfMemoryError triggered finally. 
>  
> {color:#FF}*Possible Solution:*{color}
> In the following function, the logic should be added to remove the entry.
> org.apache.zookeeper.server.WatchManager#removeWatcher(java.lang.String, 
> org.apache.zookeeper.Watcher)
> if (paths.isEmpty())
> { watch2Paths.remove(watcher); }
> For the following function as well:
> org.apache.zookeeper.server.WatchManager#triggerWatch(java.lang.String, 
> org.apache.zookeeper.Watcher.Event.EventType, 
> java.util.Set)
>  
> Please confirm this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper pull request #612: [ZOOKEEPER-3131] Remove watcher when session cl...

2018-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/612


---


[GitHub] zookeeper issue #612: [ZOOKEEPER-3131] Remove watcher when session closed in...

2018-09-06 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/612
  
I just triggered a build through "Rebuild", it's passing. So I am 
committing this patch.
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2130/

I am not sure why the result is not integrated in Jenkins check on github. 
I think this issue was reported before.


---


[GitHub] zookeeper issue #545: ZOOKEEPER-2261 When only secureClientPort is configure...

2018-09-06 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/545
  
I suggest we move test refactoring part of this patch to a separate JIRA / 
PR. My main concern is mix these will create confusions for those who check the 
commit history of this patch and wondering why those test related changes were 
made. 

Other than this, the patch look good to me. 


---


[jira] [Updated] (ZOOKEEPER-3124) Add the correct comment to show why we need the special logic to handle cversion and pzxid

2018-09-06 Thread Fangmin Lv (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv updated ZOOKEEPER-3124:
--
Description: 
The old comment about setCversionPzxid is not valid, the scenario it mentioned 
won't trigger the issue, update it to show the exact reason.
  

  was:
There is special logic in the DataTree.processTxn to handle the NODEEXISTS when 
createNode, which is used to handle the cversion and pzxid not being updated 
due to fuzzy snapshot: 

https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
 

But is this a real issue, or is it still an issue for now?

In the current code, when serializing a parent node, we'll lock on it, and take 
a children snapshot at that time. If the child added after the parent is 
serialized to disk, then it won't be written out, so we shouldn't hit the issue 
where the child is in the snapshot but parent cversion and pzxid is not changed.

 
 
I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
issue as well, I'm not sure why we added this, am I missing anything? Can we 
just get rid of it?
 


> Add the correct comment to show why we need the special logic to handle 
> cversion and pzxid
> --
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The old comment about setCversionPzxid is not valid, the scenario it 
> mentioned won't trigger the issue, update it to show the exact reason.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ZOOKEEPER-3124) Add the correct comment to show why we need the special logic to handle cversion and pzxid

2018-09-06 Thread Fangmin Lv (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv updated ZOOKEEPER-3124:
--
Summary: Add the correct comment to show why we need the special logic to 
handle cversion and pzxid  (was: Update the comment of special logic to handle 
cversion and pzxid in DataTree.processTxn)

> Add the correct comment to show why we need the special logic to handle 
> cversion and pzxid
> --
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
> when createNode, which is used to handle the cversion and pzxid not being 
> updated due to fuzzy snapshot: 
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
>  
> But is this a real issue, or is it still an issue for now?
> In the current code, when serializing a parent node, we'll lock on it, and 
> take a children snapshot at that time. If the child added after the parent is 
> serialized to disk, then it won't be written out, so we shouldn't hit the 
> issue where the child is in the snapshot but parent cversion and pzxid is not 
> changed.
>  
>  
> I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
> issue as well, I'm not sure why we added this, am I missing anything? Can we 
> just get rid of it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #567: ZOOKEEPER-3071: Add a config parameter to control tran...

2018-09-06 Thread suyogmapara
Github user suyogmapara commented on the issue:

https://github.com/apache/zookeeper/pull/567
  

Hi @maoling 

I tested the patch with following parameters:

1. PreAllocSize = 100 KB, TxnLogSizeLimit = 200 KB, 100 transactions of 
little over 10 KBs. Outcome: 5 transaction files with little over 200 KBs each 
(rounded up to last transaction).
2. PreAllocSize = 100 KB, TxnLogSizeLimit = 100 KB, 100 transactions of 
little over 10 KBs. Outcome: 100 transaction files one per each transaction
3. PreAllocSize = 100 KB, TxnLogSizeLimit = 50 KB, 100 transactions of 
little over 10 KBs. Outcome: 100 transaction files one per each transaction.

Second and third outcomes are expected as every time it tries to 
pre-allocate more, we would be rolling the log. Of course those won't be 
sensible configurations, but if someone uses them, only performance would be 
impacted and not the correctness. Please let me know if you disagree.

@anmolnar 
snapCount doesn't give direct handle on the transaction log size. Changing 
average transaction size from 1 KB to 100 KB would increase the transaction log 
size by 2 orders of magnitude and may push they sync time over the edge. More 
importantly reducing the snapCount would also increase how frequently we 
snapshot and so we would be forced to find a sweet spot to balance the two 
conflicting goals. On the other hand, txnLogSizeLimit provides cheap and 
independent control on the worst case seek time during the diff sync.


---


[jira] [Commented] (ZOOKEEPER-2261) When only secureClientPort is configured connections, configuration, connection_stat_reset, and stats admin commands throw NullPointerException

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606361#comment-16606361
 ] 

Hadoop QA commented on ZOOKEEPER-2261:
--

+1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 17 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2129//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2129//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2129//console

This message is automatically generated.

> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset, and stats admin commands throw NullPointerException
> ---
>
> Key: ZOOKEEPER-2261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2261
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Mohammad Arshad
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0, 3.5.5
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset and stats admin commands throw NullPointerException. 
> Here is stack trace one of the connections command.
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.admin.Commands$ConsCommand.run(Commands.java:177)
>   at 
> org.apache.zookeeper.server.admin.Commands.runCommand(Commands.java:92)
>   at 
> org.apache.zookeeper.server.admin.JettyAdminServer$CommandServlet.doGet(JettyAdminServer.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-2261) When only secureClientPort is configured connections, configuration, connection_stat_reset, and stats admin commands throw NullPointerException

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606346#comment-16606346
 ] 

Hadoop QA commented on ZOOKEEPER-2261:
--

+1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2128//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2128//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2128//console

This message is automatically generated.

> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset, and stats admin commands throw NullPointerException
> ---
>
> Key: ZOOKEEPER-2261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2261
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Mohammad Arshad
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0, 3.5.5
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset and stats admin commands throw NullPointerException. 
> Here is stack trace one of the connections command.
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.admin.Commands$ConsCommand.run(Commands.java:177)
>   at 
> org.apache.zookeeper.server.admin.Commands.runCommand(Commands.java:92)
>   at 
> org.apache.zookeeper.server.admin.JettyAdminServer$CommandServlet.doGet(JettyAdminServer.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #545: ZOOKEEPER-2261 When only secureClientPort is configure...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/545
  
@hanm @enixon Back to the roots: created separate getter for the secure 
server cnxn factory and fixed the logic of Cons command handler.

Additionally some unit test refactor (move FileTxnLogTests to the right 
place + indentation fixes).

Too much...?


---


[jira] [Commented] (ZOOKEEPER-2261) When only secureClientPort is configured connections, configuration, connection_stat_reset, and stats admin commands throw NullPointerException

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606331#comment-16606331
 ] 

Hadoop QA commented on ZOOKEEPER-2261:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2127//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2127//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2127//console

This message is automatically generated.

> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset, and stats admin commands throw NullPointerException
> ---
>
> Key: ZOOKEEPER-2261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2261
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Mohammad Arshad
>Assignee: Andor Molnar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0, 3.5.5
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When only secureClientPort is configured connections, configuration, 
> connection_stat_reset and stats admin commands throw NullPointerException. 
> Here is stack trace one of the connections command.
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.admin.Commands$ConsCommand.run(Commands.java:177)
>   at 
> org.apache.zookeeper.server.admin.Commands.runCommand(Commands.java:92)
>   at 
> org.apache.zookeeper.server.admin.JettyAdminServer$CommandServlet.doGet(JettyAdminServer.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3124) Update the comment of special logic to handle cversion and pzxid in DataTree.processTxn

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606226#comment-16606226
 ] 

Hadoop QA commented on ZOOKEEPER-3124:
--

+1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 22 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2126//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2126//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2126//console

This message is automatically generated.

> Update the comment of special logic to handle cversion and pzxid in 
> DataTree.processTxn
> ---
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
> when createNode, which is used to handle the cversion and pzxid not being 
> updated due to fuzzy snapshot: 
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
>  
> But is this a real issue, or is it still an issue for now?
> In the current code, when serializing a parent node, we'll lock on it, and 
> take a children snapshot at that time. If the child added after the parent is 
> serialized to disk, then it won't be written out, so we shouldn't hit the 
> issue where the child is in the snapshot but parent cversion and pzxid is not 
> changed.
>  
>  
> I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
> issue as well, I'm not sure why we added this, am I missing anything? Can we 
> just get rid of it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #612: [ZOOKEEPER-3131] Remove watcher when session closed in...

2018-09-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/612
  
Thanks @hanm @anmolnar, I like the "Rebuild" button, it's more convenient 
than check out the branch (maybe you're working on other branch and need to 
stash), do amend, and force push. 


---


[jira] [Reopened] (ZOOKEEPER-3124) Update the comment of special logic to handle cversion and pzxid in DataTree.processTxn

2018-09-06 Thread Fangmin Lv (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv reopened ZOOKEEPER-3124:
---

> Update the comment of special logic to handle cversion and pzxid in 
> DataTree.processTxn
> ---
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
> when createNode, which is used to handle the cversion and pzxid not being 
> updated due to fuzzy snapshot: 
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
>  
> But is this a real issue, or is it still an issue for now?
> In the current code, when serializing a parent node, we'll lock on it, and 
> take a children snapshot at that time. If the child added after the parent is 
> serialized to disk, then it won't be written out, so we shouldn't hit the 
> issue where the child is in the snapshot but parent cversion and pzxid is not 
> changed.
>  
>  
> I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
> issue as well, I'm not sure why we added this, am I missing anything? Can we 
> just get rid of it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ZOOKEEPER-3124) Update the comment of special logic to handle cversion and pzxid in DataTree.processTxn

2018-09-06 Thread Fangmin Lv (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv updated ZOOKEEPER-3124:
--
Summary: Update the comment of special logic to handle cversion and pzxid 
in DataTree.processTxn  (was: Remove special logic to handle cversion and pzxid 
in DataTree.processTxn)

> Update the comment of special logic to handle cversion and pzxid in 
> DataTree.processTxn
> ---
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
> when createNode, which is used to handle the cversion and pzxid not being 
> updated due to fuzzy snapshot: 
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
>  
> But is this a real issue, or is it still an issue for now?
> In the current code, when serializing a parent node, we'll lock on it, and 
> take a children snapshot at that time. If the child added after the parent is 
> serialized to disk, then it won't be written out, so we shouldn't hit the 
> issue where the child is in the snapshot but parent cversion and pzxid is not 
> changed.
>  
>  
> I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
> issue as well, I'm not sure why we added this, am I missing anything? Can we 
> just get rid of it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #610: [ZOOKEEPER-3124] Remove special logic to handle cversi...

2018-09-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/610
  
I forgot but actually I think I've considered the children of children case 
when I created this diff, it actually won't cause the cversion and pzxid 
mismatch issue.

The tricky scenario here is that node is being deleted and recreated when 
taking fuzzy snapshot, the node is there and will be serialized, but the 
cversion and pzxid of parent may not be changed since it might already 
serialized.

I'll use this JIRA to update the comment to make it clearer, so people 
don't get confused as well in the future.


---


[GitHub] zookeeper pull request #610: [ZOOKEEPER-3124] Remove special logic to handle...

2018-09-06 Thread lvfangmin
GitHub user lvfangmin reopened a pull request:

https://github.com/apache/zookeeper/pull/610

[ZOOKEEPER-3124] Remove special logic to handle cversion and pzxid in 
DataTree.processTxn

There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
when createNode, which is used to handle the cversion and pzxid not being 
updated due to fuzzy snapshot: 


https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
 

But seems this is not a real issue, in the current code, when serializing a 
parent node, we'll lock on it, and take a children snapshot at that time. If 
the child added after the parent is serialized to disk, then it won't be 
written out, so we shouldn't hit the issue where the child is in the snapshot 
but parent cversion and pzxid is not changed.

But maybe I'm missing something, there is not much discussion in the Jira, 
so create a PR to have more attention.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3124

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #610


commit 53ab8f9773af6b763d246cd13dcd3ddd221e48d8
Author: Fangmin Lyu 
Date:   2018-08-28T18:08:28Z

Pzxid inconsistent issue when replaying a txn for a deleted node

commit c6901493ee3958b9e418d8854022b776ebd7df90
Author: Fangmin Lyu 
Date:   2018-08-28T20:39:34Z

Also remove the useless non-sense test in LoadFromLogNoServerTest

commit 93fa440f74e86f94657acd6d93905ff6763c1251
Author: Fangmin Lyu 
Date:   2018-08-31T23:41:18Z

trigger jenkins




---


[GitHub] zookeeper issue #530: [ZOOKEEPER-2474] Support session reconnection when usi...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/530
  
@timothyjward Okay, I hear you.
I'll check the Jira, because I think this should go to master too and 
commit soon.


---


[GitHub] zookeeper pull request #605: [ZOOKEEPER-3125] Fixing pzxid consistent issue ...

2018-09-06 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/605#discussion_r215614171
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/FuzzySnapshotRelatedTest.java 
---
@@ -0,0 +1,254 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.quorum;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import javax.security.sasl.SaslException;
+
+import org.apache.jute.OutputArchive;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooKeeper.States;
+import org.apache.zookeeper.data.Stat;
+import org.apache.zookeeper.server.DataNode;
+import org.apache.zookeeper.server.DataTree;
+import org.apache.zookeeper.server.ZKDatabase;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.test.ClientBase;
+
+import org.junit.Assert;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Test cases used to catch corner cases due to fuzzy snapshot.
+ */
+public class FuzzySnapshotRelatedTest extends QuorumPeerTestBase {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(FuzzySnapshotRelatedTest.class);
+
+MainThread[] mt = null;
+ZooKeeper[] zk = null;
+int leaderId;
+int followerA;
+
+@Before
+public void setup() throws Exception {
--- End diff --

I see this initialization logic pretty much everywhere in unit tests. Do 
you think we can do some clever code sharing here with (for example) 
`QuorumPeerMainTest.java`? Private method `LaunchServers` does the same thing 
as far as I could see.


---


[GitHub] zookeeper issue #590: [ZOOKEEPER-1177] Add the memory optimized watch manage...

2018-09-06 Thread nkalmar
Github user nkalmar commented on the issue:

https://github.com/apache/zookeeper/pull/590
  
systest will go to src/test/java , from my side, you can put the bench in 
org.apache.zookeeper.test.system . Thinking about it, that's a pretty good 
place.

No need to create main directory or anything, I will move all the files 
anyway. Just move the files amongst the others in systest. (Hopefully no 
package level dependency, I didn't check)

But there's a chance you will have to rebase if this this PR cannot be 
merged before the movement of all the remaining files as the directory 
refactor's last step. Sorry about that in advance... I'm not going to be the 
most popular with that PR :(


---


[GitHub] zookeeper issue #530: [ZOOKEEPER-2474] Support session reconnection when usi...

2018-09-06 Thread maoling
Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/530
  
@timothyjward 
Some inspirations can be found at the 
[old-patch](https://issues.apache.org/jira/secure/attachment/12822755/ZOOKEEPER-2474-01.patch).
This patch LGTM and I had tested it.  ping @anmolnar   


---


[GitHub] zookeeper issue #613: ZOOKEEPER-1823:LogFormatter should support printing tr...

2018-09-06 Thread maoling
Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/613
  
@anmolnar 
Yes,I will try to transport this improment to `TxnLogToolkit`.


---


[jira] [Commented] (ZOOKEEPER-2474) No way to reattach to a session when using ZKClientConfig

2018-09-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16605500#comment-16605500
 ] 

Hadoop QA commented on ZOOKEEPER-2474:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12822755/ZOOKEEPER-2474-01.patch
  against trunk revision 0b65e3d4cbb7647ea692bfea59c02e5b24bf821c.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3703//console

This message is automatically generated.

> No way to reattach to a session when using ZKClientConfig
> -
>
> Key: ZOOKEEPER-2474
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2474
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.5.2
>Reporter: Timothy Ward
>Assignee: Mohammad Arshad
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.6.0, 3.5.5
>
> Attachments: ZOOKEEPER-2474-01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The new constructors for ZooKeeper instances take a ZKClientConfig, which is 
> great, however there is no way to reattach to an existing session.
> New constructors should be added to allow passing a session id and password 
> when using ZKClientConfig.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #613: ZOOKEEPER-1823:LogFormatter should support printing tr...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/613
  
@maoling I intended to create TxnLogToolkit as a replacement for 
LogFormatter with some extra functionality. Do you mind checking that tool as 
well and see that this patch could be easily ported?
It would be nice to deco LogFormatter and keep only the new tool.


---


[GitHub] zookeeper issue #530: [ZOOKEEPER-2474] Support session reconnection when usi...

2018-09-06 Thread timothyjward
Github user timothyjward commented on the issue:

https://github.com/apache/zookeeper/pull/530
  
I'm still not sure what additional testing is possible. The way the patch 
is constructed the changed code (all three lines of it) is already exercised by 
the existing ZooKeeper tests. The new constructors all delegate to existing 
constructors, so I'm not sure how to create a "different" test rather than just 
duplicating another test, which seems like a poor practice to me.


---


[GitHub] zookeeper issue #612: [ZOOKEEPER-3131] Remove watcher when session closed in...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/612
  
I usually amend the latest commit and do a force push.


---


[GitHub] zookeeper issue #615: ZOOKEEPER-3137: add a utility to truncate logs to a zx...

2018-09-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/615
  
@enixon Thanks for looking into the integration.
Code sharing is not so important. TxnLogToolkit is intended to be a 
collection of all cli tools related to ZooKeeper transaction logs. So it's 
perfectly fine if you just drop the code into and add an extra option to 
trigger the new functionality. I really appreciate the effort.

@hanm TxnLogToolkit is kind of the rewrite of LogFormatter with extra 
functionality, so LogFormatter can be just deco'd with no harm.


---


[GitHub] zookeeper issue #530: [ZOOKEEPER-2474] Support session reconnection when usi...

2018-09-06 Thread maoling
Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/530
  
@timothyjward  Are you still working on this? 
Or could someone else pick up this?


---


[jira] [Resolved] (ZOOKEEPER-3124) Remove special logic to handle cversion and pzxid in DataTree.processTxn

2018-09-06 Thread Fangmin Lv (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv resolved ZOOKEEPER-3124.
---
Resolution: Won't Do

After revisiting the logic, I found it's not possible that the direct children 
is being creating and serialized after the parent is serialized, but the 
children of children might be added during fuzzy snapshot, so we do need to 
handle this.

> Remove special logic to handle cversion and pzxid in DataTree.processTxn
> 
>
> Key: ZOOKEEPER-3124
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Fangmin Lv
>Assignee: Fangmin Lv
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is special logic in the DataTree.processTxn to handle the NODEEXISTS 
> when createNode, which is used to handle the cversion and pzxid not being 
> updated due to fuzzy snapshot: 
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
>  
> But is this a real issue, or is it still an issue for now?
> In the current code, when serializing a parent node, we'll lock on it, and 
> take a children snapshot at that time. If the child added after the parent is 
> serialized to disk, then it won't be written out, so we shouldn't hit the 
> issue where the child is in the snapshot but parent cversion and pzxid is not 
> changed.
>  
>  
> I checked the JIRA ZOOKEEPER-1269 which added this code, it won't hit this 
> issue as well, I'm not sure why we added this, am I missing anything? Can we 
> just get rid of it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #610: [ZOOKEEPER-3124] Remove special logic to handle cversi...

2018-09-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/610
  
After revisiting the logic, I found it's not possible that the direct 
children is being creating and serialized after the parent is serialized, but 
the children of children might be added during fuzzy snapshot, so we do need to 
handle this.


---


[GitHub] zookeeper pull request #610: [ZOOKEEPER-3124] Remove special logic to handle...

2018-09-06 Thread lvfangmin
Github user lvfangmin closed the pull request at:

https://github.com/apache/zookeeper/pull/610


---