[jira] [Created] (ZOOKEEPER-3831) Add a test that does a minimal validation of Apache Curator

2020-05-15 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3831:
---

 Summary: Add a test that does a minimal validation of Apache 
Curator
 Key: ZOOKEEPER-3831
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3831
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Affects Versions: 3.6.1
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman


Given that Apache Curator is one of the most widely used ZooKeeper clients, it 
would be beneficial for ZooKeeper to have a minimal test to ensure that the 
codebase doesn't introduce incompatibilities with Curator in the future.
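
A minimal sketch of what such a check could look like, assuming Curator's 
published curator-framework and curator-test artifacts (for brevity it uses 
Curator's {{TestingServer}}; the in-tree test would presumably start ZooKeeper's 
own test server instead):

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.test.TestingServer;

public class CuratorSmokeTest {
    public static void main(String[] args) throws Exception {
        // Start a throwaway server, connect a Curator client and do one
        // round trip - enough to catch gross wire/API incompatibilities.
        try (TestingServer server = new TestingServer();
             CuratorFramework client = CuratorFrameworkFactory.newClient(
                     server.getConnectString(), new RetryOneTime(1))) {
            client.start();
            client.blockUntilConnected();
            client.create().forPath("/smoke", "hello".getBytes());
            byte[] data = client.getData().forPath("/smoke");
            if (!"hello".equals(new String(data))) {
                throw new AssertionError("round trip failed");
            }
        }
    }
}
{code}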





[jira] [Created] (ZOOKEEPER-3762) Add Client/Server API to return available features

2020-03-18 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3762:
---

 Summary: Add Client/Server API to return available features
 Key: ZOOKEEPER-3762
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3762
 Project: ZooKeeper
  Issue Type: New Feature
  Components: c client, java client, server
Affects Versions: 3.6.0
Reporter: Jordan Zimmerman


Recent versions have introduced several new features/changes. Clients would 
benefit from an API that reports the feature set that a server instance 
supports. Something like (in Java):

{code}
public enum ServerFeatures {
TTL_NODES,
PERSISTENT_WATCHERS,
... etc ... full set of features TBD
}

public Collection<ServerFeatures> getServerFeatures() {
...
}
{code}
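
A client could then gate optional behavior on the reported set. A hypothetical 
usage sketch (the method and enum above are the proposal, not an existing API):

{code:java}
// hypothetical usage of the proposed API
void registerWatches(ZooKeeper zk) throws Exception {
    if (zk.getServerFeatures().contains(ServerFeatures.PERSISTENT_WATCHERS)) {
        // safe to rely on persistent/recursive watches on this server
    } else {
        // fall back to re-registering one-shot watches
    }
}
{code}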





[jira] [Created] (ZOOKEEPER-3703) Publish a Test-Jar from ZooKeeper Server

2020-01-21 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3703:
---

 Summary: Publish a Test-Jar from ZooKeeper Server
 Key: ZOOKEEPER-3703
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3703
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Affects Versions: 3.5.6
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman
 Fix For: 3.6.0


It would be very helpful to Apache Curator and others if ZooKeeper published 
its testing code as a Maven test JAR. Curator, for example, could use it to 
improve its testing server, making it easier to inject error conditions without 
resorting to forced time delays and other hacks.





[jira] [Created] (ZOOKEEPER-3605) ZOOKEEPER-3242 add a connection throttle. Default constructor needs to set it

2019-11-02 Thread Jordan Zimmerman (Jira)
Jordan Zimmerman created ZOOKEEPER-3605:
---

 Summary: ZOOKEEPER-3242 add a connection throttle. Default 
constructor needs to set it
 Key: ZOOKEEPER-3605
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3605
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.6.0
Reporter: Jordan Zimmerman


ZOOKEEPER-3242 added a connection throttle. It gets set in the main constructor 
but not in the alternate constructor, which is breaking Apache Curator's testing 
framework. The throttle should also be set in the alternate constructor to avoid 
an NPE.
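
The shape of the fix is the usual constructor-delegation pattern; a generic 
sketch with hypothetical names (not the actual ZooKeeper code):

{code:java}
class ThrottledServer {
    static class Throttle {
        Throttle(int max) { }
        boolean accept() { return true; }
    }

    private final Throttle throttle;

    ThrottledServer(int maxConnections) {   // main constructor sets the throttle
        this.throttle = new Throttle(maxConnections);
    }

    ThrottledServer() {                     // alternate constructor must delegate,
        this(Integer.MAX_VALUE);            // otherwise throttle stays null and any
    }                                       // later accept() call throws an NPE
}
{code}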





[jira] [Created] (ZOOKEEPER-3269) Testable facade would benefit from a queueEvent() method

2019-02-03 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-3269:
---

 Summary: Testable facade would benefit from a queueEvent() method
 Key: ZOOKEEPER-3269
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3269
 Project: ZooKeeper
  Issue Type: New Feature
  Components: java client
Reporter: Jordan Zimmerman
 Fix For: 3.6.0


For testing and other reasons it would be very useful to add a way to inject an 
event into ZooKeeper's event queue. ZooKeeper already has the {{Testable}} facade 
for features such as this (low-level, backdoor, testing, etc.). This queueEvent 
method would be particularly helpful to Apache Curator and we'd very much 
appreciate its inclusion.

The method should have the signature:

{code}
void queueEvent(WatchedEvent event);
{code}

Calling this would have the effect of queueing an event into the client's event queue.
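
For example, a test could fake a session expiration without touching the 
server; a sketch assuming the proposed method is added to {{Testable}}:

{code:java}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooKeeper;

class WatchInjection {
    // assumes the proposed Testable#queueEvent(WatchedEvent) is available
    static void injectSessionExpiration(ZooKeeper zk) {
        WatchedEvent fakeExpiry = new WatchedEvent(EventType.None, KeeperState.Expired, null);
        zk.getTestable().queueEvent(fakeExpiry);   // delivered on the client's event thread
    }
}
{code}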





[jira] [Commented] (ZOOKEEPER-2963) standalone

2018-02-14 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364826#comment-16364826
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2963:
-

This is my favorite bug of all time.

> standalone
> --
>
> Key: ZOOKEEPER-2963
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2963
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: wu xiaoxue
>Assignee: maoling
>Priority: Major
>
> Today is Valentine's Day. I am still a single dog.
> When reading this line's code comment, I burst into tears.
> My New Year's resolution is girlfriend(s)





[jira] [Created] (ZOOKEEPER-2971) Create release notes for 3.5.4

2018-01-28 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2971:
---

 Summary: Create release notes for 3.5.4
 Key: ZOOKEEPER-2971
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2971
 Project: ZooKeeper
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Assignee: Patrick Hunt
 Fix For: 3.5.4


ZOOKEEPER-2901 and ZOOKEEPER-2903 fix a serious bug with TTL nodes in 3.5.3. 
The release notes for 3.5.4 should describe the problem and how it was 
worked-around/fixed.





[jira] [Commented] (ZOOKEEPER-984) jenkins failure in testSessionMoved - NPE in quorum

2018-01-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310264#comment-16310264
 ] 

Jordan Zimmerman commented on ZOOKEEPER-984:


FWIW - we just saw this in a 3.5.3 instance. 

> jenkins failure in testSessionMoved - NPE in quorum
> ---
>
> Key: ZOOKEEPER-984
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-984
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Patrick Hunt
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: consoleText16.txt
>
>
> Got the following NPE on my internal jenkins setup running against released 
> 3.3.2 (see attached log)
> {noformat}
> [junit] 2011-02-06 10:39:56,988 - WARN  
> [QuorumPeer:/0.0.0.0:11365:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,988 - INFO  [SyncThread:3:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,989 - WARN  
> [QuorumPeer:/0.0.0.0:11364:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,989 - INFO  [SyncThread:2:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,990 - WARN  
> [QuorumPeer:/0.0.0.0:11363:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,990 - INFO  [SyncThread:5:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,990 - WARN  
> [QuorumPeer:/0.0.0.0:11366:Follower@116] - Got zxid 0x10001 expected 0x1
> [junit] 2011-02-06 10:39:56,990 - INFO  [SyncThread:1:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,991 - INFO  [SyncThread:4:FileTxnLog@197] - 
> Creating new log file: log.10001
> [junit] 2011-02-06 10:39:56,995 - INFO  
> [main-SendThread(localhost.localdomain:11363):ClientCnxn$SendThread@738] - 
> Session establishment complete on server 
> localhost.localdomain/127.0.0.1:11363, sessionid = 0x12dfc45e6dd, 
> negotiated timeout = 3
> [junit] 2011-02-06 10:39:56,996 - INFO  
> [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
> 0x12dfc45e6dd with negotiated timeout 3 for client /127.0.0.1:37810
> [junit] 2011-02-06 10:39:56,999 - INFO  [main:ZooKeeper@436] - Initiating 
> client connection, connectString=127.0.0.1:11364 sessionTimeout=3 
> watcher=org.apache.zookeeper.test.QuorumTest$5@248523a0 
> sessionId=85001345146093568 sessionPasswd=
> [junit] 2011-02-06 10:39:57,000 - INFO  
> [main-SendThread():ClientCnxn$SendThread@1041] - Opening socket connection to 
> server /127.0.0.1:11364
> [junit] 2011-02-06 10:39:57,000 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:NIOServerCnxn$Factory@251] - 
> Accepted socket connection from /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,001 - INFO  
> [main-SendThread(localhost.localdomain:11364):ClientCnxn$SendThread@949] - 
> Socket connection established to localhost.localdomain/127.0.0.1:11364, 
> initiating session
> [junit] 2011-02-06 10:39:57,002 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:NIOServerCnxn@770] - Client 
> attempting to renew session 0x12dfc45e6dd at /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,002 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11364:Learner@95] - Revalidating 
> client: 85001345146093568
> [junit] 2011-02-06 10:39:57,003 - INFO  
> [QuorumPeer:/0.0.0.0:11364:NIOServerCnxn@1580] - Established session 
> 0x12dfc45e6dd with negotiated timeout 3 for client /127.0.0.1:36682
> [junit] 2011-02-06 10:39:57,004 - INFO  
> [main-SendThread(localhost.localdomain:11364):ClientCnxn$SendThread@738] - 
> Session establishment complete on server 
> localhost.localdomain/127.0.0.1:11364, sessionid = 0x12dfc45e6dd, 
> negotiated timeout = 3
> [junit] 2011-02-06 10:39:57,005 - WARN  
> [CommitProcessor:2:NIOServerCnxn@1524] - Unexpected exception. Destruction 
> averted.
> [junit] java.lang.NullPointerException
> [junit]   at 
> org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
> [junit]   at 
> org.apache.zookeeper.proto.SetDataResponse.serialize(SetDataResponse.java:40)
> [junit]   at 
> org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
> [junit]   at 
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1500)
> [junit]   at 
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
> [junit]   at 
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> [junit] Running org.apache.zookeeper.test.QuorumTest
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> [junit] Test 

[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-11-27 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267445#comment-16267445
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~phunt] - the ttlNodesEnabled flag now applies to standalone mode too. I ported 
this to ZOOKEEPER-2903 as well.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-11-07 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242349#comment-16242349
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~phunt] Anything else needed before this can be merged along with 
ZOOKEEPER-2903?


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211143#comment-16211143
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2921:
-

Fair point - I updated the description...

Do we need a new threshold value or is re-using {{fsyncWarningThresholdMS}} 
sufficient?
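
For illustration, re-using it would look roughly like this (illustrative only, 
not the actual FileTxnLog.commit() implementation):

{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

class CommitTiming {
    // Sketch: time the whole commit, not just each individual channel force.
    static void commit(List<FileOutputStream> streamsToFlush,
                       long fsyncWarningThresholdMS) throws IOException {
        long startMs = System.currentTimeMillis();
        for (FileOutputStream log : streamsToFlush) {
            log.flush();
            log.getChannel().force(false);   // the existing per-force warning sits here
        }
        long totalMs = System.currentTimeMillis() - startMs;
        if (totalMs > fsyncWarningThresholdMS) {
            System.err.println("Committing the transaction log took " + totalMs
                    + "ms, exceeding the warning threshold of "
                    + fsyncWarningThresholdMS + "ms");
        }
    }
}
{code}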

> fsyncWarningThresholdMS is applied on each getChannel().force() - also needed 
> on entire commit
> --
>
> Key: ZOOKEEPER-2921
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Priority: Minor
>
> FileTxnLog.commit() has a warning when an individual sync takes longer than 
> {{fsyncWarningThresholdMS}}. However, it would also be useful to warn when 
> the entire commit operation takes longer than {{fsyncWarningThresholdMS}} as 
> this can cause client connection failures. Currently, commit() can take 
> longer than 2/3 of a session but still not log a warning.





[jira] [Updated] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2921:

Description: FileTxnLog.commit() has a warning when an individual sync 
takes longer than {{fsyncWarningThresholdMS}}. However, it would also be useful 
to warn when the entire commit operation takes longer than 
{{fsyncWarningThresholdMS}} as this can cause client connection failures. 
Currently, commit() can take longer than 2/3 of a session but still not log a 
warning.  (was: FileTxnLog.commit() has a warning when an individual sync takes 
longer than {{fsyncWarningThresholdMS}}. However, it would be more useful to 
warn when the entire commit operation takes longer than 
{{fsyncWarningThresholdMS}} as this can cause client connection failures. 
Currently, commit() can take longer than 2/3 of a session but still not log a 
warning.)

> fsyncWarningThresholdMS is applied on each getChannel().force() - also needed 
> on entire commit
> --
>
> Key: ZOOKEEPER-2921
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Priority: Minor
>
> FileTxnLog.commit() has a warning when an individual sync takes longer than 
> {{fsyncWarningThresholdMS}}. However, it would also be useful to warn when 
> the entire commit operation takes longer than {{fsyncWarningThresholdMS}} as 
> this can cause client connection failures. Currently, commit() can take 
> longer than 2/3 of a session but still not log a warning.





[jira] [Created] (ZOOKEEPER-2921) fsyncWarningThresholdMS is applied on each getChannel().force() - also needed on entire commit

2017-10-19 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2921:
---

 Summary: fsyncWarningThresholdMS is applied on each 
getChannel().force() - also needed on entire commit
 Key: ZOOKEEPER-2921
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2921
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Priority: Minor


FileTxnLog.commit() has a warning when an individual sync takes longer than 
{{fsyncWarningThresholdMS}}. However, it would be more useful to warn when the 
entire commit operation takes longer than {{fsyncWarningThresholdMS}} as this 
can cause client connection failures. Currently, commit() can take longer than 
2/3 of a session but still not log a warning.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207880#comment-16207880
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

derp - fixed

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207580#comment-16207580
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 10/17/17 4:34 PM:
---

Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Server IDs up to 255 unless you want TTLs, in which case the limit is 254.


was (Author: randgalt):
Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Session IDs up to 255 unless you want TTLS then it's 254.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-17 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207580#comment-16207580
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Default is set back to false. I really don't think we need to deprecate the 
high bit as it's now documented and you have to opt in to it. So, keep using 
Session IDs up to 255 unless you want TTLS then it's 254.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2503) Inconsistency between myid documentation and implementation

2017-10-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189920#comment-16189920
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2503:
-

FYI - Please consider ZOOKEEPER-2901 when making this change. 

> Inconsistency between myid documentation and implementation
> ---
>
> Key: ZOOKEEPER-2503
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2503
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.9, 3.5.2
>Reporter: Michael Han
> Fix For: 3.5.4, 3.6.0, 3.4.11
>
>
> In ZK documentation, we have:
> "The myid file consists of a single line containing only the text of that 
> machine's id. So myid of server 1 would contain the text "1" and nothing 
> else. The id must be unique within the ensemble and should have a value 
> between 1 and 255."
> This however is not enforced in code, which should be fixed either in 
> documentation that we remove the restriction of the range 1-255 or in code we 
> enforce such constraint.
> Discussion thread:
> http://zookeeper-user.578899.n2.nabble.com/Is-myid-actually-limited-to-1-255-td7581270.html





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-10-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189858#comment-16189858
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~mjohnson207] - no. It wasn't checked before and there's already an issue for 
this: https://issues.apache.org/jira/browse/ZOOKEEPER-2503

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Updated] (ZOOKEEPER-2907) Logged request buffer isn't useful

2017-09-28 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2907:

Description: 
There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.

PrepRequestProcessor#pRequest() and FinalRequestProcessor#processRequest()

  was:
There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.


> Logged request buffer isn't useful
> --
>
> Key: ZOOKEEPER-2907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2907
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.10, 3.5.3
>Reporter: Jordan Zimmerman
>Priority: Minor
>
> There are two places in the server code that log request errors with a 
> message ala "Dumping request buffer..." followed by a hex dump of the request 
> buffer. There are 2 major problems with this output:
> # The request type is not output
> # The byte-to-hex inline code doesn't pad numbers < 16
> These two combine to make the output data nearly useless.
> PrepRequestProcessor#pRequest() and FinalRequestProcessor#processRequest()





[jira] [Created] (ZOOKEEPER-2907) Logged request buffer isn't useful

2017-09-28 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2907:
---

 Summary: Logged request buffer isn't useful
 Key: ZOOKEEPER-2907
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2907
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.3, 3.4.10
Reporter: Jordan Zimmerman
Priority: Minor


There are two places in the server code that log request errors with a message 
ala "Dumping request buffer..." followed by a hex dump of the request buffer. 
There are 2 major problems with this output:

# The request type is not output
# The byte-to-hex inline code doesn't pad numbers < 16

These two combine to make the output data nearly useless.
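
For reference, the padding problem goes away with a fixed-width format; a 
minimal sketch (not the server's actual dump code):

{code:java}
class HexDump {
    // Zero-pad each byte so values below 0x10 keep their column:
    // 0x0a prints as "0a" rather than "a".
    static String dump(byte[] requestBuffer) {
        StringBuilder sb = new StringBuilder();
        for (byte b : requestBuffer) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }
}
{code}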





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-22 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177575#comment-16177575
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

I'm happy to switch the default to true. I was being cautious. Can we get to 
consensus?

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Created] (ZOOKEEPER-2903) Port ZOOKEEPER-2901 to 3.5.4

2017-09-22 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2903:
---

 Summary: Port ZOOKEEPER-2901 to 3.5.4
 Key: ZOOKEEPER-2903
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2903
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: server
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman
Priority: Blocker
 Fix For: 3.5.4


The TTL/Server ID bug is quite serious and should be back-ported to the 3.5.x 
branch





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175451#comment-16175451
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

There is another option here. We update the documentation to say that if you're 
going to use container and TTL nodes then your server ID must be <= 127. Thoughts?

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175342#comment-16175342
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Update - after researching this further, the exposure from Container Nodes 
doesn't exist. Container Nodes are denoted by an ephemeralOwner of 
{{Long.MIN_VALUE}}, and there can never be a session ID with this value, so we're 
safe. Thus, the only exposure is for TTL nodes. I'm still researching and will 
continue to report back here.
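
A tiny illustration of the point above, assuming the {{EphemeralType}} logic 
quoted in this issue (where the container sentinel is {{Long.MIN_VALUE}}) and 
the session ID layout from {{initializeNextSession()}} quoted later in this 
thread; {{System.currentTimeMillis()}} stands in for the time source:

{code:java}
import org.apache.zookeeper.server.EphemeralType;

class ContainerOwnerCheck {
    public static void main(String[] args) {
        // Container nodes use the Long.MIN_VALUE sentinel as their ephemeralOwner.
        System.out.println(EphemeralType.get(Long.MIN_VALUE));   // CONTAINER

        // A session ID is (serverId << 56) OR'd with time-derived lower bits,
        // so it can never equal the sentinel, whose low 56 bits are all zero.
        long serverId = 1;
        long sessionId = (serverId << 56) | ((System.currentTimeMillis() << 24) >>> 8);
        System.out.println(EphemeralType.get(sessionId));        // NORMAL
    }
}
{code}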

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Issue Comment Deleted] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2901:

Comment: was deleted

(was: The fix for this is straightforward. The hard part is backward 
compatibility:

* End users have data files with potentially corrupted data
** If they've used a ServerId > 127 with ZK versions 3.5.1+
** If they've used a ServerId > 63 with ZK version 3.5.3
* ContainerManager will treat ephemeral nodes created by servers with the bad 
Server IDs as container or TTL nodes. 

The fix created here _must_ expire these sessions so that they don't cause 
problems. The tricky part is how to do this. We need a way to identify old 
session IDs and new ones. We _could_ bump the {{FileTxnLog.VERSION}} but that 
would also be tricky to do in a backward compatible way. I'd appreciate ideas 
here.)

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175282#comment-16175282
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing `FileTxnLog.VERSION`. Every file (snapshot and transaction) has a 
header that maps to `FileHeader.java`. The field `dbId` isn't really used for 
anything. For snapshots it's -1 and transactions it's 0. So, we can easily use 
this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?
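
A rough sketch of the idea (the constants and helper are hypothetical, not a 
committed change), using the dbId read from a file's FileHeader to decide 
whether the sessions it references predate the fix:

{code:java}
class FileVersionCheck {
    static final long PRE_FIX_SNAPSHOT_DB_ID = -1;   // what 3.5.3-era snapshots carry
    static final long PRE_FIX_TXNLOG_DB_ID = 0;      // what 3.5.3-era txn logs carry

    /** dbId as read from the file's FileHeader record */
    static boolean sessionsNeedInvalidation(long dbId) {
        // Pre-fix files keep the old values; post-fix files would be stamped
        // with new ones (e.g. 1 for snapshots, 2 for transaction logs).
        return dbId == PRE_FIX_SNAPSHOT_DB_ID || dbId == PRE_FIX_TXNLOG_DB_ID;
    }
}
{code}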

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175282#comment-16175282
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/21/17 6:55 PM:
--

Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing {{FileTxnLog.VERSION}}. Every file (snapshot and transaction) has a 
header that maps to {{FileHeader.java}}. The field {{dbId}} isn't really used 
for anything. For snapshots it's -1 and transactions it's 0. So, we can easily 
use this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?


was (Author: randgalt):
Update: I believe I have a mechanism to identify pre 3.5.4 files without 
changing `FileTxnLog.VERSION`. Every file (snapshot and transaction) has a 
header that maps to `FileHeader.java`. The field `dbId` isn't really used for 
anything. For snapshots it's -1 and transactions it's 0. So, we can easily use 
this. For post 3.5.3 files we can make dbId 1 for snapshots and 2 for 
transactions (or whatever). When loading older files, we can invalidate any 
sessions.

Question:

What is the best way to invalidate sessions when loading transactions and 
snapshot files?

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175108#comment-16175108
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

The fix for this is straightforward. The hard part is backward compatibility:

* End users have data files with potentially corrupted data
** If they've used a ServerId > 127 with ZK versions 3.5.1+
** If they've used a ServerId > 63 with ZK version 3.5.3
* ContainerManager will treat ephemeral nodes created by servers with the bad 
Server IDs as container or TTL nodes. 

The fix created here _must_ expire these sessions so that they don't cause 
problems. The tricky part is how to do this. We need a way to identify old 
session IDs and new ones. We _could_ bump the {{FileTxnLog.VERSION}} but that 
would also be tricky to do in a backward compatible way. I'd appreciate ideas 
here.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Resolved] (ZOOKEEPER-2902) Exhibitor

2017-09-21 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman resolved ZOOKEEPER-2902.
-
Resolution: Invalid

[~ANH] As we said when you opened the previous issue, Exhibitor is not related 
to Apache ZooKeeper. Please stop opening issues for it here.

> Exhibitor
> -
>
> Key: ZOOKEEPER-2902
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2902
> Project: ZooKeeper
>  Issue Type: Test
> Environment: Ubuntu 16.04
>Reporter: ANH
>
> Any one can help me in configuring exhibitor other than giving this link 
> https://github.com/soabase/exhibitor ?? 
> Extremely sorry to raise tickets related to Exhibitor.  





[jira] [Comment Edited] (ZOOKEEPER-2902) Exhibitor

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174810#comment-16174810
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2902 at 9/21/17 2:06 PM:
--

[~ANH] As we said when you opened the previous issue, the Exhibitor project is 
not related to Apache ZooKeeper. Please stop opening issues for it here.


was (Author: randgalt):
[~ANH] as we said when you opened the previous issue. The Exhibitor is not 
related to Apache ZooKeeper. Please stop opening issues for it here.

> Exhibitor
> -
>
> Key: ZOOKEEPER-2902
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2902
> Project: ZooKeeper
>  Issue Type: Test
> Environment: Ubuntu 16.04
>Reporter: ANH
>
> Any one can help me in configuring exhibitor other than giving this link 
> https://github.com/soabase/exhibitor ?? 
> Extremely sorry to raise tickets related to Exhibitor.  





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174792#comment-16174792
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/21/17 1:57 PM:
--

[~mjohnson207] - I think that can work. The hard part is deciding what to do 
about existing sessions when the new server loads. I think the only choice is 
to somehow invalidate those sessions. We need to do this because of this code 
in LeaderSessionTracker.java - which I don't understand TBH

{code}
/*
 * if local session is not enabled or it used to be our local session
 * throw sessions expires
 */
if (!localSessionsEnabled
|| (getServerIdFromSessionId(sessionId) == serverId)) {
throw new SessionExpiredException();
}
{code}

It's the only place in the code where the ServerId from the session ID is used.


was (Author: randgalt):
[~mjohnson207] - I think that can work. The hard part is deciding what to do 
about existing sessions when the new server loads. I think the only choice is 
to somehow invalidate those sessions. We need to do this because of this code 
in LeaderSessionTracker.java - which I don't understand TBH

{code}
/*
 * if local session is not enabled or it used to be our local session
 * throw sessions expires
 */
if (!localSessionsEnabled
|| (getServerIdFromSessionId(sessionId) == serverId)) {
throw new SessionExpiredException();
}
{code}

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-21 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174792#comment-16174792
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

[~mjohnson207] - I think that can work. The hard part is deciding what to do 
about existing sessions when the new server loads. I think the only choice is 
to somehow invalidate those sessions. We need to do this because of this code 
in LeaderSessionTracker.java - which I don't understand TBH

{code}
/*
 * if local session is not enabled or it used to be our local session
 * throw sessions expires
 */
if (!localSessionsEnabled
|| (getServerIdFromSessionId(sessionId) == serverId)) {
throw new SessionExpiredException();
}
{code}

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Issue Comment Deleted] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2901:

Comment: was deleted

(was: Note: This feature is only used by LeaderZooKeeperServer so it only 
applies to voting members, FYI)

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173808#comment-16173808
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

Note: This feature is only used by LeaderZooKeeperServer so it only applies to 
voting members, FYI

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173758#comment-16173758
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/20/17 8:11 PM:
--

There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes have the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:


{code:java}
// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static final long EPHEMERAL_MASK = 0x3FFFFFFFFFFFFFFFL;

public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}

{code}

_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.


was (Author: randgalt):
There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes have the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:


{code:java}
// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}

{code}

_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.





[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173758#comment-16173758
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/20/17 8:09 PM:
--

There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes has the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:


{code:java}
// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}

{code}

_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.


was (Author: randgalt):
There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes has the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:

{{// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}
}}
_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173758#comment-16173758
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2901 at 9/20/17 8:09 PM:
--

There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes has the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:

{{// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}
}}
_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.


was (Author: randgalt):
There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes has the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:

{{code}}
// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}
{{code}}

_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173758#comment-16173758
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

There are a number of possibilities to address this:

* Remove the feature completely
* Quick fix now, larger fix later
* Find another way to denote container/TTL

The problem is that ServerIDs > 63 will now appear to be TTL nodes (server IDs 
>= 255 will appear to be container nodes). 

Commentary:

_Remove the Feature completely_

I don't see how we can do this without breaking existing clients. Even if we 
remove TTLs, Container nodes have been out there for over a year (or more?). 
Container nodes has the same problem.

_Quick fix now, larger fix later_

The quick fix is to mask the 2 high bits of the Server ID when seeding the 
session ID. This has major implications for how the ServerID is chosen. But, 
this is beyond my knowledge. The way the Server ID is/was stored prior to 
TTL/Container nodes implied that the ServerID had to have unique bits across 
the ensemble. I need other committers to comment on this. To be clear, this is 
the change:

{{code}}
// in SessionTrackerImpl#initializeNextSession()

public static long initializeNextSession(long id) {
long nextSid;
nextSid = (Time.currentElapsedTime() << 24) >>> 8;
nextSid =  nextSid | (id <<56);
return EphemeralType.maskSessionId(nextSid);
}

// in EphemeralType
public static long maskSessionId(long id) {
return id & EPHEMERAL_MASK;
}
{{code}}

_Find another way to denote container/TTL_

I need ideas here. Not sure how to handle this in a backward compatible way.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2901:

Priority: Blocker  (was: Major)

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173704#comment-16173704
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2901:
-

The "client" ID is the same as the session ID. The session ID is an 
incrementing number. However, it appears that the "High order byte is 
serverId". Holy cow! How did this get by? This is major.

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-09-20 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman reassigned ZOOKEEPER-2901:
---

Assignee: Jordan Zimmerman

> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (ZOOKEEPER-2900) Error: Could not find or load main class com.netflix.exhibitor.application.ExhibitorMain

2017-09-18 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman resolved ZOOKEEPER-2900.
-
Resolution: Invalid

[~ANH] This is the Apache ZooKeeper issues database. Exhibitor is a separate 
project from Apache ZooKeeper. https://github.com/soabase/exhibitor

> Error: Could not find or load main class 
> com.netflix.exhibitor.application.ExhibitorMain
> 
>
> Key: ZOOKEEPER-2900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2900
> Project: ZooKeeper
>  Issue Type: Task
> Environment: Ubuntu Server 16.04 LTS
>Reporter: ANH
>
> i am trying to set up an Exhibitor in an ubuntu server using Gradle . The 
> reference link is mentioned below.
> 1) https://blog.imaginea.com/monitoring-zookeeper-with-exhibitor/
> 2) https://github.com/soabase/exhibitor/wiki/Running-Exhibitor
> java -jar /home/ubuntu/gradle/build/libs/exhibitor-1.6.0.jar -C file-- this 
> commands results in an error { Error: Could not find or load main class 
> com.netflix.exhibitor.application.ExhibitorMain }
> What will be the solution ? 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-08-18 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133267#comment-16133267
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

I ran a new benchmark (FYI - I'll make my benchmark project available in Sept. 
- I have limited internet until then). This benchmark does the following:

* Creates 10,000 ZNodes of random path length
* During the test, it randomly deletes, creates, or updates the nodes
* On each iteration of the test, it waits for the Curator cache to notice the 
change

I ran the benchmark with the existing TreeCache and with the new "CuratorCache" 
which takes advantage of PersistentRecursiveWatchers. Here are the ops/sec:

* TreeCache: approx 1005 ops/sec
* CuratorCache: approx 2563 ops/sec

That's a HUGE improvement (more than 2x). Not to mention that the CuratorCache 
in this instance requires only ONE watcher, not ~20,000 as TreeCache requires. 
For every ZNode deletion, TreeCache must call "exists()" to reset the watcher 
and CuratorCache doesn't need to make any ZK calls. For every ZNode creation, 
TreeCache must call both "getChildren()" and "getData()" while CuratorCache 
only needs to call "getData()".
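
For context (an illustrative addition, not part of the benchmark): this is roughly what the single-watcher registration looks like with an addWatch-style client API of the kind this patch proposes; the method and mode names follow the API that later shipped in the client and should be treated as assumptions in the context of this discussion.

{code:java}
// Sketch: one persistent recursive watch covers the whole subtree, so the cache
// never has to call exists()/getChildren() just to re-arm per-znode watches.
import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveWatchSketch {
    public static void watchTree(ZooKeeper zk, String root) throws Exception {
        zk.addWatch(root, event -> {
            // a single callback receives create/update/delete events for root
            // and every descendant, and it stays registered after each event
            System.out.println(event.getType() + " " + event.getPath());
        }, AddWatchMode.PERSISTENT_RECURSIVE);
    }
}
{code}
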

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like a auto-watch registrar on the server side. 
> Setting a  Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive Watch 
> automically apply the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that needs to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> ram distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch 
> setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads and at 
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of no intermediate watch event until read guarantee 
> is required. If the server can send watch events regardless of one has 
> already been fired without corresponding read, then the server can simply 
> fire watch events without tracking.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-08-16 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16129317#comment-16129317
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

Regarding the performance numbers above... They should be balanced by the 
enormous effort Curator's TreeCache class goes through to emulate 
Persistent/Recursive watches (which is essentially what it does). I argue that 
this change will be much more performant and efficient than what TreeCache is 
doing now.

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like a auto-watch registrar on the server side. 
> Setting a  Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive Watch 
> automically apply the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that needs to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> ram distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch 
> setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads and at 
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of no intermediate watch event until read guarantee 
> is required. If the server can send watch events regardless of one has 
> already been fired without corresponding read, then the server can simply 
> fire watch events without tracking.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-08-16 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16129128#comment-16129128
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

FYI - I did some micro benchmarking with jmh. The test iterates using 
PathParentIterator over a set of paths. The paths are:

{code}
"/a",
"/a/b",
"/a/b/c",
"/a really long path",
"/a really long path/with more than stuff",
"/a really long path/with more than stuff/and more",
"/a really long path/with more than stuff/and more/and more",
"/a really long path/with more than stuff/and more/and more/and more"
{code}

I did a test using {{PathParentIterator.forPathOnly()}} as a baseline and then 
with {{PathParentIterator.forAll()}}. Results:

* forPathOnly - avg 47,627,862 ops/s
* forAll - avg 22,677,073 ops/s

So that's a significant difference, but 22+ million ops per second still seems 
reasonable to me. I'd be curious what others think. FYI - I played around with 
optimizing PathParentIterator but haven't yet found a way to make it faster. 
Maybe we can just document that using Persistent watches can slightly slow 
overall server performance. Or is this a showstopper? In my view, the small 
performance hit is worth the feature. Importantly, the feature is optimized so 
that those who don't use it don't pay the performance penalty.
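
For readers unfamiliar with the class (an illustrative addition, not the actual ZooKeeper code): in the forAll() case, the iterator's job is essentially to walk a path and each of its ancestors so every level can be checked for a persistent/recursive watch. A simplified stand-in is sketched below; forPathOnly() presumably visits just the path itself, which is why it serves as the cheaper baseline.

{code:java}
// Simplified stand-in for the idea benchmarked above: yield the path itself and
// then each parent up to "/". NOT the real PathParentIterator implementation.
import java.util.ArrayList;
import java.util.List;

public class ParentPathsSketch {
    static List<String> pathAndParents(String path) {
        List<String> result = new ArrayList<>();
        String current = path;
        while (true) {
            result.add(current);
            if (current.equals("/")) {
                break;
            }
            int lastSlash = current.lastIndexOf('/');
            current = (lastSlash == 0) ? "/" : current.substring(0, lastSlash);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pathAndParents("/a/b/c")); // [/a/b/c, /a/b, /a, /]
    }
}
{code}
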

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like a auto-watch registrar on the server side. 
> Setting a  Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive Watch 
> automically apply the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that needs to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> ram distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch 
> setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads and at 
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of no intermediate watch event until read guarantee 
> is required. If the server can send watch events regardless of one has 
> already been fired without corresponding read, then the server can simply 
> fire watch events without tracking.

[jira] [Commented] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x

2017-08-09 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121049#comment-16121049
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2871:
-

https://github.com/apache/zookeeper/pull/332

> Port ZOOKEEPER-1416 to 3.5.x
> 
>
> Key: ZOOKEEPER-2871
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, documentation, java client, server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4
>
>
> Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x

2017-08-09 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman reassigned ZOOKEEPER-2871:
---

Assignee: Jordan Zimmerman

> Port ZOOKEEPER-1416 to 3.5.x
> 
>
> Key: ZOOKEEPER-2871
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, documentation, java client, server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4
>
>
> Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x

2017-08-09 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2871:
---

 Summary: Port ZOOKEEPER-1416 to 3.5.x
 Key: ZOOKEEPER-2871
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: c client, documentation, java client, server
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
 Fix For: 3.5.4


Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (ZOOKEEPER-2648) Container node never gets deleted if it never had children

2017-08-01 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman reassigned ZOOKEEPER-2648:
---

Assignee: (was: Jordan Zimmerman)

> Container node never gets deleted if it never had children
> --
>
> Key: ZOOKEEPER-2648
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2648
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Hadriel Kaplan
>
> If a client creates a Container node, but does not also create a child within 
> that Container, the Container will never be deleted. This may seem like a bug 
> in the client for not subsequently creating a child, but we can't assume the 
> client remains connected, or that the client didn't just change its mind (due 
> to some recipe being canceled, for example).
> The bug is in ContainerManager.getCandidates(), which only considers a node a 
> candidate if its Cversion > 0. The comments indicate this was done 
> intentionally, to avoid a race condition whereby the Container was created 
> right before a cleaning period, and would get cleaned up before the child 
> could be created - so to avoid that the check is performed to verify the 
> Cversion > 0.
> Instead, I propose that if the Cversion is 0 but the Ctime is more than a 
> checkIntervalMs old, then it be deleted. In other words, if the Container 
> node has been around for a whole cleaning round already and no child has been 
>  created since, then go ahead and clean it up.
> I can provide a patch if others agree with such a change.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (ZOOKEEPER-2648) Container node never gets deleted if it never had children

2017-08-01 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman reassigned ZOOKEEPER-2648:
---

Assignee: Jordan Zimmerman

> Container node never gets deleted if it never had children
> --
>
> Key: ZOOKEEPER-2648
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2648
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Hadriel Kaplan
>Assignee: Jordan Zimmerman
>
> If a client creates a Container node, but does not also create a child within 
> that Container, the Container will never be deleted. This may seem like a bug 
> in the client for not subsequently creating a child, but we can't assume the 
> client remains connected, or that the client didn't just change its mind (due 
> to some recipe being canceled, for example).
> The bug is in ContainerManager.getCandidates(), which only considers a node a 
> candidate if its Cversion > 0. The comments indicate this was done 
> intentionally, to avoid a race condition whereby the Container was created 
> right before a cleaning period, and would get cleaned up before the child 
> could be created - so to avoid that the check is performed to verify the 
> Cversion > 0.
> Instead, I propose that if the Cversion is 0 but the Ctime is more than a 
> checkIntervalMs old, then it be deleted. In other words, if the Container 
> node has been around for a whole cleaning round already and no child has been 
>  created since, then go ahead and clean it up.
> I can provide a patch if others agree with such a change.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-07-31 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107349#comment-16107349
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

[~hanm] This issue now has 13 votes - what can I do to get this merged? We had 
an issue over the weekend where the number of watches grew in the millions. 
This patch would make that situation never happen.

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like a auto-watch registrar on the server side. 
> Setting a  Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive Watch 
> automically apply the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that needs to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> ram distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch 
> setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads and at 
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of no intermediate watch event until read guarantee 
> is required. If the server can send watch events regardless of one has 
> already been fired without corresponding read, then the server can simply 
> fire watch events without tracking.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-08 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079263#comment-16079263
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

I think preventing deleteContainer from clients is the best bet. We could even 
have a class of opcodes that are marked "internal only".

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-07 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078251#comment-16078251
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2591 at 7/7/17 3:40 PM:
-

[~Bhupendra] - I don't understand how that would work. Any field that 
ContainerManager adds to the Request object could also be added by a rogue 
client. Can you give an example of how this would work?

Another possibility is to somehow disallow OpCode.deleteContainer coming from a 
connected client.


was (Author: randgalt):
[~Bhupendra] - I don't understand how that would work. Any field that 
ContainerManager adds to the Request object could also be added by a rogue 
client. Can you give an example of how this would work?

Another possibility is to someone disallow OpCode.deleteContainer coming from a 
connected client.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-07 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078251#comment-16078251
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2591 at 7/7/17 3:36 PM:
-

[~Bhupendra] - I don't understand how that would work. Any field that 
ContainerManager adds to the Request object could also be added by a rogue 
client. Can you give an example of how this would work?

Another possibility is to someone disallow OpCode.deleteContainer coming from a 
connected client.


was (Author: randgalt):
[~ Bhupendra] - I don't understand how that would work. Any field that 
ContainerManager adds to the Request object could also be added by a rogue 
client. Can you give an example of how this would work?

Another possibility is to someone disallow OpCode.deleteContainer coming from a 
connected client.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-07 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078251#comment-16078251
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

[~ Bhupendra] - I don't understand how that would work. Any field that 
ContainerManager adds to the Request object could also be added by a rogue 
client. Can you give an example of how this would work?

Another possibility is to someone disallow OpCode.deleteContainer coming from a 
connected client.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072606#comment-16072606
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

container deletion, itself, is different yet. But, my point is that ZooKeeper 
clients expect containers to disappear so there's no real security risk. The 
only edge case I can see is a rogue client quickly deleting a container. We can 
fix that edge case by applying the logic as I describe above.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072606#comment-16072606
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2591 at 7/3/17 3:12 PM:
-

container deletion, itself, is different yes. But, my point is that ZooKeeper 
clients expect containers to disappear so there's no real security risk. The 
only edge case I can see is a rogue client quickly deleting a container. We can 
fix that edge case by applying the logic as I describe above.


was (Author: randgalt):
container deletion, itself, is different yet. But, my point is that ZooKeeper 
clients expect containers to disappear so there's no real security risk. The 
only edge case I can see is a rogue client quickly deleting a container. We can 
fix that edge case by applying the logic as I describe above.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-03 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072539#comment-16072539
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

[~eribeiro] - I don't follow. The container node is created with an ACL. It 
uses the same create() method as normal node creation. A rogue client cannot 
delete child nodes without proper Auth.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-07-01 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071397#comment-16071397
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

That's an extreme edge case, but it is possible. We can prevent it by 
enforcing the container check of "node.stat.getCversion() > 0" - that would be 
a lot easier than adding an ACL check in PrepRequestProcessor's handling of 
OpCode.deleteContainer.
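
To make the suggested guard concrete (an illustrative sketch with simplified names, not the actual server code): only containers that have had at least one child and currently have none would be considered deletable.

{code:java}
// Sketch of the "cversion > 0" guard discussed above.
public class ContainerDeleteGuard {
    static boolean okToDeleteContainer(int cversion, int numChildren) {
        // cversion == 0   -> the container never had children; a client may still be about to use it
        // numChildren > 0 -> never delete a container that still has children
        return cversion > 0 && numChildren == 0;
    }

    public static void main(String[] args) {
        System.out.println(okToDeleteContainer(0, 0)); // false
        System.out.println(okToDeleteContainer(3, 0)); // true
        System.out.println(okToDeleteContainer(3, 2)); // false
    }
}
{code}
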

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-06-30 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070201#comment-16070201
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

Yeah, I guess that could happen. IMO it isn't a big deal. ZooKeeper 
applications are expecting these nodes to disappear after a while. The server 
only deletes the node if it has no children. 

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-06-29 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068397#comment-16068397
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2591 at 6/29/17 2:17 PM:
--

[~Bhupendra] If DeleteContainer had a client API then ACL would make sense. 
But, the automatic version has no client associated with the operation and 
therefore there is no ACL/Auth to apply.

Note: you _can_ delete containers from the client via the normal delete() 
command, and the ACL is respected.


was (Author: randgalt):
If DeleteContainer had a client [~Bhupendra] API then ACL would make sense. 
But, the automatic version has no client associated with the operation and 
therefore there is no ACL/Auth to apply.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
>  the ACL rights. The code below succeeds even tough we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission

2017-06-29 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068397#comment-16068397
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2591:
-

If DeleteContainer had a client [~Bhupendra] API then ACL would make sense. 
But, the automatic version has no client associated with the operation and 
therefore there is no ACL/Auth to apply.

> The deletion of Container znode doesn't check ACL delete permission
> ---
>
> Key: ZOOKEEPER-2591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security, server
>Reporter: Edward Ribeiro
>Assignee: Edward Ribeiro
>
> Container nodes check the ACL before creation, but the deletion doesn't check 
> the ACL rights. The code below succeeds even though we removed ACL access 
> permissions for "/a".
> {code}
> zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
> ArrayList<ACL> list = new ArrayList<>();
> list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE));
> zk.setACL("/", list, -1);
> zk.delete("/a", -1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-06-01 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033362#comment-16033362
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

I don't mind changing this PR to be {{defaultACLForReconfig}} instead of 
{{skipDefaultACLForReconfig}}, where {{defaultACLForReconfig}} is a number (the 
ACL bitset).
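
For reference, such a bitset would just be an OR of the {{ZooDefs.Perms}} flags. A 
minimal sketch of what the setting could express ({{defaultACLForReconfig}} is only 
the name proposed in this PR, not an existing property):

{code}
import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooDefs.Perms;
import org.apache.zookeeper.data.ACL;

public class ReconfigAclBitsetSketch {
    public static void main(String[] args) {
        // The "number" is just an OR of the standard permission bits
        // (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16).
        int perms = Perms.READ | Perms.WRITE | Perms.ADMIN; // 1 | 2 | 16 = 19

        // Combined with an Id, that bitset is all that is needed to build the ACL
        // that would be applied to /zookeeper/config instead of READ_ACL_UNSAFE.
        List<ACL> configAcl = Collections.singletonList(new ACL(perms, Ids.ANYONE_ID_UNSAFE));

        System.out.println("defaultACLForReconfig=" + perms + " -> " + configAcl);
    }
}
{code}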

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-06-01 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033072#comment-16033072
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2779 at 6/1/17 2:32 PM:
-

[~fpj] I'm OK with either. But, this works best for us - i.e. restoring the old 
behavior (save the reasonable reconfigEnabled setting in zoo.cfg). The hack - 
from my view - is requiring that your entire ZooKeeper ensemble be open to a 
root-style single password (that is discoverable via a simple {{ps ax | grep 
java}}) in order to use the reconfig() APIs.


was (Author: randgalt):
[~fpj] I'm OK with either. But, this works best for us - essentially restoring 
the old behavior. The hack - from my view - is requiring that your entire 
ZooKeeper ensemble be open to a root-style single password (that is 
discoverable via a simple {{ps ax | grep java}}) in order to use the reconfig() 
APIs.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-06-01 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033072#comment-16033072
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

[~fpj] I'm OK with either. But, this works best for us - essentially restoring 
the old behavior. The hack - from my view - is requiring that your entire 
ZooKeeper ensemble be open to a root-style single password (that is 
discoverable via a simple {{ps ax | grep java}}) in order to use the reconfig() 
APIs.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-30 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029769#comment-16029769
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

lol - we're going in circles. Your statement about "online" vs "offline" 
doesn't hold water. Forcing "super user" to make reconfig changes was - pardon 
my frankness - a very bad plan. It is a step backwards in security. The older 
method is much more secure when handled properly. Anyway, if you'll merge this 
change it would make us very happy.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-30 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029752#comment-16029752
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

I keep feeling like I'm not being understood or that I'm missing something 
huge. Please look at PrepRequestProcessor.java's handling of 
{{OpCode.reconfig}}:

{code}
nodeRecord = getRecordForPath(ZooDefs.CONFIG_NODE);
checkACL(zks, nodeRecord.acl, ZooDefs.Perms.WRITE, request.authInfo);
{code}

Unless the ACL for "/zookeeper/config" is changed from read-only, reconfiguration 
will fail. What am I missing here?
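
To spell out the catch: until that read-only ACL on {{ZooDefs.CONFIG_NODE}} is 
replaced, the WRITE check above fails. A rough sketch of what a client has to do 
first (credentials are placeholders, and performing this setACL itself currently 
requires "super" auth, which is the whole complaint):

{code}
import java.util.Collections;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooDefs.Perms;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class ReconfigAclSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });

        // Authenticate as the principal that the new ACL will name.
        zk.addAuthInfo("digest", "admin:admin-secret".getBytes());

        // Replace the hard-coded read-only ACL on /zookeeper/config so that the
        // WRITE check in PrepRequestProcessor can pass for this principal.
        // (The id string is the usual "user:BASE64(SHA1(user:password))" digest form.)
        ACL writable = new ACL(Perms.READ | Perms.WRITE | Perms.ADMIN,
                new Id("digest", "admin:<hashed-password>"));
        zk.setACL(ZooDefs.CONFIG_NODE, Collections.singletonList(writable), -1);

        // Only after this does reconfig() get past the ACL check quoted above.
        zk.close();
    }
}
{code}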

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-30 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029732#comment-16029732
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

[~hanm] - The reconfig() APIs are useless until you set ACLs for 
/zookeeper/config. This is the "online" part. You cannot do any reconfig() 
operation until the read-only ACL has been changed.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-30 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029699#comment-16029699
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

How is the current mechanism "offline"? The ZooKeeper server must be running 
and you have to set the ACLs with a client. That's online. Offline would mean 
that it could be done entirely from configuration - which would be great. But, 
that's not what we have in 3.5.3.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-26 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027002#comment-16027002
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

In our use case, we have an automated ZooKeeper installation that runs without 
human intervention. We need to be able to set proper ACLs on the reconfig nodes 
and also use the reconfig() APIs. In 3.5.3, that is impossible. Reconfig is 
disabled unless you turn on super-user mode and fix the ACLs on 
{{/zookeeper/config}}. In our use case (and I imagine a great many others), at 
installation time there is zero risk - we know what we're doing. This is not 
running on an untrusted network. Forcing users to jump through hoops to use 
reconfig is strange - I still don't understand it.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-25 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024785#comment-16024785
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

For now, we've had to work around this by pre-building a small snapshot file 
with the limiting ACL removed. When we build our ZooKeeper images we place 
this file in the data dir. Horrid hack, but it works. It would be much better 
to have this PR merged.

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-15 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011746#comment-16011746
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2779:
-

Ping - any feedback on this?

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-09 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2779:

Description: ZOOKEEPER-2014 changed the behavior of the /zookeeper/config 
node by setting the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes 
it very cumbersome to use the reconfig APIs. It also, perversely, makes 
security worse as the entire ZooKeeper instance must be opened to "super" user 
while enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a 
mechanism for savvy users to disable this ACL so that an application-specific 
custom ACL can be set.  (was: ZOOKEEPER-2014 changed the behavior of the 
/zookeeper/config node by setting the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. 
This change makes it very cumbersome to use the reconfig APIs. It also, 
perversely, makes security worse as the ZooKeeper instance must be opened to 
"super" user while enabled reconfig (per {{ReconfigExceptionTest.java}}). 
Provide a mechanism for savvy users to disable this ACL so that an 
application-specific custom ACL can be set.)

> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the entire ZooKeeper instance must be opened to "super" user while 
> enabling reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism 
> for savvy users to disable this ACL so that an application-specific custom 
> ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-09 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2779:
---

 Summary: Add option to not set ACL for reconfig node
 Key: ZOOKEEPER-2779
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.3
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman
 Fix For: 3.5.4, 3.6.0


ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
cumbersome to use the reconfig APIs. It also, perversely, makes security worse 
as the ZooKeeper instance must be opened to "super" user while enabling reconfig 
(per {{ReconfigExceptionTest.java}}). Provide a mechanism for savvy users to 
disable this ACL so that an application-specific custom ACL can be set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2014) Only admin should be allowed to reconfig a cluster

2017-05-09 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002692#comment-16002692
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2014:
-

I'm terribly sorry I was so late to this issue. Now that it's released I see 
even more problems. I just sent this email to @dev

{panel}
reconfig() is limited to "super" user. Perversely, this reduces security as 
"super" user is utterly insecure. Requiring new databases to be post-applied 
via super user creates a security hole. For the time during which the new ACLs for 
/zookeeper/config are being changed, the ZooKeeper instance will be in "super" 
user mode. Additionally, having to do all this is terribly cumbersome. Lastly, 
the docs only make passing mention of this. I think users will be very 
surprised by this - especially as the docs refer users to 
ReconfigExceptionTest.java which isn't part of the client distribution.
{panel}
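
For concreteness, a rough sketch of the "super" user flow being objected to (the 
password and connection string are placeholders):

{code}
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.server.auth.DigestAuthenticationProvider;

public class SuperUserReconfigSketch {
    public static void main(String[] args) throws Exception {
        // 1. Generate a digest for the "super" principal; every server is then started with
        //    -Dzookeeper.DigestAuthenticationProvider.superDigest=<the printed value>
        System.out.println(DigestAuthenticationProvider.generateDigest("super:secret"));

        // 2. Any client that knows the cleartext password is now all-powerful on the whole
        //    ensemble, and the digest sits on every server's command line, visible via a
        //    simple process listing.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        zk.addAuthInfo("digest", "super:secret".getBytes());

        // 3. Only in this mode can the ACL on /zookeeper/config be changed or reconfig() used.
        zk.close();
    }
}
{code}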

> Only admin should be allowed to reconfig a cluster
> --
>
> Key: ZOOKEEPER-2014
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2014
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Raul Gutierrez Segales
>Assignee: Michael Han
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, 
> ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, 
> ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, 
> ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, 
> ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, ZOOKEEPER-2014.patch, 
> ZOOKEEPER-2014.patch
>
>
> ZOOKEEPER-107 introduces reconfiguration support via the reconfig() call. We 
> should, at the very least, ensure that only the Admin can reconfigure a 
> cluster. Perhaps restricting access to /zookeeper/config as well, though this 
> is debatable. Surely one could ensure Admin only access via an ACL, but that 
> would leave everyone who doesn't use ACLs unprotected. We could also force a 
> default ACL to make it a bit more consistent (maybe).
> Finally, making reconfig() only available to Admins means they have to run 
> with zookeeper.DigestAuthenticationProvider.superDigest (which I am not sure 
> if everyone does, or how would it work with other authentication providers). 
> Review board https://reviews.apache.org/r/51546/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2260) Paginated getChildren call

2017-04-11 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964903#comment-15964903
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2260:
-

BTW - if this PR is accepted into ZooKeeper, we at Apache Curator would really 
appreciate a corresponding PR if you have the time.

> Paginated getChildren call
> --
>
> Key: ZOOKEEPER-2260
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2260
> Project: ZooKeeper
>  Issue Type: New Feature
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Marco P.
>Assignee: Marco P.
>  Labels: api, features
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-2260.patch, ZOOKEEPER-2260.patch
>
>
> Add pagination support to the getChildren() call, allowing clients to iterate 
> over children N at a time.
> Motivations for this include:
>   - Getting out of a situation where so many children were created that 
> listing them exceeded the network buffer sizes (making it impossible to 
> recover by deleting)[1]
>  - More efficient traversal of nodes with large number of children [2]
> I do have a patch (for 3.4.6) we've been using successfully for a while, but 
> I suspect much more work is needed for this to be accepted. 
> [1] https://issues.apache.org/jira/browse/ZOOKEEPER-272
> [2] https://issues.apache.org/jira/browse/ZOOKEEPER-282



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2719) Port ZOOKEEPER-2169 (TTL Nodes) to 3.5 branch

2017-03-31 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2719:

Summary: Port ZOOKEEPER-2169 (TTL Nodes) to 3.5 branch  (was: Port 
ZOOKEEPER-2169 to 3.5 branch)

> Port ZOOKEEPER-2169 (TTL Nodes) to 3.5 branch
> -
>
> Key: ZOOKEEPER-2719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2719
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: java client, server
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3
>
>
> ZOOKEEPER-2169 is a useful feature that should be deployed sooner than later. 
> Take the work done in the master branch and port it to the 3.5 branch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2608) Create CLI option for TTL ephemerals

2017-03-15 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926410#comment-15926410
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2608:
-

I wrote this before the move to PRs. But, I'm happy to create a PR for this.

> Create CLI option for TTL ephemerals
> 
>
> Key: ZOOKEEPER-2608
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2608
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, java client, jute, server
>Reporter: Camille Fournier
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-2608-2.patch, ZOOKEEPER-2608-3.patch, 
> ZOOKEEPER-2608.patch
>
>
> Need to update CreateCommand to have the TTL node option



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2719) Port ZOOKEEPER-2169 to 3.5 branch

2017-03-15 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926349#comment-15926349
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2719:
-

The release audit problems seem to be a Jenkins issue and not related to this 
patch:

{noformat}
[rat:report]  !? 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build/zookeeper-3.5.3-alpha-SNAPSHOT/contrib/rest/conf/keys/rest.cer
[rat:report]  !? 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build/zookeeper-3.5.3-alpha-SNAPSHOT/src/contrib/rest/conf/keys/rest.cer
Lines that start with ? in the release audit report indicate files that do 
not have an Apache license header.
{noformat}


> Port ZOOKEEPER-2169 to 3.5 branch
> -
>
> Key: ZOOKEEPER-2719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2719
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: java client, server
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3
>
>
> ZOOKEEPER-2169 is a useful feature that should be deployed sooner than later. 
> Take the work done in the master branch and port it to the 3.5 branch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2719) Port ZOOKEEPER-2169 to 3.5 branch

2017-03-14 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924833#comment-15924833
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2719:
-

NOTE: this will also include ZOOKEEPER-2608
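
For reference, a minimal sketch of creating a TTL node through the Java client once 
the port lands (API shape per ZOOKEEPER-2169; assumes the server has extended types 
enabled, and the path/TTL values are arbitrary):

{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class TtlNodeSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });

        // A TTL node is a persistent node that the server may remove once it has had
        // no children and no modifications for longer than the TTL (here 60 seconds).
        Stat stat = new Stat();
        zk.create("/ttl-demo", new byte[0], Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_WITH_TTL, stat, 60_000L);

        zk.close();
    }
}
{code}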

> Port ZOOKEEPER-2169 to 3.5 branch
> -
>
> Key: ZOOKEEPER-2719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2719
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: java client, server
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3
>
>
> ZOOKEEPER-2169 is a useful feature that should be deployed sooner than later. 
> Take the work done in the master branch and port it to the 3.5 branch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (ZOOKEEPER-2608) Create CLI option for TTL ephemerals

2017-03-14 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924817#comment-15924817
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-2608 at 3/14/17 7:09 PM:
--

This really should be merged given that ZOOKEEPER-2169 is merged. attn [~fournc]


was (Author: randgalt):
This really should be merged given that ZOOKEEPER-2169 is merged.

> Create CLI option for TTL ephemerals
> 
>
> Key: ZOOKEEPER-2608
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2608
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, java client, jute, server
>Reporter: Camille Fournier
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-2608-2.patch, ZOOKEEPER-2608-3.patch, 
> ZOOKEEPER-2608.patch
>
>
> Need to update CreateCommand to have the TTL node option



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2608) Create CLI option for TTL ephemerals

2017-03-14 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924817#comment-15924817
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2608:
-

This really should be merged given that ZOOKEEPER-2169 is merged.

> Create CLI option for TTL ephemerals
> 
>
> Key: ZOOKEEPER-2608
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2608
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, java client, jute, server
>Reporter: Camille Fournier
>Assignee: Jordan Zimmerman
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-2608-2.patch, ZOOKEEPER-2608-3.patch, 
> ZOOKEEPER-2608.patch
>
>
> Need to update CreateCommand to have the TTL node option



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2719) Port ZOOKEEPER-2169 to 3.5 branch

2017-03-14 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2719:

Fix Version/s: 3.5.3

> Port ZOOKEEPER-2169 to 3.5 branch
> -
>
> Key: ZOOKEEPER-2719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2719
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: java client, server
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3
>
>
> ZOOKEEPER-2169 is a useful feature that should be deployed sooner than later. 
> Take the work done in the master branch and port it to the 3.5 branch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2553) ZooKeeper cluster unavailable due to corrupted log file during power failures -- java.io.IOException: Unreasonable length

2017-03-13 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906944#comment-15906944
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2553:
-

https://blog.acolyer.org/2017/03/08/redundancy-does-not-imply-fault-tolerance-analysis-of-distributed-storage-reactions-to-single-errors-and-corruptions/

> ZooKeeper cluster unavailable due to corrupted log file during power failures 
> -- java.io.IOException: Unreasonable length
> -
>
> Key: ZOOKEEPER-2553
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2553
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8
> Environment: Normal ZooKeeper cluster with 3 nodes running Linux
>Reporter: Ramnatthan Alagappan
>
> I am running a three node ZooKeeper cluster. 
> When a new log file is created by ZooKeeper, I see the following sequence of 
> system calls:
> 1. creat(new_log)
> 2. write(new_log, count=16) // This is a log header, I believe
> 3. truncate(new_log, from 16 bytes to 16 KBytes) // I have configured the log 
> size to be 16K. 
> When the above sequence of operations complete, it is reasonable to expect 
> the newly created log file to contain the header (16 bytes) and then be filled 
> with zeros till the end of the log.
> But when a crash occurs (due to a power failure), while the truncate system 
> call is in progress, it is possible for the log to contain garbage data when 
> the system restarts from the crash. Note that if the crash occurs just after 
> the truncate system call completes, then there is no problem. Basically, the 
> truncate needs to be atomically persisted for ZooKeeper to recover from 
> crashes correctly  or (more realistically) the recovery code needs to deal 
> with the case of expecting garbage in a newly created log. 
> As mentioned, if a crash occurs during the truncate system call, then 
> ZooKeeper will fail to start with the following exception. Here is the stack 
> trace:
> java.io.IOException: Unreasonable length = -295704495
> at 
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
> at 
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
> at 
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.(FileTxnLog.java:527)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
> at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting 
> abnormally
> java.lang.RuntimeException: Unable to run quorum server
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> Caused by: java.io.IOException: Unreasonable length = -295704495
> at 
> org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
> at 
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
> at 
> org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
> at 
> 

[jira] [Created] (ZOOKEEPER-2719) Port ZOOKEEPER-2169 to 3.5 branch

2017-03-12 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2719:
---

 Summary: Port ZOOKEEPER-2169 to 3.5 branch
 Key: ZOOKEEPER-2719
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2719
 Project: ZooKeeper
  Issue Type: New Feature
  Components: java client, server
Reporter: Jordan Zimmerman
Assignee: Jordan Zimmerman


ZOOKEEPER-2169 is a useful feature that should be deployed sooner than later. 
Take the work done in the master branch and port it to the 3.5 branch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2703) [MASTER ISSUE] Create benchmark/stability tests

2017-03-02 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892466#comment-15892466
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2703:
-

Ideally - the benchmarks would be done against something that looks like a 
Production ensemble. Can Apache give us resources for this? Does anyone know 
other sources for 3 (or 5?) machines to run periodically as a test ensemble?

> [MASTER ISSUE] Create benchmark/stability tests
> ---
>
> Key: ZOOKEEPER-2703
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2703
> Project: ZooKeeper
>  Issue Type: Test
>  Components: java client, recipes, tests
>Reporter: Jordan Zimmerman
>
> It would be useful to have objective tests/benchmarks. These tests/benchmarks 
> can be used to validate future changes to ZooKeeper, compare against other 
> similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
> candidates include:
> * leader election tests/benchmarks
> * service discovery tests/benchmarks
> * distributed locks tests/benchmarks
> * ...
> Note: each test/benchmark should be a sub-task under this master task



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2703) [MASTER ISSUE] Create benchmark/stability tests

2017-02-22 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2703:

Description: 
It would be useful to have objective tests/benchmarks. These tests/benchmarks 
can be used to validate future changes to ZooKeeper, compare against other 
similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
candidates include:

* leader election tests/benchmarks
* service discovery tests/benchmarks
* distributed locks tests/benchmarks
* ...

Note: each test/benchmark should be a sub-task under this master task

  was:
It would be useful to have objective tests/benchmarks. These tests/benchmarks 
can be used to validate future changes to ZooKeeper, compare against other 
similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
candidates include:

* leader election tests/benchmarks
* service discovery tests/benchmarks
* distributed locks tests/benchmarks
* ...

Note: each test should be a sub-task under this master task


> [MASTER ISSUE] Create benchmark/stability tests
> ---
>
> Key: ZOOKEEPER-2703
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2703
> Project: ZooKeeper
>  Issue Type: Test
>  Components: java client, recipes, tests
>Reporter: Jordan Zimmerman
>
> It would be useful to have objective tests/benchmarks. These tests/benchmarks 
> can be used to validate future changes to ZooKeeper, compare against other 
> similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
> candidates include:
> * leader election tests/benchmarks
> * service discovery tests/benchmarks
> * distributed locks tests/benchmarks
> * ...
> Note: each test/benchmark should be a sub-task under this master task



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2703) [MASTER ISSUE] Create benchmark/stability tests

2017-02-22 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2703:

Description: 
It would be useful to have objective tests/benchmarks. These tests/benchmarks 
can be used to validate future changes to ZooKeeper, compare against other 
similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
candidates include:

* leader election tests/benchmarks
* service discovery tests/benchmarks
* distributed locks tests/benchmarks
* ...

Note: each test should be a sub-task under this master task

  was:
It would be useful to have objective tests/benchmarks. These tests/benchmarks 
can be used to validate future changes to ZooKeeper, compare against other 
similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
candidates include:

* leader election tests/benchmarks
* service discovery tests/benchmarks
* distributed locks tests/benchmarks
* ...


> [MASTER ISSUE] Create benchmark/stability tests
> ---
>
> Key: ZOOKEEPER-2703
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2703
> Project: ZooKeeper
>  Issue Type: Test
>  Components: java client, recipes, tests
>Reporter: Jordan Zimmerman
>
> It would be useful to have objective tests/benchmarks. These tests/benchmarks 
> can be used to validate future changes to ZooKeeper, compare against other 
> similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
> candidates include:
> * leader election tests/benchmarks
> * service discovery tests/benchmarks
> * distributed locks tests/benchmarks
> * ...
> Note: each test should be a sub-task under this master task



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (ZOOKEEPER-2703) [MASTER ISSUE] Create benchmark/stability tests

2017-02-22 Thread Jordan Zimmerman (JIRA)
Jordan Zimmerman created ZOOKEEPER-2703:
---

 Summary: [MASTER ISSUE] Create benchmark/stability tests
 Key: ZOOKEEPER-2703
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2703
 Project: ZooKeeper
  Issue Type: Test
  Components: java client, recipes, tests
Reporter: Jordan Zimmerman


It would be useful to have objective tests/benchmarks. These tests/benchmarks 
can be used to validate future changes to ZooKeeper, compare against other 
similar products (etcd/consul, etc.) or to help promote ZooKeeper. Possible 
candidates include:

* leader election tests/benchmarks
* service discovery tests/benchmarks
* distributed locks tests/benchmarks
* ...



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841002#comment-15841002
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2464:
-

[~arshad.mohammad] IMO it should be a separate issue. 

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 ZooKeeper servers and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and started servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In ZooKeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841000#comment-15841000
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2464:
-

[~eribeiro] - I think a 1 line change is too much for a test

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 ZooKeeper servers and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and started servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In ZooKeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-01-25 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836532#comment-15836532
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-1416 at 1/25/17 9:39 PM:
--

I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

For simplicity look just at this class - it does the work: 
https://github.com/apache/curator/blob/1089eedc1a29469250c161a575e7b3bfb300d5d7/curator-recipes/src/main/java/org/apache/curator/framework/recipes/watch/InternalCuratorCache.java

update: actually the performance with this new feature will be _better_ than 
having to use one-time triggers. Note the use-case. People want _every_ event 
for a tree of nodes. This is a very common use case with ZK.

another thing: this change uses far, far, far less memory than the current 
alternative for writing a tree cache. Currently, you have to have watchers on 
every parent and every child recursively. This escalates very quickly. The 
reason I picked this issue up in the first place was that we were seeing 
ridiculous memory usage with our TreeCache implementation. If we have this 
change, 1 watcher can watch an entire tree of nodes (again, a very common use 
case).
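
As a usage sketch (assuming the {{addWatch}} / {{AddWatchMode.PERSISTENT_RECURSIVE}} 
API shape; the attached patch may differ in detail), a single registration replaces 
all of that per-znode bookkeeping:

{code}
import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveWatchSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });

        // One registration covers /app and everything beneath it, for every event, with
        // no re-registration after a fire. This removes the per-znode watch bookkeeping
        // (and the memory it costs) that a TreeCache has to do today.
        zk.addWatch("/app",
                event -> System.out.println(event.getType() + " " + event.getPath()),
                AddWatchMode.PERSISTENT_RECURSIVE);

        Thread.sleep(Long.MAX_VALUE); // keep the session alive so events keep arriving
    }
}
{code}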


was (Author: randgalt):
I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

For simplicity look just at this class - it does the work: 
https://github.com/apache/curator/blob/1089eedc1a29469250c161a575e7b3bfb300d5d7/curator-recipes/src/main/java/org/apache/curator/framework/recipes/watch/InternalCuratorCache.java

update: actually the performance with this new feature will be _better_ than 
having to use one-time triggers. Note the use-case. People want _every_ event 
for a tree of nodes. This is a very common use case with ZK.

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive 

[jira] [Comment Edited] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-01-24 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836532#comment-15836532
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-1416 at 1/24/17 7:56 PM:
--

I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

For simplicity look just at this class - it does the work: 
https://github.com/apache/curator/blob/1089eedc1a29469250c161a575e7b3bfb300d5d7/curator-recipes/src/main/java/org/apache/curator/framework/recipes/watch/InternalCuratorCache.java

update: actually the performance with this new feature will be _better_ than 
having to use one-time triggers. Note the use-case. People want _every_ event 
for a tree of nodes. This is a very common use case with ZK.


was (Author: randgalt):
I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

For simplicity look just at this class - it does the work: 
https://github.com/apache/curator/blob/1089eedc1a29469250c161a575e7b3bfb300d5d7/curator-recipes/src/main/java/org/apache/curator/framework/recipes/watch/InternalCuratorCache.java

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connect, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then Recursive Watch 
> automatically applies the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convenience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the 

[jira] [Comment Edited] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-01-24 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836532#comment-15836532
 ] 

Jordan Zimmerman edited comment on ZOOKEEPER-1416 at 1/24/17 7:53 PM:
--

I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

For simplicity look just at this class - it does the work: 
https://github.com/apache/curator/blob/1089eedc1a29469250c161a575e7b3bfb300d5d7/curator-recipes/src/main/java/org/apache/curator/framework/recipes/watch/InternalCuratorCache.java


was (Author: randgalt):
I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connects, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes its client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means setting watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called; the Recursive Watch then 
> automatically re-applies the watch on the znode. This maintains the existing 
> Watch semantics on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch's Watcher callback is the one 
> receiving the event, and the event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convenience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that need to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> RAM distributed across all Observers.

[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-01-24 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836532#comment-15836532
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

I'd note that this issue has 9 votes (including you it seems). I'm not sure 
what you want me to say. This would be an excellent addition to ZooKeeper that 
people have been asking for for years. Do you have issues with the 
implementation? I've already seen how it simplifies writing a TreeCache style 
implementation (here is the code: https://github.com/apache/curator/pull/181). 
The performance overhead for this is negligible when considering the use case. 
The purpose of this feature is to support what had to be done manually in 
Curator - TreeCache. Have a look at the TreeCache code and see how complex it 
is. Now compare that to https://github.com/apache/curator/pull/181 to see how 
much easier it is with this new API.

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connects, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes its client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means setting watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called; the Recursive Watch then 
> automatically re-applies the watch on the znode. This maintains the existing 
> Watch semantics on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch's Watcher callback is the one 
> receiving the event, and the event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convenience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that need to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> RAM distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths for which Watches are not sent, 
> regardless of the Watch setting, can be set each time a watch event from a 
> Recursive Watch is fired. The memory utilization is relative to the number 
> of outstanding reads and in the worst case it's 1/3 * 3.75TB using the 
> parameters given above.
> Otherwise, a relaxation of the no-intermediate-watch-event-until-read 
> guarantee is required. If the server can send watch events regardless of 
> whether one has already been fired without a corresponding read, then the 
> server can simply fire watch events without tracking.
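
For reference, the 3.75TB figure above follows directly from the stated 
assumptions (3 bits per client per znode, 100m znodes, 100k clients all 
watching everything); a quick check:

{code}
public class WatchMemoryEstimate {
    public static void main(String[] args) {
        long bits = 3L * 100_000_000L * 100_000L;   // 3 bits * 1e8 znodes * 1e5 clients = 3e13 bits
        double terabytes = bits / 8.0 / 1e12;       // bytes, then decimal terabytes
        System.out.println(terabytes + " TB");      // prints 3.75 TB
    }
}
{code}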



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch

2017-01-24 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836468#comment-15836468
 ] 

Jordan Zimmerman commented on ZOOKEEPER-1416:
-

* "I think persistent and recursive are orthogonal" - maybe conceptually but in 
practice, no. Users usually watch a tree of nodes. {{TreeCache}} is one of the 
most widely used recipes in Curator.
* "mixed feeling about persistent watcher" - frankly, this is more important 
than the watcher being recursive. A great deal of code in Curator revolves 
around resetting watchers after they've triggered. For the vast majority of use 
cases users just want to set a watcher and have it continue to trigger. If we 
could go back to the start I'd advocate removing one-time triggers altogether. 

Please note that Facebook is using something like this already internally. 
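
To make the contrast concrete, here is a minimal sketch of the two styles. It 
assumes the addWatch/AddWatchMode API that 3.6.0 later introduced for the 
persistent case; the one-shot case compresses the re-registration dance that 
Curator recipes currently implement:

{code}
import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class WatchStyles {
    // One-shot style: the watch must be re-armed inside the callback by
    // issuing another read, and that read is also what picks up any change
    // that happened while the watch was un-armed.
    static void oneShotData(ZooKeeper zk, String path) {
        try {
            zk.getData(path, event -> oneShotData(zk, path), null);
        } catch (Exception e) {
            // handle/log and decide whether to retry in real code
        }
    }

    // Persistent style (3.6.0+): register once, keep receiving events.
    static void persistentData(ZooKeeper zk, String path) throws Exception {
        zk.addWatch(path, event -> System.out.println(event), AddWatchMode.PERSISTENT);
    }
}
{code}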

> Persistent Recursive Watch
> --
>
> Key: ZOOKEEPER-1416
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: c client, documentation, java client, server
>Reporter: Phillip Liu
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connects, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes its client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means setting watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called; the Recursive Watch then 
> automatically re-applies the watch on the znode. This maintains the existing 
> Watch semantics on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch's Watcher callback is the one 
> receiving the event, and the event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convenience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that need to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> RAM distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths for which Watches are not sent, 
> regardless of the Watch setting, can be set each time a watch event from a 
> Recursive Watch is fired. The memory utilization is relative to the number 
> of outstanding reads and in the worst case it's 1/3 * 3.75TB using the 
> parameters given above.
> Otherwise, a relaxation of the no-intermediate-watch-event-until-read 
> guarantee is required. If the server can send watch events regardless of 
> whether one has already been fired without a corresponding read, then the 
> server can simply fire watch events without tracking.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2642) ZOOKEEPER-2014 breaks existing clients for little benefit

2017-01-17 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2642:

Attachment: ZOOKEEPER-2642-3.5.patch

same patch but based off of {{branch-3.5}}

> ZOOKEEPER-2014 breaks existing clients for little benefit
> -
>
> Key: ZOOKEEPER-2642
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2642
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.5.2
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2642-3.5.patch, ZOOKEEPER-2642.patch, 
> ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, 
> ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch
>
>
> ZOOKEEPER-2014 moved the reconfig() methods into a new class, ZooKeeperAdmin. 
> It appears this was done to document that these methods have access 
> restrictions. However, this change breaks Apache Curator (and possibly other 
> clients). Curator APIs will have to be changed and/or special methods need to 
> be added. A breaking change of this kind should only be done when the benefit 
> is overwhelming. In this case, the same information can be conveyed with 
> documentation and possibly a deprecation notice.
> Revert the creation of the ZooKeeperAdmin class and move the reconfig() 
> methods back to the ZooKeeper class with additional documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2642) ZOOKEEPER-2014 breaks existing clients for little benefit

2017-01-11 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2642:

Attachment: ZOOKEEPER-2642.patch

Rebased against master

> ZOOKEEPER-2014 breaks existing clients for little benefit
> -
>
> Key: ZOOKEEPER-2642
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2642
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.5.2
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, 
> ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, 
> ZOOKEEPER-2642.patch
>
>
> ZOOKEEPER-2014 moved the reconfig() methods into a new class, ZooKeeperAdmin. 
> It appears this was done to document that these methods have access 
> restrictions. However, this change breaks Apache Curator (and possibly other 
> clients). Curator APIs will have to be changed and/or special methods need to 
> be added. A breaking change of this kind should only be done when the benefit 
> is overwhelming. In this case, the same information can be conveyed with 
> documentation and possibly a deprecation notice.
> Revert the creation of the ZooKeeperAdmin class and move the reconfig() 
> methods back to the ZooKeeper class with additional documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2642) ZOOKEEPER-2014 breaks existing clients for little benefit

2017-01-11 Thread Jordan Zimmerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated ZOOKEEPER-2642:

Attachment: ZOOKEEPER-2642.patch

Fixed doc typo

> ZOOKEEPER-2014 breaks existing clients for little benefit
> -
>
> Key: ZOOKEEPER-2642
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2642
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.5.2
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, 
> ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch, ZOOKEEPER-2642.patch
>
>
> ZOOKEEPER-2014 moved the reconfig() methods into a new class, ZooKeeperAdmin. 
> It appears this was done to document that these methods have access 
> restrictions. However, this change breaks Apache Curator (and possibly other 
> clients). Curator APIs will have to be changed and/or special methods need to 
> be added. A breaking change of this kind should only be done when the benefit 
> is overwhelming. In this case, the same information can be conveyed with 
> documentation and possibly a deprecation notice.
> Revert the creation of the ZooKeeperAdmin class and move the reconfig() 
> methods back to the ZooKeeper class with additional documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2368) Client watches are not disconnected on close

2017-01-09 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812267#comment-15812267
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2368:
-

[~timothyjward] - how about an alternate close() that has a flag to control 
this behavior?
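
Something along these lines, as a sketch only of the behaviour such a flag 
would enable, written as a client-side wrapper since no close(boolean) overload 
exists in ZooKeeper today:

{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Illustration only, not a ZooKeeper API: notify interested watchers with a
// Disconnected-style event when the client is closed deliberately.
public class NotifyingCloser {
    private final ZooKeeper zk;
    private final List<Watcher> watchers = new CopyOnWriteArrayList<>();

    public NotifyingCloser(ZooKeeper zk) {
        this.zk = zk;
    }

    public void register(Watcher watcher) {
        watchers.add(watcher);
    }

    public void close(boolean notifyWatchers) throws InterruptedException {
        if (notifyWatchers) {
            WatchedEvent closed = new WatchedEvent(
                    Watcher.Event.EventType.None,
                    Watcher.Event.KeeperState.Disconnected,
                    null);
            watchers.forEach(w -> w.process(closed));
        }
        zk.close();
    }
}
{code}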

> Client watches are not disconnected on close
> 
>
> Key: ZOOKEEPER-2368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2368
> Project: ZooKeeper
>  Issue Type: Improvement
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Timothy Ward
>Assignee: Timothy Ward
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2368.patch
>
>
> If I have a ZooKeeper client connected to an ensemble then obviously I can 
> register watches. 
> If the client is disconnected (for example by a failing ensemble member) then 
> I get a disconnection event for all of my watches. If, on the other hand, my 
> client is closed then I *do not* get a disconnection event. This asymmetry 
> makes it really hard to clear up properly when using the asynchronous API, as 
> there is no way to "fail" data reads/updates when the client is closed.
> I believe that the correct behaviour should be for all watchers to receive a 
> disconnection event when the client is closed. The watchers can then respond 
> as appropriate, and can differentiate between a "server disconnect" and a 
> "client disconnect" by checking the ZooKeeper#getState() method. 
> This would not be a breaking behaviour change as Watchers are already 
> required to handle disconnection events.
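
To illustrate the proposed distinction, a watcher could separate the two cases 
roughly as follows (a sketch of the proposal; it assumes the disconnection 
event is in fact delivered on close, which is exactly the behaviour this issue 
asks for):

{code}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class DisconnectAwareWatcher implements Watcher {
    private final ZooKeeper zk;

    public DisconnectAwareWatcher(ZooKeeper zk) {
        this.zk = zk;
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getState() == Event.KeeperState.Disconnected) {
            if (zk.getState() == ZooKeeper.States.CLOSED) {
                // client-side close: fail any outstanding async reads/updates
            } else {
                // server-side disconnect: the client will try to reconnect
            }
        }
    }
}
{code}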



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

