[jira] [Commented] (ZOOKEEPER-1559) Learner should not snapshot uncommitted state

2014-12-02 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232487#comment-14232487
 ] 

Jacky007 commented on ZOOKEEPER-1559:
-

This has been done in ZOOKEEPER-1549 on the 3.4 branch.

> Learner should not snapshot uncommitted state
> -
>
> Key: ZOOKEEPER-1559
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1559
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: quorum
>Reporter: Flavio Junqueira
>Assignee: Hongchao Deng
>
> The code in Learner.java is a bit entangled for backward compatibility 
> reasons. We need to make sure that we can remove the calls to take a snapshot 
> without breaking it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1977) Calibrate initLimit dynamically

2014-08-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092638#comment-14092638
 ] 

Jacky007 commented on ZOOKEEPER-1977:
-

I don't like a complicated mechanism. initLimit + (snapshot + log) * failedPeersNum 
/ bandwidth works in our internal environment.
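
For concreteness, a back-of-the-envelope sketch of that rule. All names, units (ticks, bytes, bytes per tick) and numbers below are placeholders, not existing ZooKeeper code or configuration:
{quote}
// Hypothetical helper illustrating the rule above; none of these names exist
// in ZooKeeper, and the units are assumptions.
public final class InitLimitCalibration {

    static long calibratedInitLimit(long configuredInitLimit,
                                    long snapshotBytes,
                                    long txnLogBytes,
                                    int failedPeers,
                                    long bandwidthBytesPerTick) {
        // initLimit + (snapshot + log) * failedPeersNum / bandwidth
        long transferTicks =
                (snapshotBytes + txnLogBytes) * failedPeers / bandwidthBytesPerTick;
        return configuredInitLimit + transferTicks;
    }

    public static void main(String[] args) {
        // e.g. 2 GB snapshot + 200 MB log, 2 lagging peers, 50 MB per tick
        System.out.println(calibratedInitLimit(
                10, 2_000_000_000L, 200_000_000L, 2, 50_000_000L));
    }
}
{quote}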

> Calibrate initLimit dynamically
> ---
>
> Key: ZOOKEEPER-1977
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1977
> Project: ZooKeeper
>  Issue Type: Wish
>Reporter: Flavio Junqueira
>
> We have seen a number of times users failing to get an ensemble up because 
> the snapshot transfer times out. We should be able to do better than this and 
> calibrate initLimit dynamically. I was thinking concretely that we could have 
> servers increasing the initLimit value (e.g., doubling or increments of 1) 
> upon socket timeouts. The tricky part here is that we need both ends of the 
> communication to increase it. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1977) Calibrate initLimit dynamically

2014-08-03 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084303#comment-14084303
 ] 

Jacky007 commented on ZOOKEEPER-1977:
-

In ZAB v1.0:
FOLLOWERINFO  --> Leader
LEADERINFO  <-- Leader
ACKEPOCH  --> Leader
SNAP/DIFF/TRUNC  <-- Leader
UPTODATE <-- Leader

The leader can calibrate initLimit and tell its peers via LEADERINFO.
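
A rough sketch of that idea, purely illustrative: the real LEADERINFO packet does not carry an initLimit field, so the extra field and class names below are assumptions.
{quote}
// Illustrative only: imagine LEADERINFO carrying the calibrated value so both
// ends of the handshake agree on the same timeout. Not actual ZooKeeper code.
class LeaderInfoSketch {
    final long newEpochZxid;            // what LEADERINFO conceptually carries today
    final int calibratedInitLimitTicks; // the hypothetical extra field

    LeaderInfoSketch(long newEpochZxid, int calibratedInitLimitTicks) {
        this.newEpochZxid = newEpochZxid;
        this.calibratedInitLimitTicks = calibratedInitLimitTicks;
    }
}

class FollowerSideSketch {
    int initLimitTicks;

    // Called when the hypothetical LEADERINFO variant above is received.
    void onLeaderInfo(LeaderInfoSketch info) {
        // Adopt the leader's calibrated limit so both ends use the same value.
        initLimitTicks = Math.max(initLimitTicks, info.calibratedInitLimitTicks);
    }
}
{quote}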

> Calibrate initLimit dynamically
> ---
>
> Key: ZOOKEEPER-1977
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1977
> Project: ZooKeeper
>  Issue Type: Wish
>Reporter: Flavio Junqueira
>
> We have seen a number of times users failing to get an ensemble up because 
> the snapshot transfer times out. We should be able to do better than this and 
> calibrate initLimit dynamically. I was thinking concretely that we could have 
> servers increasing the initLimit value (e.g., doubling or increments of 1) 
> upon socket timeouts. The tricky part here is that we need both ends of the 
> communication to increase it. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1519) Zookeeper Async calls can reference free()'d memory

2013-11-13 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822117#comment-13822117
 ] 

Jacky007 commented on ZOOKEEPER-1519:
-

I don't think this is a problem. The caller should hold a reference to the 
json.dumps(value) result and free it in the callback if needed.

> Zookeeper Async calls can reference free()'d memory
> ---
>
> Key: ZOOKEEPER-1519
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1519
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.3, 3.3.6
> Environment: Ubuntu 11.10, Ubuntu packaged Zookeeper 3.3.3 with some 
> backported fixes.
>Reporter: Mark Gius
>Assignee: Daniel Lescohier
> Fix For: 3.4.6, 3.5.0
>
> Attachments: zookeeper-1519.patch
>
>
> zoo_acreate() and zoo_aset() take a char * argument for data and prepare a 
> call to zookeeper.  This char * doesn't seem to be duplicated at any point, 
> making it possible that the caller of the asynchronous function might 
> potentially free() the char * argument before the zookeeper library completes 
> its request.  This is unlikely to present a real problem unless the freed 
> memory is re-used before zookeeper consumes it.  I've been unable to 
> reproduce this issue using pure C as a result.
> However, ZKPython is a whole different story.  Consider this snippet:
>   ok = zookeeper.acreate(handle, path, json.dumps(value), 
>  acl, flags, callback)
>   assert ok == zookeeper.OK
> In this snippet, json.dumps() allocates a string which is passed into the 
> acreate().  When acreate() returns, the zookeeper request has been 
> constructed with a pointer to the string allocated by json.dumps().  Also 
> when acreate() returns, that string is now referenced by 0 things (ZKPython 
> doesn't bump the refcount) and the string is eligible for garbage collection 
> and re-use.  The Zookeeper request now has a pointer to dangerous freed 
> memory.
> I've been seeing odd behavior in our development environments for some time 
> now, where it appeared as though two separate JSON payloads had been joined 
> together.  Python has been allocating a new JSON string in the middle of the 
> old string that an incomplete zookeeper async call had not yet processed.
> I am not sure if this is a behavior that should be documented, or if the C 
> binding implementation needs to be updated to create copies of the data 
> payload provided for aset and acreate.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-10-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800396#comment-13800396
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

LGTM, thanks [~fpj]

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: ZOOKEEPER-1667-b3.4.patch, ZOOKEEPER-1667.patch, 
> ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-10-16 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796504#comment-13796504
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

Hi, Flavio, could you generate a patch for it? There isn't an easy way for
me to do that...


2013/10/14 Flavio Junqueira (JIRA) 



> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch, ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-10-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792445#comment-13792445
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

Sorry, I'll do it over the weekend.

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch, ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1485) client xid overflow is not handled

2013-10-09 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790264#comment-13790264
 ] 

Jacky007 commented on ZOOKEEPER-1485:
-

As I said, it may core dump or hang after about ten days on one of our internal 
services.
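
For scale, a quick check of that figure under the assumption of a signed 32-bit xid starting near zero and a sustained 4000 requests/s (the rate mentioned in ZOOKEEPER-1693):
{quote}
public class XidOverflowEstimate {
    public static void main(String[] args) {
        // Back-of-the-envelope only; assumes the xid starts near zero.
        long secondsToOverflow = (long) Integer.MAX_VALUE / 4000;  // ~536,870 s
        double daysToOverflow = secondsToOverflow / 86_400.0;      // ~6.2 days
        System.out.printf("~%.1f days to overflow at 4000 req/s%n", daysToOverflow);
        // "about ten days" is the right order of magnitude at a slightly
        // lower sustained request rate.
    }
}
{quote}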

> client xid overflow is not handled
> --
>
> Key: ZOOKEEPER-1485
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1485
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.3, 3.3.5
>Reporter: Michi Mutsuzaki
>Assignee: Bruce Gao
>
> Both Java and C clients use signed 32-bit int as XIDs. XIDs are assumed to be 
> non-negative, and zookeeper uses some negative values as special XIDs (e.g. 
> -2 for ping, -4 for auth). However, neither Java nor C client ensures the 
> XIDs it generates are non-negative, and the server doesn't reject negative 
> XIDs.
> Pat had some suggestions on how to fix this:
> - (bin-compat) Expire the session when the client sends a negative XID.
> - (bin-incompat) In addition to expiring the session, use 64-bit int for XID 
> so that overflow will practically never happen.
> --Michi



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1485) client xid overflow is not handled

2013-10-09 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790263#comment-13790263
 ] 

Jacky007 commented on ZOOKEEPER-1485:
-

We fixed this by using only 31 bits for the xid.
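
A minimal sketch of that workaround (illustrative names, not the actual ClientCnxn code):
{quote}
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a 31-bit xid generator: masking off the sign bit keeps the xid in
// [0, 2^31 - 1], so it can never collide with the negative special xids
// (ping = -2, auth = -4).
class XidGenerator {
    private final AtomicInteger next = new AtomicInteger(1);

    int nextXid() {
        return next.getAndIncrement() & 0x7fffffff;
    }
}
{quote}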

> client xid overflow is not handled
> --
>
> Key: ZOOKEEPER-1485
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1485
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.3, 3.3.5
>Reporter: Michi Mutsuzaki
>Assignee: Bruce Gao
>
> Both Java and C clients use signed 32-bit int as XIDs. XIDs are assumed to be 
> non-negative, and zookeeper uses some negative values as special XIDs (e.g. 
> -2 for ping, -4 for auth). However, neither Java nor C client ensures the 
> XIDs it generates are non-negative, and the server doesn't reject negative 
> XIDs.
> Pat had some suggestions on how to fix this:
> - (bin-compat) Expire the session when the client sends a negative XID.
> - (bin-incompat) In addition to expiring the session, use 64-bit int for XID 
> so that overflow will practically never happen.
> --Michi



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-06-05 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676717#comment-13676717
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

I got the definitions in zookeeper.h:
{quote}
/**
 * @name Watch Types
 * These constants indicate the event that caused the watch event. They are
 * possible values of the first parameter of the watcher callback.
 */
// @{
/**
 * \brief a node has been created.
 *
 * This is only generated by watches on non-existent nodes. These watches
 * are set using \ref zoo_exists.
 */
extern ZOOAPI const int ZOO_CREATED_EVENT;
/**
 * \brief a node has been deleted.
 *
 * This is only generated by watches on nodes. These watches
 * are set using \ref zoo_exists and \ref zoo_get.
 */
extern ZOOAPI const int ZOO_DELETED_EVENT;
/**
 * \brief a node has changed.
 *
 * This is only generated by watches on nodes. These watches
 * are set using \ref zoo_exists and \ref zoo_get.
 */
extern ZOOAPI const int ZOO_CHANGED_EVENT;
/**
 * \brief a change as occurred in the list of children.
 *
 * This is only generated by watches on the child list of a node. These watches
 * are set using \ref zoo_get_children or \ref zoo_get_children2.
 */
extern ZOOAPI const int ZOO_CHILD_EVENT;
{quote}
The only missing one is that a zoo_get_children watch on /foo will get a 
ZOO_DELETED_EVENT if /foo is deleted.

Now, if zk1 sets an exists watch on /foo (which does not exist), we may get a 
ZOO_CHANGED_EVENT. I agree that it carries some additional information, but it 
is too hard to understand, use or document.

Actually, I prefer to believe this is the original semantics. 
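
Whichever event type the server ends up delivering after a reconnect, a client can sidestep the ambiguity defensively. A small sketch using the Java API (the class name is made up):
{quote}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Treat NodeCreated and NodeDataChanged the same way and simply re-read the
// node, so the exact event type delivered after a reconnect does not matter.
class RereadOnAnyChangeWatcher implements Watcher {
    private final ZooKeeper zk;
    private final String path;

    RereadOnAnyChangeWatcher(ZooKeeper zk, String path) {
        this.zk = zk;
        this.path = path;
    }

    @Override
    public void process(WatchedEvent event) {
        switch (event.getType()) {
        case NodeCreated:
        case NodeDataChanged:
            try {
                byte[] data = zk.getData(path, this, new Stat()); // re-arms the watch
                // ... use data ...
            } catch (Exception e) {
                // connection loss / NoNode: handle or retry
            }
            break;
        case NodeDeleted:
            // node is gone; re-register interest with exists() if needed
            break;
        default:
            break;
        }
    }
}
{quote}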


> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: ZOOKEEPER-1667.patch, ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-05-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662994#comment-13662994
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

Added a patch against the 3.4 branch. [~fpj], would you please review it?

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: ZOOKEEPER-1667.patch, ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-05-21 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Attachment: ZOOKEEPER-1667-r34.patch

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Assignee: Jacky007
>Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: ZOOKEEPER-1667.patch, ZOOKEEPER-1667-r34.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1693) process may core or hang when xid is overflowed

2013-04-19 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1693:


Description: 
The xid will be confused with AUTHXID(-4) when it is overflowed.

If the process send 4000 requests per second, it may core or hang after about 
ten days.

  was:
The xid will be confused with AUTHXID(-4) when it is overflowed.

If the process send a get 4000/s, it may core or hang after about ten days.


> process may core or hang when xid is overflowed
> ---
>
> Key: ZOOKEEPER-1693
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1693
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.5
>Reporter: Jacky007
>
> The xid will be confused with AUTHXID(-4) when it is overflowed.
> If the process send 4000 requests per second, it may core or hang after about 
> ten days.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1693) process may core or hang when xid is overflowed

2013-04-19 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1693:


Description: 
The xid will be confused with AUTHXID(-4) when it is overflowed.

If the process send a get 4000/s, it may core or hang after about ten days.

  was:The xid will be confused with AUTHXID(-4) when it is overflowed.


> process may core or hang when xid is overflowed
> ---
>
> Key: ZOOKEEPER-1693
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1693
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.5
>Reporter: Jacky007
>
> The xid will be confused with AUTHXID(-4) when it is overflowed.
> If the process send a get 4000/s, it may core or hang after about ten days.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1693) process may core or hang when xid is overflowed

2013-04-19 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1693:


Description: The xid will be confused with AUTHXID(-4) when it is 
overflowed.  (was: The xid will be confused with AUTHXID when it is overflowed.)

> process may core or hang when xid is overflowed
> ---
>
> Key: ZOOKEEPER-1693
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1693
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.5
>Reporter: Jacky007
>
> The xid will be confused with AUTHXID(-4) when it is overflowed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1693) process may core or hang when xid is overflowed

2013-04-19 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1693:


Summary: process may core or hang when xid is overflowed  (was: process may 
core or hang when xid is overflow)

> process may core or hang when xid is overflowed
> ---
>
> Key: ZOOKEEPER-1693
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1693
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client, java client
>Affects Versions: 3.4.5
>Reporter: Jacky007
>
> The xid will be confused with AUTHXID when it is overflowed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1693) process may core or hang when xid is overflow

2013-04-19 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1693:
---

 Summary: process may core or hang when xid is overflow
 Key: ZOOKEEPER-1693
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1693
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client, java client
Affects Versions: 3.4.5
Reporter: Jacky007


The xid will be confused with AUTHXID when it is overflowed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1690) Race condition when close sock may cause a NPE in sendBuffer

2013-04-15 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1690:


Summary: Race condition when close sock may cause a NPE in sendBuffer   
(was: Race condition when close session may cause a NPE in sendBuffer )

> Race condition when close sock may cause a NPE in sendBuffer 
> -
>
> Key: ZOOKEEPER-1690
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1690
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Jacky007
>
> In NIOServerCnxn.java
>  public void close() {
> closeSock();
> ...
> sk.cancel();
> Close sock first, then cancel the channel.
> 
> public void sendBuffer(ByteBuffer bb) {
> if ((sk.interestOps() & SelectionKey.OP_WRITE) == 0) {
> ...
> sock.write(bb);
> Get ops of the channel, then read sock (may be null)
> I have noticed that the 3.5.0-branch has fixed the problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1687) Number of past transactions retains in ZKDatabase.committedLog should be configurable

2013-04-15 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631671#comment-13631671
 ] 

Jacky007 commented on ZOOKEEPER-1687:
-

Duplicate of ZOOKEEPER-1473.

> Number of past transactions retains in ZKDatabase.committedLog should be 
> configurable
> -
>
> Key: ZOOKEEPER-1687
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1687
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Mathias H.
>Priority: Minor
>
> ZKDatabase.committedLog retains the past 500 transactions. In case of memory 
> usage is more important than speed and vice versa, this should be 
> configurable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1690) Race condition when close session may cause a NPE in sendBuffer

2013-04-15 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1690:
---

 Summary: Race condition when close session may cause a NPE in 
sendBuffer 
 Key: ZOOKEEPER-1690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1690
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.4.6
Reporter: Jacky007


In NIOServerCnxn.java:

    public void close() {
        closeSock();
        ...
        sk.cancel();

close() closes the socket first, then cancels the selection key.

    public void sendBuffer(ByteBuffer bb) {
        if ((sk.interestOps() & SelectionKey.OP_WRITE) == 0) {
            ...
            sock.write(bb);

sendBuffer() reads the interest ops of the key and then writes to sock, which 
may already be null by then.

I have noticed that the 3.5.0 branch has fixed the problem.
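
For illustration only, one defensive shape for sendBuffer() that avoids the NPE (a sketch, not the actual 3.5 fix):
{quote}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Sketch only: copy the fields into locals and bail out if close() already
// ran, so the write can never dereference a null socket.
class CnxnSketch {
    private volatile SocketChannel sock;
    private volatile SelectionKey sk;

    void sendBuffer(ByteBuffer bb) {
        SocketChannel channel = sock;  // close() may null the field concurrently
        SelectionKey key = sk;
        if (channel == null || key == null || !key.isValid()) {
            return;                    // connection already closed; drop the buffer
        }
        try {
            if ((key.interestOps() & SelectionKey.OP_WRITE) == 0) {
                channel.write(bb);
            }
        } catch (IOException | CancelledKeyException e) {
            // raced with close(); nothing useful to do with the buffer
        }
    }
}
{quote}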

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1552) Enable sync request processor in Observer

2013-04-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628877#comment-13628877
 ] 

Jacky007 commented on ZOOKEEPER-1552:
-

Hi [~thawan], we commit the proposal to SyncRequestProcessor, but 
SyncRequestProcessor may not flush it immediately.
Do we really expect that? Though it's not a big deal to lose some txns here.
If we need to make sure of it, we can add a new processor as the next 
processor of SyncRequestProcessor to execute "commitProcessor.commit(request)".
Or some comments would help to clarify.
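
A rough sketch of that suggestion, assuming the server-side RequestProcessor and CommitProcessor classes of this era; the class name and wiring are hypothetical, not a committed patch:
{quote}
import org.apache.zookeeper.server.Request;
import org.apache.zookeeper.server.RequestProcessor;
import org.apache.zookeeper.server.quorum.CommitProcessor;

// Placed after SyncRequestProcessor in the observer's chain, so the commit is
// issued only once the txn has been flushed to the transaction log.
class CommitAfterSyncProcessor implements RequestProcessor {
    private final CommitProcessor commitProcessor;

    CommitAfterSyncProcessor(CommitProcessor commitProcessor) {
        this.commitProcessor = commitProcessor;
    }

    @Override
    public void processRequest(Request request) {
        // SyncRequestProcessor calls its next processor only after flushing.
        commitProcessor.commit(request);
    }

    @Override
    public void shutdown() {
        // nothing to clean up in this sketch
    }
}
{quote}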

> Enable sync request processor in Observer
> -
>
> Key: ZOOKEEPER-1552
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1552
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum, server
>Affects Versions: 3.4.3
>Reporter: Thawan Kooburat
>Assignee: Thawan Kooburat
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1552.patch, ZOOKEEPER-1552.patch, 
> ZOOKEEPER-1552.patch
>
>
> Observer doesn't forward its txns to SyncRequestProcessor. So it never 
> persists the txns onto disk or periodically creates snapshots. This increases 
> the start-up time since it will get the entire snapshot if the observer has 
> be running for a long time. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1675) Make sync a quorum operation

2013-04-01 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618739#comment-13618739
 ] 

Jacky007 commented on ZOOKEEPER-1675:
-

Sorry, it has nothing to do with correctness. What I want to say is that we may 
need more code to achieve a really "strong read".

> Make sync a quorum operation
> 
>
> Key: ZOOKEEPER-1675
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1675
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Alexander Shraer
>
> sync + read is supposed to return at least the latest write that completes 
> before the sync starts. This is true if the leader doesn't change, but when 
> it does it may not work. The problem happens when the old leader L1 still 
> thinks that it is the leader but some other leader L2 was already elected and 
> committed some operations. Suppose that follower F is connected to L1 and 
> invokes a sync. Even though L1 responds to the sync, the recent operations 
> committed by L2 will not be flushed to F so a subsequent read on F will not 
> see these operations. 
> To prevent this we should broadcast the sync like updates.
> This problem is also mentioned in Section 4.4 of the ZooKeeper paper (but the 
> proposed solution there is insufficient to solve the issue).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1675) Make sync a quorum operation

2013-03-31 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618591#comment-13618591
 ] 

Jacky007 commented on ZOOKEEPER-1675:
-

sync is an asynchronous operation, and we usually ignore the result. We may need 
a multi-op to make it atomic.
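
If a caller does want the stronger guarantee with today's API, it has to wait for the sync result before reading. A minimal sketch with the Java client (handle and path are placeholders):
{quote}
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.AsyncCallback.VoidCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Wait for sync() to complete (and succeed) before issuing the read, instead
// of ignoring its result.
class SyncedRead {
    static byte[] syncedRead(ZooKeeper zk, String path) throws Exception {
        final CountDownLatch done = new CountDownLatch(1);
        final int[] rc = new int[1];
        zk.sync(path, new VoidCallback() {
            public void processResult(int code, String p, Object ctx) {
                rc[0] = code;
                done.countDown();
            }
        }, null);
        done.await();
        if (rc[0] != KeeperException.Code.OK.intValue()) {
            throw KeeperException.create(KeeperException.Code.get(rc[0]), path);
        }
        return zk.getData(path, false, new Stat());
    }
}
{quote}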

> Make sync a quorum operation
> 
>
> Key: ZOOKEEPER-1675
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1675
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Alexander Shraer
>
> sync + read is supposed to return at least the latest write that completes 
> before the sync starts. This is true if the leader doesn't change, but when 
> it does it may not work. The problem happens when the old leader L1 still 
> thinks that it is the leader but some other leader L2 was already elected and 
> committed some operations. Suppose that follower F is connected to L1 and 
> invokes a sync. Even though L1 responds to the sync, the recent operations 
> committed by L2 will not be flushed to F so a subsequent read on F will not 
> see these operations. 
> To prevent this we should broadcast the sync like updates.
> This problem is also mentioned in Section 4.4 of the ZooKeeper paper (but the 
> proposed solution there is insufficient to solve the issue).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1675) Make sync a quorum operation

2013-03-28 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616138#comment-13616138
 ] 

Jacky007 commented on ZOOKEEPER-1675:
-

What is the difference between the quorum sync and an 'empty' set?

There is another problem: sync is async and we usually ignore the result. If we 
connect to another server before getting the response, the subsequent read may 
get a stale value.

> Make sync a quorum operation
> 
>
> Key: ZOOKEEPER-1675
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1675
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Alexander Shraer
>
> sync + read is supposed to return at least the latest write that completes 
> before the sync starts. This is true if the leader doesn't change, but when 
> it does it may not work. The problem happens when the old leader L1 still 
> thinks that it is the leader but some other leader L2 was already elected and 
> committed some operations. Suppose that follower F is connected to L1 and 
> invokes a sync. Even though L1 responds to the sync, the recent operations 
> committed by L2 will not be flushed to F so a subsequent read on F will not 
> see these operations. 
> To prevent this we should broadcast the sync like updates.
> This problem is also mentioned in Section 4.4 of the ZooKeeper paper (but the 
> proposed solution there is insufficient to solve the issue).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-03-26 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614812#comment-13614812
 ] 

Jacky007 commented on ZOOKEEPER-1667:
-

I think it is. Do we need a patch for 3.4.6?

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1674) There is no need to clear & load the database across leader election

2013-03-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609823#comment-13609823
 ] 

Jacky007 commented on ZOOKEEPER-1674:
-

[~fpj] I have noticed the discussions in 
[ZOOKEEPER-1642|https://issues.apache.org/jira/browse/ZOOKEEPER-1642].
This issue is related to  what [~thawan] talks about in  
[ZOOKEEPER-1642|https://issues.apache.org/jira/browse/ZOOKEEPER-1642].

There is no need to clear zkDb on shutdown in ZooKeeperServer.java. 
{quote}
public void shutdown() {
...
if (zkDb != null) {
zkDb.clear();
}
...
}
{quote}
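
One illustrative way to express that (placeholder types; not necessarily how it was eventually fixed): clear the database only when the caller really wants a full shutdown, so it survives leader election.
{quote}
// Sketch only, with a placeholder Database type standing in for ZKDatabase.
class ServerShutdownSketch {
    interface Database { void clear(); }

    private Database zkDb;

    synchronized void shutdown(boolean fullyShutDown) {
        // ... stop request processors, session tracker, etc. (elided) ...
        if (zkDb != null && fullyShutDown) {
            zkDb.clear();
        }
    }
}
{quote}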


> There is no need to clear & load the database across leader election
> 
>
> Key: ZOOKEEPER-1674
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1674
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Jacky007
>
> It is interesting to notice the piece of codes in QuorumPeer.java
>  /* ZKDatabase is a top level member of quorumpeer 
>  * which will be used in all the zookeeperservers
>  * instantiated later. Also, it is created once on 
>  * bootup and only thrown away in case of a truncate
>  * message from the leader
>  */
> private ZKDatabase zkDb;
> It is introduced by ZOOKEEPER-596. Now, we just drop the database every 
> leader election.
> We can keep it safely with ZOOKEEPER-1549.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1674) There is no need to clear & load the database across leader election

2013-03-21 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1674:


Summary: There is no need to clear & load the database across leader 
election  (was: There is no need to reload database cross leader election)

> There is no need to clear & load the database across leader election
> 
>
> Key: ZOOKEEPER-1674
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1674
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Jacky007
>
> It is interesting to notice the piece of codes in QuorumPeer.java
>  /* ZKDatabase is a top level member of quorumpeer 
>  * which will be used in all the zookeeperservers
>  * instantiated later. Also, it is created once on 
>  * bootup and only thrown away in case of a truncate
>  * message from the leader
>  */
> private ZKDatabase zkDb;
> It is introduced by ZOOKEEPER-596. Now, we just drop the database every 
> leader election.
> We can keep it safely with ZOOKEEPER-1549.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1674) There is no need to reload database cross leader election

2013-03-21 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1674:


Description: 
It is interesting to notice the piece of codes in QuorumPeer.java

 /* ZKDatabase is a top level member of quorumpeer 
 * which will be used in all the zookeeperservers
 * instantiated later. Also, it is created once on 
 * bootup and only thrown away in case of a truncate
 * message from the leader
 */
private ZKDatabase zkDb;

It is introduced by ZOOKEEPER-596. Now, we just drop the database every leader 
election.

We can keep it safely with ZOOKEEPER-1549.


  was:
It is interesting to notice the piece of code in QuorumPeer.java

 /* ZKDatabase is a top level member of quorumpeer 
 * which will be used in all the zookeeperservers
 * instantiated later. Also, it is created once on 
 * bootup and only thrown away in case of a truncate
 * message from the leader
 */
private ZKDatabase zkDb;

It is introduced by ZOOKEEPER-596. Now, we just drop the database every leader 
election.

We can keep it safely with ZOOKEEPER-1549.



> There is no need to reload database cross leader election
> -
>
> Key: ZOOKEEPER-1674
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1674
> Project: ZooKeeper
>  Issue Type: Improvement
>Reporter: Jacky007
>
> It is interesting to notice the piece of codes in QuorumPeer.java
>  /* ZKDatabase is a top level member of quorumpeer 
>  * which will be used in all the zookeeperservers
>  * instantiated later. Also, it is created once on 
>  * bootup and only thrown away in case of a truncate
>  * message from the leader
>  */
> private ZKDatabase zkDb;
> It is introduced by ZOOKEEPER-596. Now, we just drop the database every 
> leader election.
> We can keep it safely with ZOOKEEPER-1549.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2013-03-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608983#comment-13608983
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

By the way, I think we should fix 
[ZOOKEEPER-1667|https://issues.apache.org/jira/browse/ZOOKEEPER-1667]

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Assignee: Thawan Kooburat
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: case.patch, ZOOKEEPER-1549-3.4.patch, 
> ZOOKEEPER-1549-learner.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2013-03-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608979#comment-13608979
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

[~thawan] It sounds like the right time to fix this.

With this patch, it should be safe to do 
[ZOOKEEPER-1674|https://issues.apache.org/jira/browse/ZOOKEEPER-1674]

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Assignee: Thawan Kooburat
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: case.patch, ZOOKEEPER-1549-3.4.patch, 
> ZOOKEEPER-1549-learner.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1674) There is no need to reload database cross leader election

2013-03-21 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1674:
---

 Summary: There is no need to reload database cross leader election
 Key: ZOOKEEPER-1674
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1674
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Jacky007


It is interesting to notice the piece of code in QuorumPeer.java

 /* ZKDatabase is a top level member of quorumpeer 
 * which will be used in all the zookeeperservers
 * instantiated later. Also, it is created once on 
 * bootup and only thrown away in case of a truncate
 * message from the leader
 */
private ZKDatabase zkDb;

It is introduced by ZOOKEEPER-596. Now, we just drop the database every leader 
election.

We can keep it safely with ZOOKEEPER-1549.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1669) Operations to server will be timed-out while thousands of sessions expired same time

2013-03-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608802#comment-13608802
 ] 

Jacky007 commented on ZOOKEEPER-1669:
-

I think it is. In one of our environments, there are tens of thousands of 
connections and 300~500 close-session operations per second (these clients create 
a connection for a read and close it immediately). The code you described 
significantly affects performance.

> Operations to server will be timed-out while thousands of sessions expired 
> same time
> 
>
> Key: ZOOKEEPER-1669
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1669
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.5
>Reporter: tokoot
>  Labels: performance
>
> If there are thousands of clients, and most of them disconnect with server 
> same time(client restarted or servers partitioned with clients), the server 
> will busy to close those "connections" and become unavailable. The problem is 
> in following:
>   private void closeSessionWithoutWakeup(long sessionId) {
>   HashSet cnxns;
>   synchronized (this.cnxns) {
>   cnxns = (HashSet)this.cnxns.clone();  // other 
> thread will block because of here
>   }
>   ...
>   }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1669) Operations to server will be timed-out while thousands of sessions expired same time

2013-03-15 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603262#comment-13603262
 ] 

Jacky007 commented on ZOOKEEPER-1669:
-

We have paid for this. But the fix is simple: hash the connection when the session 
is created, and look it up in the hash when closing it. :)
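
A small sketch of that fix (names are illustrative, not the actual NIOServerCnxnFactory code): index each connection by session id when the session is created, so closing a session is a single map lookup instead of cloning the whole connection set.
{quote}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative session-id index; C stands in for the server's connection type.
class SessionIndex<C> {
    private final ConcurrentMap<Long, C> bySessionId = new ConcurrentHashMap<Long, C>();

    void register(long sessionId, C cnxn) {   // when the session is created
        bySessionId.put(sessionId, cnxn);
    }

    C removeForClose(long sessionId) {        // when the session is closed
        return bySessionId.remove(sessionId);
    }
}
{quote}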

> Operations to server will be timed-out while thousands of sessions expired 
> same time
> 
>
> Key: ZOOKEEPER-1669
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1669
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.5
>Reporter: tokoot
>  Labels: performance
>
> If there are thousands of clients, and most of them disconnect with server 
> same time(client restarted or servers partitioned with clients), the server 
> will busy to close those "connections" and become unavailable. The problem is 
> in following:
>   private void closeSessionWithoutWakeup(long sessionId) {
>   HashSet cnxns;
>   synchronized (this.cnxns) {
>   cnxns = (HashSet)this.cnxns.clone();  // other 
> thread will block because of here
>   }
>   ...
>   }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Description: 
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

scenario:
1) Client a set a data watch on /d, then disconnect, client b delete /d and 
create it again. When client a reestablish to zk, it will receive a NodeCreated 
rather than a NodeDataChanged.
2) Client a set a exists watch on /e(not exist), then disconnect, client b 
create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
rather than a NodeCreated.



  was:
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

scenario:
1) Client a set a data watch on /d, then disconnect, client b delete /d and 
create it again. When client a reestablish to zk, it will receive a NodeCreated.
2) Client a set a exists watch on /e(not exist), then disconnect, client b 
create /e. When client a reestablish to zk, it will receive a NodeDataChanged.




> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated rather than a NodeDataChanged.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged 
> rather than a NodeCreated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Description: 
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

scenario:
1) Client a set a data watch on /d, then disconnect, client b delete /d and 
create it again. When client a reestablish to zk, it will receive a NodeCreated.
2) Client a set a exists watch on /e(not exist), then disconnect, client b 
create /e. When client a reestablish to zk, it will receive a NodeDataChanged.



  was:
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

scenario:
1) Client a set a data watch on /d, then disconnect, client b delete /d and 
create it again. When client a reestablish to zk, it will receive a NodeCreated.
2) Client a set a exists watch on /e(not exist), then disconnect, client b 
create /d. When client a reestablish to zk, it will receive a NodeDataChanged.




> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /e. When client a reestablish to zk, it will receive a NodeDataChanged.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Description: 
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

scenario:
1) Client a set a data watch on /d, then disconnect, client b delete /d and 
create it again. When client a reestablish to zk, it will receive a NodeCreated.
2) Client a set a exists watch on /e(not exist), then disconnect, client b 
create /d. When client a reestablish to zk, it will receive a NodeDataChanged.



  was:
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)


> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)
> scenario:
> 1) Client a set a data watch on /d, then disconnect, client b delete /d and 
> create it again. When client a reestablish to zk, it will receive a 
> NodeCreated.
> 2) Client a set a exists watch on /e(not exist), then disconnect, client b 
> create /d. When client a reestablish to zk, it will receive a NodeDataChanged.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event isn't handled correctly when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Summary: Watch event isn't handled correctly when a client reestablish to a 
server  (was: Watch event is handled incorrectly when a client reestablish to a 
server)

> Watch event isn't handled correctly when a client reestablish to a server
> -
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event is handled incorrectly when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Summary: Watch event is handled incorrectly when a client reestablish to a 
server  (was: Watch event is handled incorrect when a client reestablish to a 
server)

> Watch event is handled incorrectly when a client reestablish to a server
> 
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event is handled incorrect when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Attachment: ZOOKEEPER-1667.patch

> Watch event is handled incorrect when a client reestablish to a server
> --
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1667.patch
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1667) Watch event is handled incorrect when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1667:


Description: 
When a client reestablish to a server, it will send the watches which have not 
been triggered. But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

  was:
When a client reestablish to a server, it will send the watches already set. 
But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)


> Watch event is handled incorrect when a client reestablish to a server
> --
>
> Key: ZOOKEEPER-1667
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.6, 3.4.5
>Reporter: Jacky007
>Priority: Blocker
> Fix For: 3.5.0
>
>
> When a client reestablish to a server, it will send the watches which have 
> not been triggered. But the code in DataTree does not handle it correctly.
> It is obvious, we just do not notice it :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1667) Watch event is handled incorrect when a client reestablish to a server

2013-03-14 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1667:
---

 Summary: Watch event is handled incorrect when a client 
reestablish to a server
 Key: ZOOKEEPER-1667
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1667
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5, 3.3.6
Reporter: Jacky007
Priority: Blocker
 Fix For: 3.5.0


When a client reestablish to a server, it will send the watches already set. 
But the code in DataTree does not handle it correctly.

It is obvious, we just do not notice it :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2013-02-05 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571162#comment-13571162
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

[~thawan] It looks great. A question not related to this patch: why does zk choose 
to clear the database and load the snapshot again, rather than keep it in memory 
and reuse it, when the learner or leader fails?

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Assignee: Thawan Kooburat
>Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: case.patch, ZOOKEEPER-1549-3.4.patch, 
> ZOOKEEPER-1549-learner.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1608) Add support for key-value store as optional storage engine

2013-01-24 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561585#comment-13561585
 ] 

Jacky007 commented on ZOOKEEPER-1608:
-

@Thawan Kooburat It's a great idea. In one of our environments, the size of the 
snapshot is ~10 GB. We use a consistent cache on the client side, so reads rarely 
reach zk and writes are rare as well. The only thing to worry about is that it may 
take several minutes to recover from a leader failure (a snapshot load and a write).

Please let me know how I can help with this?
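As a rough, hedged back-of-envelope for that recovery time (the throughput figure below is an assumption, not a measurement):
{noformat}
// Hypothetical numbers: a ~10 GB snapshot, and ~100 MB/s effective throughput
// for both loading (read + deserialize) and writing a fresh snapshot.
public class RecoveryEstimate {
    public static void main(String[] args) {
        double snapshotGb = 10.0;
        double mbPerSec = 100.0;                        // assumed, not measured
        double loadSec = snapshotGb * 1024 / mbPerSec;  // ~102 s to load
        double writeSec = loadSec;                      // ~102 s to write it back out
        System.out.printf("approx recovery: %.0f s (~%.1f min), before txn replay%n",
                loadSec + writeSec, (loadSec + writeSec) / 60);
    }
}
{noformat}
which is roughly consistent with the "several minutes" figure above.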

> Add support for key-value store as optional storage engine
> --
>
> Key: ZOOKEEPER-1608
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1608
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3
>Reporter: Thawan Kooburat
>
> Problem:
> 1. ZooKeeper need to load the entire dataset into its memory. So the total 
> data size and number of znode are limited by the amount of available memory.
> 2. We want to minimize ZooKeeper down time, but found that it is bound by 
> snapshot loading and writing time. The bigger the database, the longer it 
> take for the system to recover. The worst case is that if the data size grow 
> too large and initLimit wasn't update accordingly, the quorum won't form 
> after failure.  
> Implementation: (still work in progress)
> 1. Create a new type of DataTree that supported key-value storage as backing 
> store. Our current candidate backing store is Oracle's Berkeley DB Java 
> Edition
> 2. There is no need to use snapshot facility for this type of DataTree. Since 
> doing a sync write of lastProcessedZxid into the backing store is the same as 
> taking a snapshot. However, the system still use txnlog as before. The system 
> can be considered as having only a single snapshot. It has to rely on backing 
> store to detect data corruption and recovery.  
> 3. There is no need to do any per-node locking. CommitProcessor 
> (ZOOKEEPER-1505) prevents concurrent read and write to reach the DataTree. 
> The DataTree is also accessed by PrepRequestProcessor (to create 
> ChangeRecord), but I believe that read and write to the same znode cannot 
> happens concurrently.
> 4. There are 3 types of data which is required to be persisted in backing 
> store: ACLs, znodes and sessions. However, we also store other data reduce 
> oDataTree initialization time or serialization cost such as list of node's 
> children and list of ephemeral node. 
> 5. Each Zookeeper's txn may translate into multiple actions on the DataTree. 
> For example, creating a node may result in  AddingZNODE, AddingChildren and 
> AddingEphemeralNode. However, as a long as these operations are idempotent, 
> there is no need to group them into a transaction. So txns can be replayed on 
> DataTree without corrupting the data. This also means that the system don't 
> need key-value store that support transaction semantic. Currently, only 
> operations related to quota break this assumption because it use increment 
> operation.
> 6. SNAP protocol is supported so the ensemble can be upgraded online. In the 
> future we may add extend SNAP protocol to send raw data file in order to save 
> CPU cost when sending large database.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1127) Auth completion are called for every registered auth, and auths are never removed from the auth list. (even after they are processed).

2012-12-25 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539487#comment-13539487
 ] 

Jacky007 commented on ZOOKEEPER-1127:
-

It's not a problem. We should close the issue to avoid misleading people.

> Auth completion are called for every registered auth, and auths are never 
> removed from the auth list. (even after they are processed).
> --
>
> Key: ZOOKEEPER-1127
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1127
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.3
>Reporter: Dheeraj Agrawal
>Priority: Critical
>
> When we get a auth response, every time we process any auth_response, we call 
> ALL the auth completions (might be registered by different add_auth_info 
> calls). we should be calling only the one that the request came from? I guess 
> we dont know for which request the response corresponds to? If the requests 
> are processed in FIFO and response are got in order then may be we can figure 
> out which add_auth info request the response corresponds to.
> Also , we never remove entries from the auth_list
> Also the logging is misleading. 
> 
>   1206 if (rc) {
>1207 LOG_ERROR(("Authentication scheme %s failed. Connection 
> closed.",
>1208zh->auth_h.auth->scheme));
>1209 }
>1210 else {
>1211 LOG_INFO(("Authentication scheme %s succeeded", 
> zh->auth_h.auth->scheme));
> 
> If there are multiple auth_info in the auth_list , we always print 
> success/failure for ONLY the first one. So if I had two auths for scehmes, 
> ABCD and EFGH and my auth scheme EFGH failed, the logs will still say ABCD 
> failed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1147) Add support for local sessions

2012-12-25 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539402#comment-13539402
 ] 

Jacky007 commented on ZOOKEEPER-1147:
-

Did I misunderstand your description? Please let me know, [~thawan].

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about a 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. Majority of these clients are 
> read only - ie they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have a full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration is handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at the connect time which kind of session do they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

2012-12-23 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539176#comment-13539176
 ] 

Jacky007 commented on ZOOKEEPER-1561:
-

It was fixed in ZOOKEEPER-1560.

> Zookeeper client may hang on a server restart
> -
>
> Key: ZOOKEEPER-1561
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1561
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.5.0
>Reporter: Jacky007
> Fix For: 3.5.0
>
>
> In the doIO method of ClientCnxnSocketNIO
> {noformat}
>  if (p != null) {
> outgoingQueue.removeFirstOccurrence(p);
> updateLastSend();
> if ((p.requestHeader != null) &&
> (p.requestHeader.getType() != OpCode.ping) &&
> (p.requestHeader.getType() != OpCode.auth)) {
> p.requestHeader.setXid(cnxn.getXid());
> }
> p.createBB();
> ByteBuffer pbb = p.bb;
> sock.write(pbb);
> if (!pbb.hasRemaining()) {
> sentCount++;
> if (p.requestHeader != null
> && p.requestHeader.getType() != OpCode.ping
> && p.requestHeader.getType() != OpCode.auth) {
> pending.add(p);
> }
> }
> {noformat}
> When the sock.write(pbb) method throws an exception, the packet will not be 
> cleanup(not in outgoingQueue nor in pendingQueue). If the client wait for it, 
> it will wait forever...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

2012-12-23 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 resolved ZOOKEEPER-1561.
-

  Resolution: Duplicate
Release Note: It is fixed in ZOOKEEPER-1560.

> Zookeeper client may hang on a server restart
> -
>
> Key: ZOOKEEPER-1561
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1561
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.5.0
>Reporter: Jacky007
> Fix For: 3.5.0
>
>
> In the doIO method of ClientCnxnSocketNIO
> {noformat}
>  if (p != null) {
> outgoingQueue.removeFirstOccurrence(p);
> updateLastSend();
> if ((p.requestHeader != null) &&
> (p.requestHeader.getType() != OpCode.ping) &&
> (p.requestHeader.getType() != OpCode.auth)) {
> p.requestHeader.setXid(cnxn.getXid());
> }
> p.createBB();
> ByteBuffer pbb = p.bb;
> sock.write(pbb);
> if (!pbb.hasRemaining()) {
> sentCount++;
> if (p.requestHeader != null
> && p.requestHeader.getType() != OpCode.ping
> && p.requestHeader.getType() != OpCode.auth) {
> pending.add(p);
> }
> }
> {noformat}
> When the sock.write(pbb) method throws an exception, the packet will not be 
> cleanup(not in outgoingQueue nor in pendingQueue). If the client wait for it, 
> it will wait forever...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1147) Add support for local sessions

2012-12-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537719#comment-13537719
 ] 

Jacky007 commented on ZOOKEEPER-1147:
-

Yeah, this is the best way to implement this. 
{quote}
1. There is no new client-side interface. All session are created as local 
session by default (no CreateSession request is sent to the leader). 
2. When client try to create an ephemeral node. The follower/leader will 
upgrade this session to a global session by issuing CreateSession request 
before issuing create ephemeral node. 
3. The client retains the same sessionId when upgrading from local to global 
session. Each server use serverId as sessionId prefix.
{quote}
But I think it has the problem I have mentioned.
{quote}
since that ephemeral node will eventually get removed after the session timeout
{quote}

The ephemeral node /eph_1 will eventually get removed, but A will get a session 
expiry first. This is different from the original semantics, which say that when I 
get a session expiry the ephemeral nodes have already been removed.


> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about a 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. Majority of these clients are 
> read only - ie they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have a full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration is handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at the connect time which kind of session do they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-12-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537709#comment-13537709
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

I think what Flavio says is the same as what Thawan Kooburat says.
{quote}
Before leader start serving request, there is no need for the leader to replay 
its txnlog at all. It only need to know the latest zxid in order to be elected 
as a leader and synchronize with the follower. The leader need to send 
uncommitted txns before sending NEWLEADER packet. Even if a follower request a 
snapshot, it won't get uncommitted txn as part of the snapshot. Then, the 
follower need to log the steam of txns between DIFF/SNAP and NEWLEADER to disk 
before sending ACK back to the leader. When the leader receive majority of ACK, 
then it can apply the uncommitted txn and start serving request.
{quote}
The problem is that the leader can only send an old snapshot to the follower 
(related to snapShot), or we should save lastProcessedZxid.


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-12-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537708#comment-13537708
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

{quote}

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-12-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537706#comment-13537706
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

The problem is that when a follower receives a snapshot from the leader, it must 
ack to the leader only after taking a snapshot (it has no txnlog covering it).

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1147) Add support for local sessions

2012-12-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536939#comment-13536939
 ] 

Jacky007 commented on ZOOKEEPER-1147:
-

{quote}
1. zoo_init will take a flag indicating delayed persistent session creation.
2. Server will look at this flag and create a session that is local to the 
server and not send a request to the leader.
3. Server will expose a new operation - upgradeToPersistent - that will upgrade 
a local session to a persistent session. This is the first time that the leader 
will become aware of this session (assuming the client is connected to a 
follower)
4. If there is a zoo_create with ephemeral node, the client will send a 
upgradeToPersistent request to the server before sending the create ephemeral 
node request. This request would be async, so I don't expect it to delay the 
creation of ephemeral node much.
{quote}

There is a problem I can see.
If A is a local session and A now wants to create an ephemeral node: A sends 
upgradeToPersistent (async) and create /eph_1 to server 1, then A is disconnected 
from server 1 and tries to renew its session on server 2. If the leader receives 
the messages in the order renew, upgradeToPersistent, create /eph_1, then A will 
first get a session timeout, yet /eph_1 still gets created in the end, which is 
unexpected.

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about a 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. Majority of these clients are 
> read only - ie they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have a full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration is handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at the connect time which kind of session do they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1147) Add support for local sessions

2012-12-17 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534673#comment-13534673
 ] 

Jacky007 commented on ZOOKEEPER-1147:
-

This is great; creating a session is costly in ZooKeeper.
Is there any progress on this feature?

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about a 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. Majority of these clients are 
> read only - ie they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have a full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration is handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at the connect time which kind of session do they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1493) C Client: zookeeper_process doesn't invoke completion callback if zookeeper_close has been called

2012-11-21 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501829#comment-13501829
 ] 

Jacky007 commented on ZOOKEEPER-1493:
-

There is also a memory leak from a missing free_buffer(bptr) call.

> C Client: zookeeper_process doesn't invoke completion callback if 
> zookeeper_close has been called
> -
>
> Key: ZOOKEEPER-1493
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1493
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.3, 3.3.5
>Reporter: Michi Mutsuzaki
>Assignee: Michi Mutsuzaki
> Fix For: 3.3.6, 3.4.4, 3.5.0
>
> Attachments: ZOOKEEPER-1493_3_3.patch, ZOOKEEPER-1493_3_4.patch, 
> ZOOKEEPER-1493.patch
>
>
> In ZOOKEEPER-804, we added a check in zookeeper_process() to see if 
> zookeeper_close() has been called. This was to avoid calling assert(cptr) on 
> a NULL pointer, as dequeue_completion() returns NULL if the sent_requests 
> queue has been cleared by free_completion() from zookeeper_close(). However, 
> we should still call the completion if it is not NULL. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1493) C Client: zookeeper_process doesn't invoke completion callback if zookeeper_close has been called

2012-11-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501129#comment-13501129
 ] 

Jacky007 commented on ZOOKEEPER-1493:
-

this patch also solves a potential hang in zookeeper_close() :-)

> C Client: zookeeper_process doesn't invoke completion callback if 
> zookeeper_close has been called
> -
>
> Key: ZOOKEEPER-1493
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1493
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.3, 3.3.5
>Reporter: Michi Mutsuzaki
>Assignee: Michi Mutsuzaki
> Fix For: 3.3.6, 3.4.4, 3.5.0
>
> Attachments: ZOOKEEPER-1493_3_3.patch, ZOOKEEPER-1493_3_4.patch, 
> ZOOKEEPER-1493.patch
>
>
> In ZOOKEEPER-804, we added a check in zookeeper_process() to see if 
> zookeeper_close() has been called. This was to avoid calling assert(cptr) on 
> a NULL pointer, as dequeue_completion() returns NULL if the sent_requests 
> queue has been cleared by free_completion() from zookeeper_close(). However, 
> we should still call the completion if it is not NULL. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1493) C Client: zookeeper_process doesn't invoke completion callback if zookeeper_close has been called

2012-11-20 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501128#comment-13501128
 ] 

Jacky007 commented on ZOOKEEPER-1493:
-

this patch also solves a potential hang in zookeeper_close() :-)

> C Client: zookeeper_process doesn't invoke completion callback if 
> zookeeper_close has been called
> -
>
> Key: ZOOKEEPER-1493
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1493
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.3, 3.3.5
>Reporter: Michi Mutsuzaki
>Assignee: Michi Mutsuzaki
> Fix For: 3.3.6, 3.4.4, 3.5.0
>
> Attachments: ZOOKEEPER-1493_3_3.patch, ZOOKEEPER-1493_3_4.patch, 
> ZOOKEEPER-1493.patch
>
>
> In ZOOKEEPER-804, we added a check in zookeeper_process() to see if 
> zookeeper_close() has been called. This was to avoid calling assert(cptr) on 
> a NULL pointer, as dequeue_completion() returns NULL if the sent_requests 
> queue has been cleared by free_completion() from zookeeper_close(). However, 
> we should still call the completion if it is not NULL. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474981#comment-13474981
 ] 

Jacky007 commented on ZOOKEEPER-1560:
-

I think this would work for both 1560 and 1561.
{noformat}
if (p != null) {
    updateLastSend();
    if ((p.requestHeader != null) &&
            (p.requestHeader.getType() != OpCode.ping) &&
            (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
--->    while (pbb.hasRemaining()) sock.write(pbb);
--->    outgoingQueue.removeFirstOccurrence(p);
    sentCount++;
    if (p.requestHeader != null
            && p.requestHeader.getType() != OpCode.ping
            && p.requestHeader.getType() != OpCode.auth) {
        pending.add(p);
    }
}
{noformat}
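One hedged alternative sketch (not necessarily what gets committed, and simplified to take the head of the queue): instead of busy-looping on sock.write, leave the packet at the head of outgoingQueue and only remove it once its buffer is fully drained, serializing and assigning the xid exactly once. This is only a fragment; it assumes the surrounding doIO() state from the snippet above (outgoingQueue, pending, sock, cnxn, Packet, OpCode).
{noformat}
Packet p = outgoingQueue.peekFirst();            // look at the head, do not remove yet
if (p != null) {
    updateLastSend();
    if (p.bb == null) {                          // first attempt for this packet
        if (p.requestHeader != null
                && p.requestHeader.getType() != OpCode.ping
                && p.requestHeader.getType() != OpCode.auth) {
            p.requestHeader.setXid(cnxn.getXid());   // set the xid exactly once
        }
        p.createBB();                            // serialize exactly once
    }
    sock.write(p.bb);                            // a partial write is fine here
    if (!p.bb.hasRemaining()) {                  // fully flushed to the socket
        sentCount++;
        outgoingQueue.removeFirstOccurrence(p);  // only now drop it from the queue
        if (p.requestHeader != null
                && p.requestHeader.getType() != OpCode.ping
                && p.requestHeader.getType() != OpCode.auth) {
            pending.add(p);                      // wait for the server's response
        }
    }
}
{noformat}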

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}} The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-10-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474737#comment-13474737
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

A B C D E: A proposes v, B and C accept it, then A and B die. C is elected as the 
new leader. We cannot discard the proposal v.

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-107) Allow dynamic changes to server cluster membership

2012-10-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473926#comment-13473926
 ] 

Jacky007 commented on ZOOKEEPER-107:


Opened a new JIRA, ZOOKEEPER-1561, for it.

> Allow dynamic changes to server cluster membership
> --
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: SimpleAddition.rtf, zkreconfig-usenixatc-final.pdf, 
> ZOOKEEPER-107-1-Mar.patch, ZOOKEEPER-107-20-July.patch, 
> ZOOKEEPER-107-21-July.patch, ZOOKEEPER-107-22-Apr.patch, 
> ZOOKEEPER-107-23-SEP.patch, ZOOKEEPER-107-28-Feb.patch, 
> ZOOKEEPER-107-28-Feb.patch, ZOOKEEPER-107-29-Feb.patch, 
> ZOOKEEPER-107-3-Oct.patch, ZOOKEEPER-107-Aug-20.patch, 
> ZOOKEEPER-107-Aug-20-ver1.patch, ZOOKEEPER-107-Aug-25.patch, 
> zookeeper-3.4.0.jar, zookeeper-dev-fatjar.jar, 
> zookeeper-reconfig-sep11.patch, zookeeper-reconfig-sep12.patch, 
> zoo_replicated1.cfg, zoo_replicated1.members
>
>
> Currently cluster membership is statically defined, adding/removing hosts 
> to/from the server cluster dynamically needs to be supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-107) Allow dynamic changes to server cluster membership

2012-10-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473924#comment-13473924
 ] 

Jacky007 commented on ZOOKEEPER-107:


{quote}
Do the reconfiguration tests pass for you when this bug is solved ? I wonder 
why 
of all tests the reconfiguration tests were the ones hitting this bug... maybe 
because of 
the many leader/follower shutdowns that they do, not sure.
{quote}

Yes, all tests pass here.

If the server is restarting when we send a getReconfig(getData) request,  the 
bug will happen.

> Allow dynamic changes to server cluster membership
> --
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: SimpleAddition.rtf, zkreconfig-usenixatc-final.pdf, 
> ZOOKEEPER-107-1-Mar.patch, ZOOKEEPER-107-20-July.patch, 
> ZOOKEEPER-107-21-July.patch, ZOOKEEPER-107-22-Apr.patch, 
> ZOOKEEPER-107-23-SEP.patch, ZOOKEEPER-107-28-Feb.patch, 
> ZOOKEEPER-107-28-Feb.patch, ZOOKEEPER-107-29-Feb.patch, 
> ZOOKEEPER-107-3-Oct.patch, ZOOKEEPER-107-Aug-20.patch, 
> ZOOKEEPER-107-Aug-20-ver1.patch, ZOOKEEPER-107-Aug-25.patch, 
> zookeeper-3.4.0.jar, zookeeper-dev-fatjar.jar, 
> zookeeper-reconfig-sep11.patch, zookeeper-reconfig-sep12.patch, 
> zoo_replicated1.cfg, zoo_replicated1.members
>
>
> Currently cluster membership is statically defined, adding/removing hosts 
> to/from the server cluster dynamically needs to be supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

2012-10-11 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1561:
---

 Summary: Zookeeper client may hang on a server restart
 Key: ZOOKEEPER-1561
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1561
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.5.0
Reporter: Jacky007
 Fix For: 3.5.0


In the doIO method of ClientCnxnSocketNIO
{noformat}
if (p != null) {
    outgoingQueue.removeFirstOccurrence(p);
    updateLastSend();
    if ((p.requestHeader != null) &&
            (p.requestHeader.getType() != OpCode.ping) &&
            (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
    sock.write(pbb);
    if (!pbb.hasRemaining()) {
        sentCount++;
        if (p.requestHeader != null
                && p.requestHeader.getType() != OpCode.ping
                && p.requestHeader.getType() != OpCode.auth) {
            pending.add(p);
        }
    }
{noformat}
When the sock.write(pbb) method throws an exception, the packet will not be 
cleaned up (it is in neither outgoingQueue nor pendingQueue). If the client waits 
for it, it will wait forever...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-10-11 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473912#comment-13473912
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

{quote}
Then, we just need to truncate the leader's txn log to that zxid before 
starting it.
{quote}
We can not do that for consistency issues.

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-107) Allow dynamic changes to server cluster membership

2012-10-10 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473117#comment-13473117
 ] 

Jacky007 commented on ZOOKEEPER-107:


Hi Alexander Shraer, it is a bug in the Java client.
In the doIO method of ClientCnxnSocketNIO:
{noformat}
if (p != null) {
    outgoingQueue.removeFirstOccurrence(p);
    updateLastSend();
    if ((p.requestHeader != null) &&
            (p.requestHeader.getType() != OpCode.ping) &&
            (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
    sock.write(pbb);
    if (!pbb.hasRemaining()) {
        sentCount++;
        if (p.requestHeader != null
                && p.requestHeader.getType() != OpCode.ping
                && p.requestHeader.getType() != OpCode.auth) {
            pending.add(p);
        }
    }
{noformat}

Removing the packet before the sock.write() line can cause some requests to wait forever. We should remove it after the sock.write() line.
Should we open a new jira for this?
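
A rough sketch of that reordering (untested, just to make the proposal concrete; the exception path would still need to conclude the packet so its waiter is released, as in the model above):
{noformat}
p.createBB();
ByteBuffer pbb = p.bb;
sock.write(pbb);                            // may throw; packet is still in outgoingQueue
if (!pbb.hasRemaining()) {
    outgoingQueue.removeFirstOccurrence(p); // remove only once the write has gone through
    updateLastSend();
    sentCount++;
    if (p.requestHeader != null
            && p.requestHeader.getType() != OpCode.ping
            && p.requestHeader.getType() != OpCode.auth) {
        pending.add(p);
    }
}
{noformat}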

> Allow dynamic changes to server cluster membership
> --
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: SimpleAddition.rtf, zkreconfig-usenixatc-final.pdf, 
> ZOOKEEPER-107-1-Mar.patch, ZOOKEEPER-107-20-July.patch, 
> ZOOKEEPER-107-21-July.patch, ZOOKEEPER-107-22-Apr.patch, 
> ZOOKEEPER-107-23-SEP.patch, ZOOKEEPER-107-28-Feb.patch, 
> ZOOKEEPER-107-28-Feb.patch, ZOOKEEPER-107-29-Feb.patch, 
> ZOOKEEPER-107-3-Oct.patch, ZOOKEEPER-107-Aug-20.patch, 
> ZOOKEEPER-107-Aug-20-ver1.patch, ZOOKEEPER-107-Aug-25.patch, 
> zookeeper-3.4.0.jar, zookeeper-dev-fatjar.jar, 
> zookeeper-reconfig-sep11.patch, zookeeper-reconfig-sep12.patch, 
> zoo_replicated1.cfg, zoo_replicated1.members
>
>
> Currently cluster membership is statically defined, adding/removing hosts 
> to/from the server cluster dynamically needs to be supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-107) Allow dynamic changes to server cluster membership

2012-10-08 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472153#comment-13472153
 ] 

Jacky007 commented on ZOOKEEPER-107:


When this happens, the servers are inconsistent. 
{noformat}
~$echo srvr | nc 127.0.0.1 11238
Zookeeper version: 3.5.0-1395456, built on 10/08/2012 06:12 GMT
Latency min/avg/max: 0/0/1
Received: 25
Sent: 24
Connections: 2
Outstanding: 0
Zxid: 0x2
Mode: leader
Node count: 6
{noformat}
{noformat}
~$echo srvr | nc 127.0.0.1 11241
Zookeeper version: 3.5.0-1395456, built on 10/08/2012 06:12 GMT
Latency min/avg/max: 0/0/1
Received: 25
Sent: 24
Connections: 2
Outstanding: 0
Zxid: 0x1000b
Mode: follower
Node count: 6
{noformat}
{noformat}
~$echo srvr | nc 127.0.0.1 11244
Zookeeper version: 3.5.0-1395456, built on 10/08/2012 06:12 GMT
Latency min/avg/max: 0/0/1
Received: 25
Sent: 24
Connections: 2
Outstanding: 0
Zxid: 0x1000b
Mode: follower
Node count: 6
{noformat}

> Allow dynamic changes to server cluster membership
> --
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: SimpleAddition.rtf, zkreconfig-usenixatc-final.pdf, 
> ZOOKEEPER-107-1-Mar.patch, ZOOKEEPER-107-20-July.patch, 
> ZOOKEEPER-107-21-July.patch, ZOOKEEPER-107-22-Apr.patch, 
> ZOOKEEPER-107-23-SEP.patch, ZOOKEEPER-107-28-Feb.patch, 
> ZOOKEEPER-107-28-Feb.patch, ZOOKEEPER-107-29-Feb.patch, 
> ZOOKEEPER-107-3-Oct.patch, ZOOKEEPER-107-Aug-20.patch, 
> ZOOKEEPER-107-Aug-20-ver1.patch, ZOOKEEPER-107-Aug-25.patch, 
> zookeeper-3.4.0.jar, zookeeper-dev-fatjar.jar, 
> zookeeper-reconfig-sep11.patch, zookeeper-reconfig-sep12.patch, 
> zoo_replicated1.cfg, zoo_replicated1.members
>
>
> Currently cluster membership is statically defined, adding/removing hosts 
> to/from the server cluster dynamically needs to be supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-107) Allow dynamic changes to server cluster membership

2012-10-08 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472151#comment-13472151
 ] 

Jacky007 commented on ZOOKEEPER-107:


The result is the same in my environment. As I see it, in the testRemoveAddOne method:
{noformat}
String configStr = reconfig(zk1, null, leavingServers, null, -1);
testServerHasConfig(zk2, null, leavingServers);
testNormalOperation(zk2, zk1);
{noformat}
When i = 1 in the for loop, testServerHasConfig will probably hang at zk.getConfig(false, new Stat()).
It is very strange and hard to understand.


> Allow dynamic changes to server cluster membership
> --
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: SimpleAddition.rtf, zkreconfig-usenixatc-final.pdf, 
> ZOOKEEPER-107-1-Mar.patch, ZOOKEEPER-107-20-July.patch, 
> ZOOKEEPER-107-21-July.patch, ZOOKEEPER-107-22-Apr.patch, 
> ZOOKEEPER-107-23-SEP.patch, ZOOKEEPER-107-28-Feb.patch, 
> ZOOKEEPER-107-28-Feb.patch, ZOOKEEPER-107-29-Feb.patch, 
> ZOOKEEPER-107-3-Oct.patch, ZOOKEEPER-107-Aug-20.patch, 
> ZOOKEEPER-107-Aug-20-ver1.patch, ZOOKEEPER-107-Aug-25.patch, 
> zookeeper-3.4.0.jar, zookeeper-dev-fatjar.jar, 
> zookeeper-reconfig-sep11.patch, zookeeper-reconfig-sep12.patch, 
> zoo_replicated1.cfg, zoo_replicated1.members
>
>
> Currently cluster membership is statically defined, adding/removing hosts 
> to/from the server cluster dynamically needs to be supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1559) Learner should not snapshot uncommitted state

2012-10-08 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472070#comment-13472070
 ] 

Jacky007 commented on ZOOKEEPER-1559:
-

{quote}
Base on the current implementation, the leader start up and treat every txns in 
its txnlog as committed. It add those txn into committedLog and the follower 
get proposal/commit packet pair for those txns as part of the sync-up. Let me 
know what you think, this problem is quite complicate if we don't want to break 
compatibility.
{quote}
agree.

We have had some discussion in ZOOKEEPER-1549. I think it is quite complicated even if we don't consider compatibility.

> Learner should not snapshot uncommitted state
> -
>
> Key: ZOOKEEPER-1559
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1559
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: quorum
>Reporter: Flavio Junqueira
>
> The code in Learner.java is a bit entangled for backward compatibility 
> reasons. We need to make sure that we can remove the calls to take a snapshot 
> without breaking it. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-10-07 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471397#comment-13471397
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

Hi Flavio Junqueira, I'm sorry, I had a long holiday and didn't reply.
https://issues.apache.org/jira/browse/ZOOKEEPER-1558 looks good to me.
But https://issues.apache.org/jira/browse/ZOOKEEPER-1559 seems rather complicated. If we move the zk.takeSnapshot() call to UPTODATE, it will break Zab 1.0 Phase 2.4.

The Learner should not snapshot uncommitted state before it has quorum support, but if the Learner does not snapshot, the uncommitted state will never gain quorum support since it has no persistent storage. If the Learner does snapshot, then it needs something like deleting unwanted snapshots, and we are back where we started.

Any ideas?
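
To make the trade-off concrete, here is a very simplified outline of the follower's sync loop (this is an assumed sketch, not the actual Learner.java code) showing the two candidate snapshot points being discussed:
{noformat}
// Assumed, simplified outline of the two placements.
switch (packetType) {
case NEWLEADER:
    // 3.4-style: snapshot here, before ACKing.
    // The state persisted here may include uncommitted txns (the dirty snapshot).
    // zk.takeSnapshot();
    writePacket(ACK);
    break;
case UPTODATE:
    // Proposed alternative: snapshot only once the leader says we are up to date.
    // But the ACK above was then sent without durable state,
    // which is what conflicts with Zab 1.0 Phase 2.4.
    zk.takeSnapshot();
    break;
}
{noformat}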



> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-28 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466111#comment-13466111
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

Sorry, the code is from the 3.3.x branch. Could you tell me which jira covers the above change?

In 3.4.3, taking a snapshot when the prospective follower receives a NEWLEADER message does not sound right.
I think we should do it when the prospective follower receives an UPTODATE message, since only then does the data have quorum support.

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-27 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464555#comment-13464555
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

Proposals which have quorum support are normally written to the in-memory database. When a server starts up, all proposals in its log are written to memory (this may be the root of the problem); correctness relies on the assumption that proposals without quorum support will be truncated before the server starts serving.

If we cannot guarantee that all data in memory has quorum support during the start-up procedure, we should guarantee that a dirty snapshot is never taken.

Which means taking a snapshot when the prospective follower receives an UPTODATE message is also wrong.

When the prospective leader has a quorum of ACKs for UPTODATE, it can take a snapshot. But the prospective follower will never know such things during the start-up procedure.

So the prospective follower cannot take a snapshot here.

{noformat}
case Leader.UPTODATE:
zk.takeSnapshot();
{noformat}

Is it really indispensable here?

What is the motivation for taking a snapshot in the lead() method?


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-26 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463720#comment-13463720
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

Flavio Junqueira, any suggestions on this?

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-13 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Attachment: case.patch

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-13 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Attachment: (was: case.patch)

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-13 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Attachment: case.patch

a testcase

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
> Attachments: case.patch
>
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-12 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454682#comment-13454682
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

Thanks to Flavio for commenting. I agree with Flavio: deleting a snapshot is too dangerous to be an option. There should be another solution.
Step 1
74 is in A's transaction logs.
Step 2
A is the new leader, and it will execute the following code.
{noformat}
void lead() throws IOException, InterruptedException {
    self.end_fle = System.currentTimeMillis();
    LOG.info("LEADING - LEADER ELECTION TOOK - " +
            (self.end_fle - self.start_fle));
    self.start_fle = 0;
    self.end_fle = 0;

    zk.registerJMX(new LeaderBean(this, zk), self.jmxLocalPeerBean);

    try {
        self.tick = 0;
        zk.loadData();
{noformat}
Then A will load its snapshot and committedLog.
{noformat}
public void loadData() throws IOException, InterruptedException {
    setZxid(zkDb.loadDataBase());
    // Clean up dead sessions
    LinkedList<Long> deadSessions = new LinkedList<Long>();
    for (Long session : zkDb.getSessions()) {
        if (zkDb.getSessionWithTimeOuts().get(session) == null) {
            deadSessions.add(session);
        }
    }
    zkDb.setDataTreeInit(true);
    for (long session : deadSessions) {
        // XXX: Is lastProcessedZxid really the best thing to use?
        killSession(session, zkDb.getDataTreeLastProcessedZxid());
    }

    // Make a clean snapshot
    takeSnapshot();
}
{noformat}
When A calls takeSnapshot(), 74 is in it (if A dies after that, B will never know about it).
When A loads the database,
{noformat}
public void loadData() throws IOException, InterruptedException {
setZxid(zkDb.loadDataBase());
{noformat}
it will restore the database from snapshots and transaction logs,
{noformat}
long zxid = snapLog.restore(dataTree,sessionsWithTimeouts,listener);
{noformat}
{noformat}
try {
    processTransaction(hdr, dt, sessions, itr.getTxn());
} catch (KeeperException.NoNodeException e) {
    throw new IOException("Failed to process transaction type: " +
            hdr.getType() + " error: " + e.getMessage(), e);
}
listener.onTxnLoaded(hdr, itr.getTxn());
{noformat}
but 74 is in A's transaction logs. 
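
To spell out the consequence (my reading of the restore path, stated as an assumption rather than a quote of the code), the later restore undoes the truncate:
{noformat}
// Assumed, simplified sequence once A later rejoins as a follower:
// 1. A is told to truncate its txn log back to 73   -> 74 disappears from the log
// 2. loadDataBase() restores from the newest snapshot,
//    which was taken in lead() above and already contains 74
// 3. the remaining txn log (up to 73) is replayed    -> nothing removes 74
// Net effect: 74 is back in A's in-memory tree even though the log was truncated.
{noformat}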

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-12 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453882#comment-13453882
 ] 

Jacky007 commented on ZOOKEEPER-1549:
-

I've written a test case to reproduce it. I think we could delete the snapshots larger than the truncate zxid to solve the problem.
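
To make the idea concrete, a small self-contained sketch (an illustration only, not a patch; it assumes the usual snapshot.&lt;hex zxid&gt; file-name convention and a hypothetical call site that knows the truncate zxid):
{noformat}
import java.io.File;

public class SnapshotPruner {
    /**
     * Deletes every snapshot file in dataDir whose zxid is greater than
     * truncZxid, so that a later restore cannot resurrect truncated txns.
     */
    public static void deleteSnapshotsNewerThan(File dataDir, long truncZxid) {
        File[] files = dataDir.listFiles();
        if (files == null) {
            return;
        }
        for (File f : files) {
            String name = f.getName();
            if (!name.startsWith("snapshot.")) {
                continue;
            }
            try {
                long zxid = Long.parseLong(name.substring("snapshot.".length()), 16);
                if (zxid > truncZxid && !f.delete()) {
                    throw new IllegalStateException("could not delete " + f);
                }
            } catch (NumberFormatException e) {
                // not a snapshot file we recognise; leave it alone
            }
        }
    }
}
{noformat}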

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.
> In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
> the leader will send a snapshot to follower, it will not be a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: 
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.

In 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
the leader will send a snapshot to follower, it will not be a problem.

  was:
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.

in 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
the leader will send a snapshot to follower, it will not be a problem.


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to i

[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: 
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.

in 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874), 
the leader will send a snapshot to follower, it will not be a problem.

  was:
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.

in 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874)


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provid

[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: 
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.

in 3.3 branch, FileTxnSnapLog.restore does not call listener(ZOOKEEPER-874)

  was:
the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
not correct.
here is scenario(similar to 1154):
Initial Condition
1.  Lets say there are three nodes in the ensemble A,B,C with A being the 
leader
2.  The current epoch is 7. 
3.  For simplicity of the example, lets say zxid is a two digit number, 
with epoch being the first digit.
4.  The zxid is 73
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
Request with zxid 74 is issued. The leader A writes it to the log but there is 
a crash of the entire ensemble and B,C never write the change 74 to their log.
Step 2
A,B restart, A is elected as the new leader,  and A will load data and take a 
clean snapshot(change 74 is in it), then send diff to B, but B died before sync 
with A. A died later.
Step 3
B,C restart, A is still down
B,C form the quorum
B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
epoch is now 8, zxid is 80
Request with zxid 81 is successful. On B, minCommitLog is now 71, maxCommitLog 
is 81
Step 4
A starts up. It applies the change in request with zxid 74 to its in-memory 
data tree
A contacts B to registerAsFollower and provides 74 as its ZxId
Since 71<=74<=81, B decides to send A the diff. 
Problem:
The problem with the above sequence is that after truncate the log, A will load 
the snapshot again which is not correct.


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, 

[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Affects Version/s: (was: 3.3.6)

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Priority: Critical  (was: Major)

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3
>Reporter: Jacky007
>Priority: Critical
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correct if the snapshot is 
> not correct.
> here is scenario(similar to 1154):
> Initial Condition
> 1.Lets say there are three nodes in the ensemble A,B,C with A being the 
> leader
> 2.The current epoch is 7. 
> 3.For simplicity of the example, lets say zxid is a two digit number, 
> with epoch being the first digit.
> 4.The zxid is 73
> 5.All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there 
> is a crash of the entire ensemble and B,C never write the change 74 to their 
> log.
> Step 2
> A,B restart, A is elected as the new leader,  and A will load data and take a 
> clean snapshot(change 74 is in it), then send diff to B, but B died before 
> sync with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Lets say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71, 
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory 
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff. 
> Problem:
> The problem with the above sequence is that after truncate the log, A will 
> load the snapshot again which is not correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: 
the trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is
not correct.
here is the scenario (similar to 1154):
Initial Condition
1.  Let's say there are three nodes in the ensemble A, B, C, with A being the
leader.
2.  The current epoch is 7.
3.  For simplicity of the example, let's say a zxid is a two-digit number,
with the epoch being the first digit.
4.  The zxid is 73.
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
A request with zxid 74 is issued. The leader A writes it to its log, but the
entire ensemble crashes and B and C never write change 74 to their logs.
Step 2
A and B restart, and A is elected as the new leader. A loads its data and takes
a clean snapshot (change 74 is in it), then sends a diff to B, but B dies
before syncing with A. A dies later.
Step 3
B and C restart; A is still down.
B and C form the quorum.
B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73;
the epoch is now 8, and the zxid is 80.
A request with zxid 81 is successful. On B, minCommitLog is now 71 and
maxCommitLog is 81.
Step 4
A starts up. It applies the change in the request with zxid 74 to its in-memory
data tree.
A contacts B to registerAsFollower and provides 74 as its ZxId.
Since 71<=74<=81, B decides to send A the diff.
Problem:
The problem with the above sequence is that after truncating the log, A will
load the snapshot again, which is not correct.
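
The problem can be made concrete with a toy model (illustrative only, using
bare zxids as stand-ins for the data tree rather than ZooKeeper's actual
classes): the trunc removes 74 from A's log, but the reload starts from the
snapshot taken in Step 2, and that snapshot already contains the uncommitted
74, so 74 reappears in the in-memory tree.

import java.util.List;
import java.util.TreeSet;

public final class DirtySnapshotSketch {
    public static void main(String[] args) {
        // A's on-disk state after Step 2: a snapshot containing 71..74 and a
        // log whose last entry is the uncommitted 74.
        TreeSet<Long> snapshotTxns = new TreeSet<>(List.of(71L, 72L, 73L, 74L));
        TreeSet<Long> logTxns      = new TreeSet<>(List.of(74L));

        // A handles a trunc to 73: entries above 73 are dropped from the log.
        logTxns.removeIf(zxid -> zxid > 73L);

        // A then rebuilds its data tree from snapshot + remaining log.
        TreeSet<Long> inMemoryTree = new TreeSet<>(snapshotTxns);
        inMemoryTree.addAll(logTxns);

        System.out.println(inMemoryTree.contains(74L)); // true: 74 was never rolled back
    }
}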

  was:
the trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is
not correct.
here is the scenario (similar to 1154):
Initial Condition
1.  Let's say there are three nodes in the ensemble A, B, C, with A being the
leader.
2.  The current epoch is 7.
3.  For simplicity of the example, let's say a zxid is a two-digit number,
with the epoch being the first digit.
4.  The zxid is 73.
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
A request with zxid 74 is issued. The leader A writes it to its log, but the
entire ensemble crashes and B and C never write change 74 to their logs.
Step 3
B and C restart; A is still down.
B and C form the quorum.
B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73;
the epoch is now 8, and the zxid is 80.
A request with zxid 81 is successful. On B, minCommitLog is now 71 and
maxCommitLog is 81.
Step 4
A starts up. It applies the change in the request with zxid 74 to its in-memory
data tree.
A contacts B to registerAsFollower and provides 74 as its ZxId.
Since 71<=74<=81, B decides to send A the diff. B will send A the proposal 81.
Problem:
The problem with the above sequence is that A's data tree has the update from
request 74, which is not correct. Before getting proposal 81, A should have
received a trunc to 73. I don't see that in the code. If the maxCommitLog on B
hadn't bumped to 81 but had stayed at 73, that case would seem to be fine.


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3, 3.3.6
>Reporter: Jacky007
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot
> is not correct.
> here is the scenario (similar to 1154):
> Initial Condition
> 1. Let's say there are three nodes in the ensemble A, B, C, with A being the
> leader.
> 2. The current epoch is 7.
> 3. For simplicity of the example, let's say a zxid is a two-digit number,
> with the epoch being the first digit.
> 4. The zxid is 73.
> 5. All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> A request with zxid 74 is issued. The leader A writes it to its log, but the
> entire ensemble crashes and B and C never write change 74 to their logs.
> Step 2
> A and B restart, and A is elected as the new leader. A loads its data and
> takes a clean snapshot (change 74 is in it), then sends a diff to B, but B
> dies before syncing with A. A dies later.
> Step 3
> B and C restart; A is still down.
> B and C form the quorum.
> B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73;
> the epoch is now 8, and the zxid is 80.
> A request with zxid 81 is successful. On B, minCommitLog is now 71 and
> maxCommitLog is 81.
> Step 4
> A starts up. It applies the change in the request with zxid 74 to its
> in-memory data tree.
> A contacts B to registerAsFollower and provides 74 as its ZxId.
> Since 71<=74<=81, B decides to send A the diff.
> Problem:
> The problem with the above sequence is that after truncating the log, A will
> load the snapshot again, which is not correct.
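
The whole scenario turns on change 74 never having been committed: it is
persisted on only one of the three servers, which is short of a majority, so
the quorum formed by B and C is free to move on to 81 without it. A toy check
of that majority rule (illustrative only):

public final class QuorumSketch {
    static boolean isCommitted(int ackCount, int ensembleSize) {
        return ackCount > ensembleSize / 2;   // a strict majority must have logged it
    }

    public static void main(String[] args) {
        System.out.println(isCommitted(1, 3)); // change 74: false, only A logged it
        System.out.println(isCommitted(3, 3)); // change 73: true, A, B and C logged it
    }
}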

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: 
the trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is
not correct.
here is the scenario (similar to 1154):
Initial Condition
1.  Let's say there are three nodes in the ensemble A, B, C, with A being the
leader.
2.  The current epoch is 7.
3.  For simplicity of the example, let's say a zxid is a two-digit number,
with the epoch being the first digit.
4.  The zxid is 73.
5.  All the nodes have seen the change 73 and have persistently logged it.
Step 1
A request with zxid 74 is issued. The leader A writes it to its log, but the
entire ensemble crashes and B and C never write change 74 to their logs.
Step 3
B and C restart; A is still down.
B and C form the quorum.
B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73;
the epoch is now 8, and the zxid is 80.
A request with zxid 81 is successful. On B, minCommitLog is now 71 and
maxCommitLog is 81.
Step 4
A starts up. It applies the change in the request with zxid 74 to its in-memory
data tree.
A contacts B to registerAsFollower and provides 74 as its ZxId.
Since 71<=74<=81, B decides to send A the diff. B will send A the proposal 81.
Problem:
The problem with the above sequence is that A's data tree has the update from
request 74, which is not correct. Before getting proposal 81, A should have
received a trunc to 73. I don't see that in the code. If the maxCommitLog on B
hadn't bumped to 81 but had stayed at 73, that case would seem to be fine.

  was:ZOOKEEPER-1154


> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3, 3.3.6
>Reporter: Jacky007
>
> the trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot
> is not correct.
> here is the scenario (similar to 1154):
> Initial Condition
> 1. Let's say there are three nodes in the ensemble A, B, C, with A being the
> leader.
> 2. The current epoch is 7.
> 3. For simplicity of the example, let's say a zxid is a two-digit number,
> with the epoch being the first digit.
> 4. The zxid is 73.
> 5. All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> A request with zxid 74 is issued. The leader A writes it to its log, but the
> entire ensemble crashes and B and C never write change 74 to their logs.
> Step 3
> B and C restart; A is still down.
> B and C form the quorum.
> B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73;
> the epoch is now 8, and the zxid is 80.
> A request with zxid 81 is successful. On B, minCommitLog is now 71 and
> maxCommitLog is 81.
> Step 4
> A starts up. It applies the change in the request with zxid 74 to its
> in-memory data tree.
> A contacts B to registerAsFollower and provides 74 as its ZxId.
> Since 71<=74<=81, B decides to send A the diff. B will send A the proposal
> 81.
> Problem:
> The problem with the above sequence is that A's data tree has the update from
> request 74, which is not correct. Before getting proposal 81, A should have
> received a trunc to 73. I don't see that in the code. If the maxCommitLog on
> B hadn't bumped to 81 but had stayed at 73, that case would seem to be fine.
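
The quoted reasoning ("Since 71<=74<=81, B decides to send A the diff") comes
down to a range check against the leader's cache of committed proposals. A
simplified sketch of that choice (illustrative only; the real decision in the
leader's LearnerHandler also walks the cached proposals and can combine a
trunc with a diff):

public final class SyncChoiceSketch {
    enum SyncMode { DIFF, TRUNC, SNAP }

    static SyncMode chooseSync(long peerLastZxid, long minCommitLog, long maxCommitLog) {
        if (peerLastZxid > maxCommitLog) {
            return SyncMode.TRUNC;   // peer is ahead of everything the leader has committed
        }
        if (peerLastZxid >= minCommitLog) {
            return SyncMode.DIFF;    // replay committed proposals after peerLastZxid
        }
        return SyncMode.SNAP;        // too far behind the cache: send a full snapshot
    }

    public static void main(String[] args) {
        // Step 4: A reports 74 while B's commit log spans [71, 81].
        System.out.println(chooseSync(74, 71, 81)); // DIFF, although 74 was never committed
    }
}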

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky007 updated ZOOKEEPER-1549:


Description: ZOOKEEPER-1154

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.4.3, 3.3.6
>Reporter: Jacky007
>
> ZOOKEEPER-1154

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-09-10 Thread Jacky007 (JIRA)
Jacky007 created ZOOKEEPER-1549:
---

 Summary: Data inconsistency when follower is receiving a DIFF with 
a dirty snapshot
 Key: ZOOKEEPER-1549
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.6, 3.4.3
Reporter: Jacky007




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira