[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010970#comment-13010970
 ] 

Flavio Junqueira commented on ZOOKEEPER-1026:
---------------------------------------------

Right, but what if S3 crashes and recovers? In this case, logicalclock is reset 
and receiving the old message from S2 would cause S3 to lead, as described 
above. If this is right, then is it only possible if we have message duplicates?

> Sequence number assignment decreases after old node rejoins cluster
> -------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1026
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1026
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Jeremy Stribling
>         Attachments: ZOOKEEPER-1026.logs.tgz
>
>
> I ran into a weird case where a Zookeeper server rejoins the cluster after 
> missing several operations, and then a client creates a new sequential node 
> that has a number earlier than the last node it created.  I don't have full 
> logs, or a live system in this state, or any data directories, just some 
> partial server logs and the evidence as seen by the client.  Haven't tried 
> reproducing it yet, just wanted to see if anyone here had any ideas.  Here's 
> the scenario (probably more info than necessary, but trying to be complete)
> 1) Initially (5:37:20): 3 nodes up, with ids 215, 126, and 37 (called nodes 
> #1, #2, and #3 below):
> 2) Nodes periodically (and throughout this whole timeline) create sequential, 
> non-ephemeral nodes under the /zkrsm parent node.
> 3) 5:46:57: Node #1 gets notified of /zkrsm/0000000000000000_record0000002116
> 4) 5:47:06: Node #1 restarts and rejoins
> 5) 5:49:26: Node #2 gets notified of /zkrsm/0000000000000000_record0000002708
> 6) 5:49:29: Node #2 restarts and rejoins
> 7) 5:52:01: Node #3 gets notified of /zkrsm/0000000000000000_record0000003291
> 8) 5:52:02: Node #3 restarts and begins the rejoining process
> 9) 5:52:08: Node #1 successfully creates 
> /zkrsm/0000000000000000_record0000003348
> 10) 5:52:08: Node #2 dies after getting notified of 
> /zkrsm/0000000000000000_record0000003348
> 11) 5:52:10ish: Node #3 is elected leader (the ZK server log doesn't have 
> wallclock timestamps, so not exactly sure on the ordering of this step)
> 12) 5:52:15: Node #1 successfully creates 
> /zkrsm/0000000000000000_record0000003292
> Note that the node created in step #12 is lower than the one created in step 
> #9, and is exactly one greater than the last node seen by node #3 before it 
> restarted.
> Here is the sequence of session establishments as seen from the C client of 
> node #1 after its restart (the IP address of node #1=13.0.0.11, #2=13.0.0.12, 
> #3=13.0.0.13):
> 2011-03-18 05:46:59,838:17454(0x7fc57d3db710):ZOO_INFO@check_events@1632: 
> session establishment complete on server [13.0.0.13:2888], 
> sessionId=0x252ec780a3020000, negotiated timeout=6000
> 2011-03-18 05:49:32,194:17454(0x7fc57cbda710):ZOO_INFO@check_events@1632: 
> session establishment complete on server [13.0.0.13:2888], 
> sessionId=0x252ec782f5100002, negotiated timeout=6000
> 2011-03-18 05:52:02,352:17454(0x7fc57d3db710):ZOO_INFO@check_events@1632: 
> session establishment complete on server [13.0.0.12:2888], 
> sessionId=0x7e2ec782ff5f0001, negotiated timeout=6000
> 2011-03-18 05:52:08,583:17454(0x7fc57d3db710):ZOO_INFO@check_events@1632: 
> session establishment complete on server [13.0.0.11:2888], 
> sessionId=0x7e2ec782ff5f0001, negotiated timeout=6000
> 2011-03-18 05:52:13,834:17454(0x7fc57cbda710):ZOO_INFO@check_events@1632: 
> session establishment complete on server [13.0.0.11:2888], 
> sessionId=0xd72ec7856d0f0001, negotiated timeout=6000
> I will attach logs for all nodes after each of their restarts, and a partial 
> log for node #3 from before its restart.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to