[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164948#comment-13164948
 ] 

Flavio Junqueira commented on ZOOKEEPER-1319:
---------------------------------------------

+1, looks good, Pat. The double occurrence of NEWLEADER happens because we 
insert NEWLEADER into outstandingProposals in Leader.lead() and also queue a 
NEWLEADER message in LearnerHandler.run(). When LearnerHandler.run() then 
calls startForwarding(), we queue the packets in outstandingProposals, 
including NEWLEADER, so the learner is sent NEWLEADER twice. 

It is not necessary to send it again in startForwarding(), but we do need it in 
outstandingProposals to collect acks. Since we have to add it to 
outstandingProposals, one simple way to avoid the duplicate is to perform a 
check like this in startForwarding():

{noformat}
if (outstandingProposals.get(zxid).packet.getType() != NEWLEADER) {
    handler.queuePacket(outstandingProposals.get(zxid).packet);
}
{noformat}

I have verified that by including this check, I can remove the double 
occurrence of NEWLEADER in Pat's patch and the test passes. We may want to 
consider this check in some later release.
                
> Missing data after restarting+expanding a cluster
> -------------------------------------------------
>
>                 Key: ZOOKEEPER-1319
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1319
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.0
>         Environment: Linux (Debian Squeeze)
>            Reporter: Jeremy Stribling
>            Assignee: Patrick Hunt
>            Priority: Blocker
>              Labels: cluster, data
>             Fix For: 3.5.0, 3.4.1
>
>         Attachments: ZOOKEEPER-1319.patch, ZOOKEEPER-1319.patch, 
> ZOOKEEPER-1319_trunk.patch, logs.tgz
>
>
> I've been trying to update to ZK 3.4.0 and have had some issues where some 
> data become inaccessible after adding a node to a cluster.  My use case is a 
> bit strange (as explained before on this list) in that I try to grow the 
> cluster dynamically by having an external program automatically restart 
> ZooKeeper servers in a controlled way whenever the list of participating ZK 
> servers needs to change.  This used to work just fine in 3.3.3 (and before), 
> so this represents a regression.
> The scenario I see is this:
> 1) Start up a 1-server ZK cluster (the server has ZK ID 0).
> 2) A client connects to the server, and makes a bunch of znodes, in 
> particular a znode called "/membership".
> 3) Shut down the cluster.
> 4) Bring up a 2-server ZK cluster, including the original server 0 with its 
> existing data, and a new server with ZK ID 1.
> 5) Node 0 has the highest zxid and is elected leader.
> 6) A client connecting to server 1 tries to "get /membership" and gets back a 
> -101 error code (no such znode).
> 7) The same client then tries to "create /membership" and gets back a -110 
> error code (znode already exists).
> 8) Clients connecting to server 0 can successfully "get /membership".
> I will attach a tarball with debug logs for both servers, annotating where 
> steps #1 and #4 happen.  You can see that the election involves a proposal 
> for zxid 110 from server 0, but immediately following the election server 1 
> has these lines:
> 2011-12-05 17:18:48,308 9299 [QuorumPeer[myid=1]/127.0.0.1:2901] WARN 
> org.apache.zookeeper.server.quorum.Learner  - Got zxid 0x100000001 expected 
> 0x1
> 2011-12-05 17:18:48,313 9304 [SyncThread:1] INFO 
> org.apache.zookeeper.server.persistence.FileTxnLog  - Creating new log file: 
> log.100000001
> Perhaps that's not relevant, but it struck me as odd.  At the end of server 
> 1's log you can see a repeated cycle of getData->create->getData as the 
> client tries to make sense of the inconsistent responses.
> The other piece of information is that if I try to use the on-disk 
> directories for either of the servers to start a new one-node ZK cluster, all 
> the data are accessible.
> I haven't tried writing a program outside of my application to reproduce 
> this, but I can do it very easily with some of my app's tests if anyone needs 
> more information.
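
For reading the zxids in the quoted log lines: a zxid is a 64-bit value whose 
high 32 bits hold the epoch and whose low 32 bits hold a per-epoch counter. 
The sketch below only decodes the two values from the WARN line; the class and 
helper names are hypothetical, and it makes no claim about the cause of the 
bug.

```java
// Hypothetical helper for splitting a zxid into epoch and counter.
class ZxidSketch {
    /** High 32 bits: the leader epoch. */
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    /** Low 32 bits: the counter within that epoch. */
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long got = 0x100000001L;   // "Got zxid 0x100000001"
        long expected = 0x1L;      // "expected 0x1"
        System.out.println("got:      epoch=" + epochOf(got) + " counter=" + counterOf(got));
        System.out.println("expected: epoch=" + epochOf(expected) + " counter=" + counterOf(expected));
    }
}
```

Read this way, the two values in the WARN line have the same counter (1) and 
differ only in epoch (1 versus 0).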

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
