[ https://issues.apache.org/jira/browse/ZOOKEEPER-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191650#comment-13191650 ]
Jeremy Stribling commented on ZOOKEEPER-1367:
---------------------------------------------

bq. Hm..... very interesting. What exactly does this mean? You mentioned earlier that you "embed Zookeeper into our application framework and set up things through code" how exactly are you performing this "restart". Is ZK a separate process, are you killing processes, or are you calling some code to effect this? I ask because we really don't support this and I'm wondering if that could be related.

ZK is embedded into a Java process, running alongside some other Java apps we need to manage. The reason for this is that dynamic cluster membership is absolutely required for our application; we cannot know the IPs/ports/server IDs of all of the ZooKeeper servers that will exist in the cluster. So as new nodes come online, they connect to a centralized part of our service, and we distribute the new list of servers to all the existing servers so they can restart themselves. By "restart" here, I mean we call QuorumPeer.shutdown (and FastLeaderElection.shutdown), delete the previous QuorumPeer, and construct a new one with the new configuration. This is the same way we ran things under 3.3.3. I understand that this is not officially supported, but in my heart of hearts I don't believe it is related to the bug at hand, so I appreciate your indulgence on the matter. I've put a rough sketch of what this restart looks like at the end of this comment.

bq. that's true, but the more variables we can eliminate the more easy it will be to track the real issue down.

We are supposed to be running with synced clocks, but QA is trying to track down a bug with their system right now to figure out why NTP isn't working in their environment. Sorry for the extra level of confusion.

bq. btw, if you do have QA retest this please do capture all the logs (log4j). Unfortunately the two logs don't both show the znode expiring (I see the time in question in one but not the other log), that would give much more insight into what happened...

I don't quite follow your question. Do you mean that my logs aren't capturing all of the output logged by ZooKeeper? Or are you asking for the logs from the previous run of the system, when the znodes were originally created? I will definitely try to get the latter, but as for the former problem -- this is everything that ZooKeeper logged, so I'm not sure what else I can capture during the run. I think the very problem is that one of the nodes didn't expire one of the sessions, so I wouldn't expect that to be in the log for that server. But maybe I don't quite understand what you're asking.
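To make the "restart" concrete, here is a rough, simplified sketch of what I mean. This is not our actual code: the class name EmbeddedQuorum, the method restartWithNewMembership, and the hard-coded timing values are made up for illustration, and the QuorumPeer wiring just follows what QuorumPeerMain.runFromConfig does in 3.4.

{code}
// Rough illustration only -- not our production code.
import java.io.File;
import java.net.InetSocketAddress;
import java.util.Map;

import org.apache.zookeeper.server.ServerCnxnFactory;
import org.apache.zookeeper.server.ZKDatabase;
import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
import org.apache.zookeeper.server.quorum.QuorumPeer;
import org.apache.zookeeper.server.quorum.QuorumPeer.QuorumServer;
import org.apache.zookeeper.server.quorum.flexible.QuorumMaj;

public class EmbeddedQuorum {
    private QuorumPeer quorumPeer;

    // Called whenever the centralized part of our service pushes a new server list.
    public synchronized void restartWithNewMembership(
            long myId,
            Map<Long, QuorumServer> servers,     // new view: server id -> addresses
            InetSocketAddress clientAddr,
            File dataDir, File dataLogDir) throws Exception {

        // Tear down the previous peer, including its FastLeaderElection.
        if (quorumPeer != null) {
            quorumPeer.shutdown();
            if (quorumPeer.getElectionAlg() != null) {
                quorumPeer.getElectionAlg().shutdown();
            }
            quorumPeer.join();                   // QuorumPeer is a Thread
            quorumPeer = null;
        }

        // Construct a brand-new peer from the updated configuration,
        // roughly the way QuorumPeerMain.runFromConfig wires one up.
        ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();
        cnxnFactory.configure(clientAddr, 60 /* maxClientCnxns */);

        QuorumPeer peer = new QuorumPeer();
        peer.setTxnFactory(new FileTxnSnapLog(dataLogDir, dataDir));
        peer.setQuorumPeers(servers);
        peer.setElectionType(3);                 // FastLeaderElection
        peer.setMyid(myId);
        peer.setTickTime(2000);
        peer.setInitLimit(10);
        peer.setSyncLimit(5);
        peer.setQuorumVerifier(new QuorumMaj(servers.size()));
        peer.setClientPortAddress(clientAddr);
        peer.setCnxnFactory(cnxnFactory);
        peer.setZKDatabase(new ZKDatabase(peer.getTxnFactory()));
        peer.start();

        quorumPeer = peer;
    }
}
{code}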
> Data inconsistencies and unexpired ephemeral nodes after cluster restart
> ------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1367
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1367
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.2
>         Environment: Debian Squeeze, 64-bit
>            Reporter: Jeremy Stribling
>            Priority: Blocker
>             Fix For: 3.4.3
>
>         Attachments: ZOOKEEPER-1367.tgz
>
>
> In one of our tests, we have a cluster of three ZooKeeper servers. We kill
> all three, and then restart just two of them. Sometimes we notice that on
> one of the restarted servers, ephemeral nodes from previous sessions do not
> get deleted, while on the other server they do. We are effectively running
> 3.4.2, though technically we are running 3.4.1 with the patch manually
> applied for ZOOKEEPER-1333 and a C client for 3.4.1 with the patches for
> ZOOKEEPER-1163.
> I noticed that when I connected using zkCli.sh to the first node (90.0.0.221,
> zkid 84), I saw only one znode in a particular path:
> {quote}
> [zk: 90.0.0.221:2888(CONNECTED) 0] ls /election/zkrsm
> [nominee0000000011]
> [zk: 90.0.0.221:2888(CONNECTED) 1] get /election/zkrsm/nominee0000000011
> 90.0.0.222:7777
> cZxid = 0x400000027
> ctime = Thu Jan 19 08:18:24 UTC 2012
> mZxid = 0x400000027
> mtime = Thu Jan 19 08:18:24 UTC 2012
> pZxid = 0x400000027
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0xa234f4f3bc220001
> dataLength = 16
> numChildren = 0
> {quote}
> However, when I connect zkCli.sh to the second server (90.0.0.222, zkid 251),
> I saw three znodes under that same path:
> {quote}
> [zk: 90.0.0.222:2888(CONNECTED) 2] ls /election/zkrsm
> nominee0000000006 nominee0000000010 nominee0000000011
> [zk: 90.0.0.222:2888(CONNECTED) 2] get /election/zkrsm/nominee0000000011
> 90.0.0.222:7777
> cZxid = 0x400000027
> ctime = Thu Jan 19 08:18:24 UTC 2012
> mZxid = 0x400000027
> mtime = Thu Jan 19 08:18:24 UTC 2012
> pZxid = 0x400000027
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0xa234f4f3bc220001
> dataLength = 16
> numChildren = 0
> [zk: 90.0.0.222:2888(CONNECTED) 3] get /election/zkrsm/nominee0000000010
> 90.0.0.221:7777
> cZxid = 0x30000014c
> ctime = Thu Jan 19 07:53:42 UTC 2012
> mZxid = 0x30000014c
> mtime = Thu Jan 19 07:53:42 UTC 2012
> pZxid = 0x30000014c
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0xa234f4f3bc220000
> dataLength = 16
> numChildren = 0
> [zk: 90.0.0.222:2888(CONNECTED) 4] get /election/zkrsm/nominee0000000006
> 90.0.0.223:7777
> cZxid = 0x200000cab
> ctime = Thu Jan 19 08:00:30 UTC 2012
> mZxid = 0x200000cab
> mtime = Thu Jan 19 08:00:30 UTC 2012
> pZxid = 0x200000cab
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x5434f5074e040002
> dataLength = 16
> numChildren = 0
> {quote}
> These never went away for the lifetime of the server, for any clients
> connected directly to that server. Note that this cluster is configured to
> have all three servers still, the third one being down (90.0.0.223, zkid 162).
> I captured the data/snapshot directories for the two live servers. When
> I start single-node servers using each directory, I can briefly see that the
> inconsistent data is present in those logs, though the ephemeral nodes seem
> to get (correctly) cleaned up pretty soon after I start the server.
> I will upload a tar containing the debug logs and data directories from the
> failure. I think we can reproduce it regularly if you need more info.
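For anyone trying to reproduce this, a minimal client-side check for the divergence described above might look like the sketch below. It is illustrative only: the class name CompareZnodes is made up, and the host:port strings are simply the ones that appear in the zkCli.sh output above. It connects to each server individually and compares the children of /election/zkrsm.

{code}
// Minimal sketch (not part of the attached test): list the children of the
// same path on two servers and report whether they agree.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class CompareZnodes {
    static List<String> childrenOf(String server, String path) throws Exception {
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper(server, 30000, new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();   // session established
                }
            }
        });
        try {
            connected.await();               // wait for the session to come up
            List<String> children = new ArrayList<String>(zk.getChildren(path, false));
            Collections.sort(children);
            return children;
        } finally {
            zk.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String path = "/election/zkrsm";
        List<String> a = childrenOf("90.0.0.221:2888", path);
        List<String> b = childrenOf("90.0.0.222:2888", path);
        System.out.println("90.0.0.221 sees: " + a);
        System.out.println("90.0.0.222 sees: " + b);
        System.out.println(a.equals(b) ? "consistent" : "INCONSISTENT");
    }
}
{code}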