Please create a jira for this, attach the logs, config, etc... Thanks.

Patrick

On Thu, Apr 26, 2012 at 11:17 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> Hi,
>
> This week, we saw a potential bug with zookeeper 3.3.4. In an attempt to
> adding a separate disk for zookeeper transaction logs, our SysOps team
> threw new disks at all the zookeeper servers in our production cluster at
> around the same time. Right after this, we saw degraded performance on our
> zookeeper cluster. And yes, I agree that this degraded behavior is expected
> and we could've done a better job and upgraded one server at a time. Al
> though, the observed impact was that ephemeral nodes got deleted without
> session expiration on the zookeeper clients.
>
> Let me try and describe what I've observed from the Kafka and ZK server
> logs -
>
> Kafka client has a session established with ZK, say Session A, that it has
> been using successfully. At the time of the degraded ZK performance issue,
> Session A expires. Kafka's ZkClient tries to establish another session with
> ZK. After 9 seconds, it establishes a session, say Session B and tries to
> use it for creating a znode. This operation fails with a NodeExists error
> since another session, say session C, has created that znode. This is
> considered OK since ZkClient retries an operation transparently if it gets
>  disconnected and sometimes you can get NodeExists. But then later, session
> C expires and hence the ephemeral node is deleted from ZK. This leads to
> unexpected errors in Kafka since its session, Session B, is still valid and
> hence it expects the znode to be there. The issue is that session C was
> established, created the znode and expired, without the zookeeper client on
> Kafka ever knowing about it.
>
> I've attempted to serialize the sequence of events according to relevant
> data from txn and log4j logs below.
>
> Leader zookeeper is srvr-90.prod
>
> Session A = 0x9367a5bd54c9078
> Session B = 0x8367a5bd75ea01b
> Session C = 0x8367a5bd75e9dd5
> The znode being created =
>  /consumers/kafka-DR/ids/kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76
> zookeeper session timeout is 6 seconds
>
>
>   1.
>
>   Original session A, 0x9367a5bd54c9078 expired at 20:33:09 with zxid
>   0x31189e02d8 on the leader.
>   2.
>
>   Leader processed session termination for 0x9367a5bd54c9078 at
>   20:33.09.000.
>   3.
>
>   Leader realizes client has closed socket, so it closes NIOServerCnxn for
>   0x9367a5bd54c9078 at 20:33:09.009
>   4.
>
>   Kafka client gets Expired state at 20:33:09.161
>   5.
>
>   Kafka client starts re-registering its node in ZK at 20:33.09.209 with
>   socket connection established to srvr-89.prod
>   6.
>
>   Client session 0x0 times out several times, reconnects to another server
>   and repeats session establishment request.
>   7.
>
>   On the leader's txn log, session B 0x8367a5bd75ea01b is created at
>   20:33.17-
>   1.
>
>      time:4/23/12 20:33:17 PM PDT session:0x8367a5bd75ea01b cxid:0x0
>      zxid:0x31189e1985 type:createSession timeOut:6000
>       8.
>
>   On zk server srvr-89, session B 0x8367a5bd75ea01b is established -
>   1.
>
>      zk.log4j.2012-04-23-srvr-89.gz:
>      1.
>
>         2012-04-23 20:33:17,987 - INFO
>          [CommitProcessor:8:NIOServerCnxn@1580] - Established session
>         0x8367a5bd75ea01b with negotiated timeout 6000 for client /
>         172.17.87.173:52865
>         2.
>
>         2012/04/23 20:33:18.022 INFO [ZkClient] [main-EventThread] [kafka]
>         zookeeper state changed (SyncConnected)
>          9.
>
>   At 20:33:18.022 on Kafka client, session is established with session id
>   0x8367a5bd75ea01b
>   1.
>
>      2012/04/23 20:33:18.022 INFO [ClientCnxn]
>      [main-SendThread(srvr-89.prod:12913)] [kafka] Session establishment
>      complete on server srvr-89.prod/172.17.229.105:12913, sessionid =
>      0x8367a5bd75ea01b, negotiated timeout = 6000
>       10.
>
>   Kafka zookeeper client fails to create znode at 20:33.18.358 with
>   Session B due to NodeExists exception
>   1.
>
>      2012/04/23 20:33:18.358 INFO [ZkUtils$]
>      
> [ZkClient-EventThread-81-srvr-85.prod:12913,srvr-88.prod:12913,srvr-84.prod:12913,srvr-89.prod:12913,srvr-90.prod:12913/kafka]
>      [kafka]
>      /consumers/kafka-DR/ids/kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76
>      exists with value { "firehoseActivity": 1,"firehoseUpdates":
>      1,"GenericSearchEvent": 1,"firehoseShares": 1,"ProfileViewEvent": 1 }
>      during connection loss; this is ok
>       11.
>
>   Kafka client gets Disconnected and SyncConnected with session B at
>   20:33:26
>   1.
>
>      2012/04/23 20:33:26.669 INFO [ClientCnxn]
>      [main-SendThread(srvr-89.prod:12913)] [kafka] Client session timed out,
>      have not heard from server in 4000ms for sessionid 0x8367a5bd75ea01b,
>      closing socket connection and attempting reconnect
>      2.
>
>      2012/04/23 20:33:26.769 INFO [ZkClient] [main-EventThread] [kafka]
>      zookeeper state changed (Disconnected)
>      3.
>
>      2012/04/23 20:33:27.471 INFO [ClientCnxn]
>      [main-SendThread(srvr-89.prod:12913)] [kafka] Opening socket
> connection to
>      server srvr-88.prod/172.17.229.104:12913
>      4.
>
>      2012/04/23 20:33:27.531 INFO [ClientCnxn]
>      [main-SendThread(srvr-88.prod:12913)] [kafka] Socket connection
> established
>      to srvr-88.prod/172.17.229.104:12913, initiating session
>      5.
>
>      2012/04/23 20:33:28.419 INFO [ClientCnxn]
>      [main-SendThread(srvr-88.prod:12913)] [kafka] Session establishment
>      complete on server srvr-88.prod/172.17.229.104:12913, sessionid =
>      0x8367a5bd75ea01b, negotiated timeout = 6000
>       12.
>
>   Kafka client runs into NoNode exception at 20:33:30
>   2012/04/23 20:33:30.538 INFO [ZookeeperConsumerConnector]
>   [kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76_watcher_executor] [kafka]
>   exception during rebalance
>
> At around the same time, the session C (0x8367a5bd75e9dd5) gets established
> and also creates the node and later expires.
>
>   1.
>
>   On leader srvr-90, session C gets created
>   1.
>
>      time:4/23/12 20:33:09 PM PDT session:0x8367a5bd75e9dd5 cxid:0x0
>      zxid:0x31189e0709 type:createSession timeOut:6000
>       2.
>
>   On server srvr-89, session gets created at 20:33.13.930
>   1.
>
>      zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:13,930 - INFO
>       [CommitProcessor:8:NIOServerCnxn@1580] - Established session
>      0x8367a5bd75e9dd5 with negotiated timeout 6000 for client /
>      172.17.87.173:52802
>      2.
>
>      zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:17,667 - WARN
>       [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@634] -
>      EndOfStreamException: Unable to read additional data from client
> sessionid
>      0x8367a5bd75e9dd5, likely client has closed socket
>      3.
>
>      zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:17,668 - INFO
>       [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] -
>      Closed socket connection for client /172.17.87.173:52802 which had
>      sessionid 0x8367a5bd75e9dd5
>       3.
>
>   3. Session B creates the znode -
>   1.
>
>      time:4/23/12 20:33:17 PM PDT session:0x8367a5bd75e9dd5 cxid:0x1
>      zxid:0x31189e1464 type:create acl:[31,s{'world,'anyone}] ephemeral:true
>      
> path:/kafka/consumers/kafka-DR/ids/kafka-DR_ech3-app0193.prod-1334094226781-6a7f1b76
>      
> data:7B202266697265686F73654163746976697479223A20312C2266697265686F736555706461746573223A20312C2247656E657269635365617263684576656E74223A20312C2266697265686F7365536861726573223A20312C2250726F66696C65566965774576656E74223A2031207D
>
>      2.
>
>      Session B gets closed on the leader
>       4.
>
>   Session C expires on leader
>   1.
>
>      time:4/23/12 20:33:29 PM PDT session:0x8367a5bd75e9dd5 cxid:0x0
>      zxid:0x31189e27d0 type:closeSession
>
>
> Since the leader processes create session and create znode for Session C
> first, shouldn't it be the session id that gets returned to the client as
> create session response ?
> Does this sound like a bug ?
>
> *TL;DR*
>
> Ephemeral nodes got deleted for unexpired sessions on ZK 3.3.4.
> The kafka and zookeeper server log4j logs (timestamps are in UTC) as well
> as the leader's txn log are
> here.<http://people.apache.org/%7Enehanarkhede/kafka-misc/zookeeper-outage-2012-04-23/kafka-zk-bug-2012-04-23.tar.gz>
>
> Will appreciate any help with debugging this.
>
> Thanks,
> Neha

Reply via email to