[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559328#comment-16559328
 ] 

Oded commented on ZOOKEEPER-3036:
---------------------------------

Hi,

We did the same but the issue returned again. It happens to us from time to
time, and we handle it manually.
The main problem is that it caused kafka for "split brain" where the
cluster believes it has more then one controller.





> Unexpected exception in zookeeper
> ---------------------------------
>
>                 Key: ZOOKEEPER-3036
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: jmx
>    Affects Versions: 3.4.10
>         Environment: 3 Zookeepers, 5 kafka servers
>            Reporter: Oded
>            Priority: Critical
>
> We got an issue with one of the zookeeprs (Leader), causing the entire kafka 
> cluster to fail:
> 2018-05-09 02:29:01,730 [myid:3] - ERROR 
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected 
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>         at java.net.SocketInputStream.read(SocketInputStream.java:171)
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>         at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
> 2018-05-09 02:29:01,730 [myid:3] - WARN  
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - ******* GOODBYE 
> /192.168.0.91:42490 ********
>  
> We would expect that zookeeper will choose another Leader and the Kafka 
> cluster will continue to work as expected, but that was not the case.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to