[ https://issues.apache.org/jira/browse/ZOOKEEPER-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756224#comment-16756224 ]

Hudson commented on ZOOKEEPER-3240:
-----------------------------------

SUCCESS: Integrated in Jenkins build Zookeeper-trunk-single-thread #211 (See 
[https://builds.apache.org/job/Zookeeper-trunk-single-thread/211/])
Revert "ZOOKEEPER-3240: Close socket on Learner shutdown to avoid (andor: rev 
bcbf64884f2ee3e8a150b0b3c20a8fa03a05162e)
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Follower.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Observer.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java


> Close socket on Learner shutdown to avoid dangling socket
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-3240
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3240
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.6.0
>            Reporter: Brian Nixon
>            Assignee: Brian Nixon
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.6.0, 3.5.5
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A Learner ended up with two connections to the Leader after it hit an
> unexpected exception while flushing a txn to disk. That exception shuts down
> the previous follower instance and starts a new one.
>  
> {quote}2018-10-26 02:31:35,568 ERROR [SyncThread:3:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from thread : SyncThread:3
> java.io.IOException: Input/output error
>         at java.base/sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>         at java.base/sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:72)
>         at java.base/sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:395)
>         at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:457)
>         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:548)
>         at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:769)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:246)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:172)
> 2018-10-26 02:31:35,568 INFO  [SyncThread:3:ZooKeeperServerListenerImpl@42] - Thread SyncThread:3 exits, error code 1
> 2018-10-26 02:31:35,568 INFO [SyncThread:3:SyncRequestProcessor@234] - SyncRequestProcessor exited!{quote}
>  
> The previous socket should have been closed, but nothing in the code appears
> to do so. This leaves the socket open with no one reading from it, which
> caused the queue to fill up and the sender to block.
>  
> Because the LearnerHandler didn't shut down gracefully, the learner queue
> kept growing and so did the JVM heap on the leader, which put pressure on
> the GC and caused high GC time and latency in the quorum.
>  
> The simple fix is to shut down the socket gracefully.
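
The fix described in the issue could look roughly like the sketch below. It is not the actual ZooKeeper patch: the class LearnerShutdownSketch, the field sock, and the helper isConnectionReleased() are hypothetical names chosen for illustration. It only shows the general idea of closing the connection to the leader as part of shutdown, so that the old socket is not left open with no reader.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical stand-in for a learner; not ZooKeeper's actual Learner class.
class LearnerShutdownSketch {
    // Connection to the leader (field name is an assumption for this sketch).
    private Socket sock;

    LearnerShutdownSketch(Socket sock) {
        this.sock = sock;
    }

    // Close the leader connection on shutdown so the other side can detect
    // the disconnect instead of queueing data for a socket nobody reads.
    // Safe to call more than once.
    synchronized void shutdown() {
        if (sock == null) {
            return;
        }
        try {
            sock.close();
        } catch (IOException e) {
            // Shutdown should still proceed if close() fails.
            System.err.println("Ignoring exception while closing socket: " + e);
        } finally {
            sock = null;
        }
    }

    // True once shutdown() has dropped its reference to the socket.
    synchronized boolean isConnectionReleased() {
        return sock == null;
    }

    public static void main(String[] args) {
        // An unconnected socket is enough to demonstrate the close path.
        Socket s = new Socket();
        LearnerShutdownSketch learner = new LearnerShutdownSketch(s);
        learner.shutdown();
        learner.shutdown(); // second call is a no-op
        System.out.println("closed=" + s.isClosed());
    }
}
```

Making shutdown() idempotent is a design choice of this sketch, so that a restart path that calls it twice does not fail on the second call.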



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
