[ https://issues.apache.org/jira/browse/SLIDER-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15467761#comment-15467761 ]
Josh Elser commented on SLIDER-1166:
------------------------------------

I'm not sure if the ZooKeeper heartbeat thread is resilient to the underlying connection being terminated. One thing you could try is running {{kill -SIGSTOP <pid>}}, waiting a minute, and then running {{kill -SIGCONT <pid>}}. This suspends the entire process, which prevents the ZK heartbeat thread from running (in effect, it simulates a stop-the-world GC cycle).

Another thing I thought of is the exceptions that ZK might throw, which need to be handled by the application. Here's a good example from Accumulo:
https://github.com/apache/accumulo/blob/5cb5b9372103761c829403c03007b9f53241400f/fate/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReader.java#L70-L87

> Every cluster stop operation creates and holds on to a zk session
> -----------------------------------------------------------------
>
>                 Key: SLIDER-1166
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1166
>             Project: Slider
>          Issue Type: Bug
>          Components: client
>   Affects Versions: Slider 0.91
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>            Priority: Critical
>             Fix For: Slider 1.0.0
>
>         Attachments: SLIDER-1166.1.patch, SLIDER-1166.2.patch
>
>
> We aren't closing the zookeeper client, so the session is left to expire
> on its own and stays open much longer than is necessary.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
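As a rough illustration of the exception-handling point in the comment above, here is a minimal sketch of a bounded retry loop around an operation that can fail transiently. It is not the Accumulo code and does not use the ZooKeeper client library: {{ZkRetrySketch}}, {{TransientFailure}}, and {{retry}} are hypothetical names, and {{TransientFailure}} is an unchecked stand-in for the checked {{KeeperException}} that real code would presumably catch, retrying only when its code is something like {{KeeperException.Code.CONNECTIONLOSS}}.

```java
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from ZooKeeper, Slider, or Accumulo.
class ZkRetrySketch {

    /** Stand-in for a transient ZooKeeper failure such as a lost connection (assumption). */
    static class TransientFailure extends RuntimeException {
        TransientFailure(String message) {
            super(message);
        }
    }

    /**
     * Runs op, retrying when it throws TransientFailure. Gives up and rethrows
     * the last failure once maxAttempts calls have been made. The pause grows
     * linearly with the attempt number.
     */
    static <T> T retry(Supplier<T> op, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return op.get();
            } catch (TransientFailure e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the failure to the caller
                }
                try {
                    Thread.sleep(pauseMillis * attempt);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt status and stop retrying.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

A caller would wrap each read in something like {{ZkRetrySketch.retry(() -> readNode(path), 5, 250L)}}, where {{readNode}} is again a hypothetical helper. Any other exception type passes straight through without a retry, which matches the idea that only connection-level failures are worth retrying.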