Have you tried stopping the node, deleting the data and log directories, upgrading to 3.5.5, starting the node, and waiting until it is synchronized?
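
One rough way to check the "wait until it is synchronized" step is to poll the node with the `srvr` four-letter command and look at the `Mode:` line of the reply. The sketch below is only an illustration: the function names, the default port 2181, and the idea of treating a reported leader/follower/observer mode as "rejoined the ensemble" are my assumptions, not something from the original report. It also assumes `srvr` is permitted by `4lw.commands.whitelist` on the 3.5.5 node.

```python
import socket


def query_srvr(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'srvr' four-letter command and return the raw text reply.

    The host and port are whatever your node's client port is; 2181 is
    only the conventional default, not something taken from this thread.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply: str):
    """Return the value of the 'Mode:' line, or None if there is none."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def is_serving(reply: str) -> bool:
    """Heuristic (assumption): the node has rejoined the ensemble once it
    reports a quorum role instead of a 'not currently serving' message."""
    return parse_mode(reply) in {"leader", "follower", "observer"}
```

Calling `is_serving(query_srvr("your-zk-host"))` in a loop with a short sleep would give a simple wait condition; the host name here is a placeholder.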
> On 02.10.2019 at 20:14, Jerry Hebert <[email protected]> wrote:
>
> Hi all,
>
> My first post here! I'm hoping you all might be able to offer some guidance
> or redirect me to an existing ticket. We have a five-node ensemble on
> 3.4.11 that we're currently in the process of upgrading to 3.5.5. We
> recently saw some bizarre behavior in our ensemble that I was hoping to
> find some sort of pre-existing ticket or discussion about, but I was having
> difficulty finding hits for this in Jira.
>
> The behavior that we saw from our metrics is that one of our nodes (not
> sure if it was a follower or a leader) started to demonstrate
> instability (high CPU, high RAM) and it crashed. Not a big deal, but as
> soon as it crashed, all four of the other nodes immediately restarted,
> resulting in a short outage. One node crashing should never cause an
> ensemble restart of course, so I assumed that this must be a bug in ZK. The
> nodes that restarted had no indication of errors in their logs; they
> simply restarted. Does this sound familiar to any of you?
>
> Also, we are using Exhibitor on that ensemble, so it's also possible that
> the restart was caused by Exhibitor.
>
> My hope is that this issue will be behind us once the 3.5.5 upgrade is
> complete, but I'd ideally like to find some concrete evidence of this.
>
> Thanks!
> Jerry
