Hi Ben,
My setup runs on Docker. The work directory is mounted as a Docker volume, and that volume got lost; only the config was left. Since none of the ports or host names changed, I did not expect any communication problems. But looking into the logs again as you suggested, I found that the healthy node could not reach the node that had failed. We had an additional problem with the Docker host of that machine, which is also why the volume was lost, and it looks like the DNS lookup was failing. After I restarted one of the good nodes, ZooKeeper recovered and all nodes are healthy again :-)
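
In case it helps anyone with a similar setup, here is a minimal sketch of a check that separates a DNS failure from an unreachable peer. It reads the server.N entries from zoo.cfg, resolves each host name, and tries to connect to its election port (the third field, which every ensemble member should be listening on). The helper names, the default config path, and the host names in the comments are assumptions for illustration; they are not taken from the actual setup.

```python
import socket
import sys


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)} from zoo.cfg text.

    Expects lines such as 'server.1=zk1:2888:3888' (zk1 is a made-up name).
    A trailing ':observer' suffix is ignored. IPv6 literals are not handled.
    """
    servers = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line.startswith("server.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        server_id = int(key.strip()[len("server."):])
        host, quorum_port, election_port = value.strip().split(":")[:3]
        servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


def check_peer(host, port, timeout=3.0):
    """Return 'ok', 'dns-failure', or 'unreachable' for one host/port pair."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # The name could not be resolved at all.
        return "dns-failure"
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        # The name resolved, but nothing accepted the connection in time.
        return "unreachable"


if __name__ == "__main__":
    # The config path is an assumption; pass the real one as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "conf/zoo.cfg"
    with open(path) as fh:
        peers = parse_servers(fh.read())
    for server_id, (host, _quorum_port, election_port) in sorted(peers.items()):
        print(server_id, host, election_port, check_peer(host, election_port))
```

Running this inside each container shows whether that node can resolve and reach the others. A "dns-failure" result for the failed node from a healthy one would match what the logs showed here.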

thanks,
Hendrik

On 25.01.2017 01:34, Ben Sherman wrote:
Do you know why the node lost its data?  Are your configuration files
correct?  Is it trying to join the ensemble?  Are there any mentions of the
broken node trying to reach the good nodes in the good nodes' logs?

On Tue, Jan 24, 2017 at 1:06 PM, Hendrik Haddorp <[email protected]>
wrote:

Hi,

I assume this is quite a standard issue, but I have failed to find a solution
so far. I have a 3-node ZooKeeper 3.4.6 ensemble and one node lost all its
data. My assumption was that when the node comes up again, ZooKeeper would
send over the state from the remaining nodes to reinitialize it, but that
does not seem to happen. So what can I do to recover my node without
changing the two remaining nodes? I tried to copy the snapshots and logs
from one node, but that has not worked so far.

thanks,
Hendrik

