Re: recover failed node

Hendrik Haddorp Tue, 24 Jan 2017 22:48:58 -0800

Hi Ben,

My setup is running on Docker. The work directory is mounted as a Docker volume and that got lost. Just the config was left. Given that all ports and host names did not change, I did not expect any communication problems. But looking into the logs again as you suggested, I found that the healthy node could not reach the node that had failed. We had an additional problem with the Docker host of that machine, which is also why the volume was lost, and it looks like the DNS lookup had a problem. After I restarted one of the good nodes, ZooKeeper recovered and all nodes are healthy again :-)
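
For anyone hitting the same symptom, the reachability problem described above can be checked roughly like this. It is only a minimal sketch, not what was actually run here: it assumes the usual server.N=host:quorumPort:electionPort lines in zoo.cfg, and the function names parse_servers and check_peer are made up for illustration.

```python
import socket


def parse_servers(cfg_text):
    """Extract (host, quorum_port, election_port) from server.N lines.

    Assumes the plain 'server.N=host:port:port' form; anything else
    on the line is ignored.
    """
    peers = []
    for line in cfg_text.splitlines():
        line = line.strip()
        if line.startswith("server.") and "=" in line:
            value = line.split("=", 1)[1].strip()
            parts = value.split(":")
            if len(parts) >= 3:
                peers.append((parts[0], int(parts[1]), int(parts[2])))
    return peers


def check_peer(host, port, timeout=3.0):
    """Return (resolved, reachable) for one host/port pair."""
    try:
        # DNS step: this is the part that failed in the case above.
        addrs = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return (False, False)
    for family, socktype, proto, _canon, sockaddr in addrs:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
                return (True, True)
        except OSError:
            # Refused or timed out; try the next resolved address.
            continue
    return (True, False)
```

Run from inside each container, one would loop over parse_servers(open(path).read()), where path is wherever zoo.cfg lives in that setup, and call check_peer for both ports of every peer. A result of (False, False) points at name resolution, (True, False) at the network or at a peer that is not listening.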


thanks,
Hendrik

On 25.01.2017 01:34, Ben Sherman wrote:

Do you know why the node lost its data?  Are your configuration files
correct?  Is it trying to join the ensemble?  Are there any mentions of the
broken node trying to reach the good nodes in the good nodes' logs?

On Tue, Jan 24, 2017 at 1:06 PM, Hendrik Haddorp <[email protected]>
wrote:

Hi,

I assume this is quite a standard issue but I failed to find a solution so
far. I have a 3-node ZooKeeper 3.4.6 ensemble and one node lost all its
data. My assumption was that when the node comes up again ZooKeeper would
send over the state from the remaining nodes to reinitialize it but that
does not seem to happen. So what can I do to recover my node without
changing the two remaining nodes? I tried to copy the snapshots and logs from
one node but that did not work so far.

thanks,
Hendrik
