(Feel free to point me to resources to answer this question, but I haven’t seen 
a definitive answer yet.)

What’s the recommended process for restoring a failed node in a cluster?

It appears that the process would be:

1. Rebuild a node with the same configuration and node name as the failed
   node, and bring it up on the network.
2. Since the node isn't automatically recognized, go to an *existing* node,
   first GET the revision of the `:5986/_nodes/[email protected]`
   document, and then PUT a new version of that document, such as
   `curl -X PUT "http://admin:[email protected]:5986/_nodes/[email protected]?rev=1-967a00dff5e02add41819138abb3284d" -d '{}'`
   (see the first sketch below).
3. While the documents are synced, the views are per-node, so they need to
   be rebuilt manually per-database (see the second sketch below).
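
For concreteness, step 2 might look something like the sketch below. This is
just my reading of it, not a verified procedure: the host, the
`admin:password` credentials, and the `[email protected]` node name are
placeholders carried over from the example above, and it assumes `jq` is
installed to pull out the `_rev` field.

```
# On a surviving node: read the current revision of the failed node's
# _nodes document, then PUT a fresh (empty) version at that revision.
HOST="http://admin:[email protected]:5986"

# GET the document and extract its _rev (placeholder node name).
REV=$(curl -s "$HOST/_nodes/[email protected]" | jq -r '._rev')

# PUT an empty body at that revision to refresh the membership record.
curl -X PUT "$HOST/_nodes/[email protected]?rev=$REV" -d '{}'
```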
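
And for step 3, I imagine a loop along these lines could force the view
rebuilds: list every database, find each design document, and query each view
once so its index gets built. Again a sketch only, assuming the clustered
port 5984, the same placeholder credentials, and `jq`. (I'm also not sure
whether a clustered query is guaranteed to trigger the rebuild on the
restored node specifically, so treat this as a starting point.)

```
HOST="http://admin:[email protected]:5984"

# Query every view of every design document once; the first query after
# the node comes back should kick off its local index build.
for db in $(curl -s "$HOST/_all_dbs" | jq -r '.[]'); do
  # List the design documents in this database.
  for ddoc in $(curl -s "$HOST/$db/_all_docs?startkey=%22_design/%22&endkey=%22_design0%22" \
                | jq -r '.rows[].id'); do
    # Enumerate the views defined in the design document.
    for view in $(curl -s "$HOST/$db/$ddoc" | jq -r '.views // {} | keys[]'); do
      echo "warming $db/$ddoc/_view/$view"
      curl -s "$HOST/$db/$ddoc/_view/$view?limit=0" > /dev/null
    done
  done
done
```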

Does that cover it? If I'm running a 3-node cluster with n=3, could I avoid 
step 3 by copying over the `shards` and `.shards` directories from another 
node, since all nodes would have identical copies of the data?
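
If that file-copy route is viable, I picture it as something like this,
purely hypothetical: CouchDB stopped on the restored node first, the default
data directory of `/opt/couchdb/data` on both machines, and `healthy-node`
standing in for one of the surviving peers.

```
# Hypothetical shortcut to step 3: copy the shard files (data) and the
# hidden view-index directory from a healthy peer, with CouchDB stopped.
sudo systemctl stop couchdb

rsync -av healthy-node:/opt/couchdb/data/shards/  /opt/couchdb/data/shards/
rsync -av healthy-node:/opt/couchdb/data/.shards/ /opt/couchdb/data/.shards/

sudo systemctl start couchdb
```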

Thanks for your help!

David Alan Hjelle
