Re: What is the "right" way to bring a failed SolrCloud node back online?

2014-01-26 Thread Nathan Neulinger
Thanks, yeah, I did just that - and sent the script in on SOLR-5665 if anyone wants a copy. The script is trivial, but 
you're welcome to stick it in contrib or something if it's at all useful to anyone.


-- Nathan

On 01/26/2014 08:28 AM, Mark Miller wrote:

We are working on a new mode (which should become the default) where ZooKeeper 
will be treated as the truth for a cluster.

This mode will be able to handle situations like this - if the cluster state 
says a core should exist on a node and it doesn’t, it will be created on 
startup.

The way things work currently is this kind of hybrid situation where the truth 
is partly in ZooKeeper and partly on each node. This is not ideal at all.

I think this new mode is very important, and it will be coming shortly. Until 
then, I’d recommend writing this logic externally as you suggest (I’ve seen it 
done before).

- Mark

http://about.me/markrmiller

On Jan 24, 2014, at 12:01 PM, Nathan Neulinger wrote:


I have an environment where new collections are being added frequently 
(isolated per customer), and the backup is virtually guaranteed to be missing 
some of them.

As it stands, bringing up the restored/out-of-date instance results in those 
collections being stuck in 'Recovering' state, because the cores don't exist on 
the resulting server. This can also be extended to the case of restoring a 
completely blank instance.

Is there any way to tell SolrCloud "Try recreating any missing cores for this 
collection based on where you know they should be located"?

Or do I need to actually determine a list of cores (..._shardX_replicaY) and 
trigger the core creates myself, at which point I gather that it will start 
recovery for each of them?

-- Nathan


Nathan Neulinger   nn...@neulinger.org
Neulinger Consulting   (573) 612-1412




--

Nathan Neulinger   nn...@neulinger.org
Neulinger Consulting   (573) 612-1412


Re: What is the "right" way to bring a failed SolrCloud node back online?

2014-01-26 Thread Mark Miller
We are working on a new mode (which should become the default) where ZooKeeper 
will be treated as the truth for a cluster.

This mode will be able to handle situations like this - if the cluster state 
says a core should exist on a node and it doesn’t, it will be created on 
startup.

The way things work currently is this kind of hybrid situation where the truth 
is partly in ZooKeeper and partly on each node. This is not ideal at all.

I think this new mode is very important, and it will be coming shortly. Until 
then, I’d recommend writing this logic externally as you suggest (I’ve seen it 
done before).

- Mark

http://about.me/markrmiller

On Jan 24, 2014, at 12:01 PM, Nathan Neulinger wrote:

> I have an environment where new collections are being added frequently 
> (isolated per customer), and the backup is virtually guaranteed to be missing 
> some of them.
> 
> As it stands, bringing up the restored/out-of-date instance results in those 
> collections being stuck in 'Recovering' state, because the cores don't exist 
> on the resulting server. This can also be extended to the case of restoring a 
> completely blank instance.
> 
> Is there any way to tell SolrCloud "Try recreating any missing cores for this 
> collection based on where you know they should be located"?
> 
> Or do I need to actually determine a list of cores (..._shardX_replicaY) and 
> trigger the core creates myself, at which point I gather that it will start 
> recovery for each of them?
> 
> -- Nathan
> 
> 
> Nathan Neulinger   nn...@neulinger.org
> Neulinger Consulting   (573) 612-1412



What is the "right" way to bring a failed SolrCloud node back online?

2014-01-24 Thread Nathan Neulinger
I have an environment where new collections are being added frequently (isolated per customer), and the backup is 
virtually guaranteed to be missing some of them.


As it stands, bringing up the restored/out-of-date instance results in those collections being stuck in 'Recovering' 
state, because the cores don't exist on the resulting server. This can also be extended to the case of restoring a 
completely blank instance.


Is there any way to tell SolrCloud "Try recreating any missing cores for this collection based on where you know they 
should be located"?


Or do I need to actually determine a list of cores (..._shardX_replicaY) and trigger the core creates myself, at which 
point I gather that it will start recovery for each of them?


-- Nathan


Nathan Neulinger   nn...@neulinger.org
Neulinger Consulting   (573) 612-1412