[ 
https://issues.apache.org/jira/browse/SOLR-9811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706632#comment-15706632
 ] 

Scott Blum edited comment on SOLR-9811 at 11/29/16 9:46 PM:
------------------------------------------------------------

I've had cases where a single replica ended up stuck in the DOWN state, but all 
the other replicas on that machine were fine, and the replica itself was 
physically present and able to serve queries.  The only 'fix' in that situation 
is to reboot the node, but then you're interrupting service for collections 
that have no problems.

In those cases I've manually pushed a state update operation into the overseer queue in ZK, like this:

{code:title=From zk-shell}
create /solr/overseer/queue/qn- '{"core":"FOO_shard1_replica1","core_node_name":"core_node1","base_url":"http://1.1.1.1:8983/solr","node_name":"1.1.1.1:8983_solr","state":"active","shard":"shard1","collection":"FOO","operation":"state"}' false true
{code}

The correct values can be filled in from the result of querying CLUSTERSTATE on 
the affected collection.  Would be nice to have a tool for this.
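
As a rough illustration of what such a tool might do, here is a minimal Python sketch that fills in the message fields from cluster state and prints the zk-shell command rather than touching ZooKeeper itself. It assumes the cluster state has the nested shape returned by the Collections API CLUSTERSTATUS action (cluster, collections, shards, replicas); the function names are hypothetical, and the /solr chroot and the trailing "false true" arguments are copied from the command above, not derived independently.

```python
import json


def build_state_message(cluster_status, collection, shard, core_node_name,
                        state="active"):
    """Build an overseer 'state' message for one replica.

    cluster_status is assumed to be a parsed CLUSTERSTATUS-style dict;
    a missing collection, shard, or replica raises KeyError.
    """
    replica = (cluster_status["cluster"]["collections"][collection]
               ["shards"][shard]["replicas"][core_node_name])
    # Field names and order mirror the hand-written message above.
    return {
        "core": replica["core"],
        "core_node_name": core_node_name,
        "base_url": replica["base_url"],
        "node_name": replica["node_name"],
        "state": state,
        "shard": shard,
        "collection": collection,
        "operation": "state",
    }


def zk_shell_create_command(message, queue_path="/solr/overseer/queue"):
    """Render the zk-shell 'create' line for the message.

    The payload is wrapped in single quotes, so this sketch assumes no
    field value contains a single quote. 'false true' is copied verbatim
    from the manual command.
    """
    payload = json.dumps(message, separators=(",", ":"))
    return "create {}/qn- '{}' false true".format(queue_path, payload)


if __name__ == "__main__":
    # Hypothetical cluster state for the FOO example used above.
    sample = {"cluster": {"collections": {"FOO": {"shards": {"shard1": {
        "replicas": {"core_node1": {
            "core": "FOO_shard1_replica1",
            "base_url": "http://1.1.1.1:8983/solr",
            "node_name": "1.1.1.1:8983_solr",
            "state": "down",
        }}}}}}}}
    msg = build_state_message(sample, "FOO", "shard1", "core_node1")
    print(zk_shell_create_command(msg))
```

Printing the command instead of writing to ZK keeps a human in the loop, which seems appropriate for an operation that bypasses the normal recovery path.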


> Make it easier to manually execute overseer commands
> ----------------------------------------------------
>
>                 Key: SOLR-9811
>                 URL: https://issues.apache.org/jira/browse/SOLR-9811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Mike Drob
>
> Sometimes SolrCloud will get into a bad state w.r.t. election or recovery and 
> it would be useful to have the ability to manually publish a node as active 
> or leader. This would be an alternative to some current ops practices of 
> restarting services, which may take a while to complete given many cores 
> hosted on a single server.
> This is an expert operator technique and readers should be made aware of 
> this, a.k.a. the "I don't care, just get it running" approach.


