Jack Krupansky created SOLR-5014:
------------------------------------
Summary: The "no servers hosting shard" exception will never tell you the shard
Key: SOLR-5014
URL: https://issues.apache.org/jira/browse/SOLR-5014
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.3
Reporter: Jack Krupansky
If none of the replicas for a shard are available, the "no servers hosting
shard" exception is thrown. The code suggests that it will tell you which shard
is unavailable, but it does not. In fact, it can never do so.
The reason is that the code at that point doesn't actually know the shard name
or ID. The "shard" string passed to the HttpShardHandler#submit method is "the |
delimited list of equivalent servers". But that list, by definition, is
empty if all the servers for a shard are down.
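To illustrate the point, here is a minimal standalone sketch, not the actual Solr code. The class and method names (ShardErrorSketch, parseReplicaUrls, currentMessage, proposedMessage) are hypothetical, and the idea of passing a separate shard ID alongside the server list is an assumption about one possible fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Solr.
final class ShardErrorSketch {

    // Splits the '|'-delimited list of equivalent servers into individual URLs.
    // An empty string (all replicas down) yields an empty list.
    static List<String> parseReplicaUrls(String shard) {
        List<String> urls = new ArrayList<>();
        for (String url : shard.split("\\|")) {
            String trimmed = url.trim();
            if (!trimmed.isEmpty()) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    // Mirrors the reported behavior: the only thing available to append is the
    // server list, which is empty exactly when the error occurs.
    static String currentMessage(String shard) {
        return "no servers hosting shard: " + shard;
    }

    // Assumed fix: the caller also supplies the logical shard ID (e.g. "shard2"),
    // so the message can name the shard even when the server list is empty.
    static String proposedMessage(String shardId, String shard) {
        return "no servers hosting shard: " + shardId;
    }
}
```

With an empty server list, currentMessage("") ends in a bare colon and space, which matches the shards.info output in this report, while proposedMessage("shard2", "") would name the shard.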
What I want to see is the actual shard name/ID, like "shard2".
Here's the shards.info section from a response where the replicas for the second
shard of a two-shard cluster have all been intentionally killed.
{code}
"shards.info":{
"":{
"error":"org.apache.solr.common.SolrException: no servers hosting shard: ",
"trace":"org.apache.solr.common.SolrException: no servers hosting shard:
\r\n\tat
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)\r\n\tat
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)\r\n\tat
java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)\r\n\tat
java.util.concurrent.FutureTask.run(Unknown Source)\r\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\r\n\tat
java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)\r\n\tat
java.util.concurrent.FutureTask.run(Unknown Source)\r\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\r\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\r\n\tat
java.lang.Thread.run(Unknown Source)\r\n",
"time":0},
"207.237.114.232:8983/solr/collection1/|207.237.114.232:8985/solr/collection1/":{
"numFound":8,
"maxScore":1.0,
"time":98}},
{code}
The request:
{code}
curl "http://localhost:8983/solr/select/?q=*:*&indent=true&wt=json&shards.info=yes&shards.tolerant=yes"
{code}
The shards.info section for the same request after the two killed replicas are
restarted:
{code}
"shards.info":{
"207.237.114.232:8984/solr/collection1/|207.237.114.232:8986/solr/collection1/":{
"numFound":2,
"maxScore":1.0,
"time":224},
"207.237.114.232:8983/solr/collection1/|207.237.114.232:8985/solr/collection1/":{
"numFound":8,
"maxScore":1.0,
"time":898}},
{code}
Here's the cluster state for that cluster when all four nodes are up:
{code}
{"collection1":{
"shards":{
"shard1":{
"range":"80000000-ffffffff",
"state":"active",
"replicas":{
"core_node1":{
"state":"active",
"core":"collection1",
"node_name":"207.237.114.232:8983_solr",
"base_url":"http://207.237.114.232:8983/solr",
"leader":"true"},
"core_node3":{
"state":"active",
"core":"collection1",
"node_name":"207.237.114.232:8985_solr",
"base_url":"http://207.237.114.232:8985/solr"}}},
"shard2":{
"range":"0-7fffffff",
"state":"active",
"replicas":{
"core_node2":{
"state":"active",
"core":"collection1",
"node_name":"207.237.114.232:8984_solr",
"base_url":"http://207.237.114.232:8984/solr",
"leader":"true"},
"core_node4":{
"state":"active",
"core":"collection1",
"node_name":"207.237.114.232:8986_solr",
"base_url":"http://207.237.114.232:8986/solr"}}}},
"router":"compositeId"}}
{code}
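Until the exception names the shard, a client could in principle work out which shard is missing by comparing the cluster state against the keys in shards.info. The following is only a rough sketch under assumptions: the class MissingShardSketch and its method are hypothetical, the replica strings are assumed to be built from base_url plus core in the same form that appears in the shards.info keys, and the caller is assumed to have already parsed both JSON structures.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical client-side helper; not part of Solr or SolrJ.
final class MissingShardSketch {

    // Returns the names of shards for which no replica appears in any
    // shards.info key. replicasByShard maps a shard name (e.g. "shard2") to
    // replica addresses in the form used by shards.info keys.
    static Set<String> missingShards(Map<String, List<String>> replicasByShard,
                                     Set<String> shardsInfoKeys) {
        Set<String> missing = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : replicasByShard.entrySet()) {
            boolean reported = false;
            for (String key : shardsInfoKeys) {
                for (String replica : entry.getValue()) {
                    // Skip empty replica strings, which would match every key.
                    if (!replica.isEmpty() && key.contains(replica)) {
                        reported = true;
                    }
                }
            }
            if (!reported) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```

For the failing response in this report, where the keys are the empty string and the 8983|8985 pair, this would report shard2 as missing. It is a workaround sketch, not a substitute for fixing the message itself.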