[ 
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7069:
------------------------------
    Description: 
When querying a collection that has a core in the "down" state, if the request 
is sent to the server hosting the "down" core while that server itself is 
active, the query does not fail over to the healthy replica of the same shard 
on another server.
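
The expected behaviour can be approximated on the client side. The following 
is a minimal sketch, not Solr's own routing logic: it assumes the cluster state 
has already been read and parsed into a dict shaped like the 4.x 
clusterstate.json (collection -> shards -> replicas, each replica carrying 
"state", "node_name", "base_url" and "core"). The function names, the `send` 
callback and any host names passed in are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List


def active_replica_urls(cluster_state: Dict[str, Any], collection: str,
                        shard: str, live_nodes: Iterable[str]) -> List[str]:
    """Return core URLs of replicas that are 'active' and on a live node."""
    live = set(live_nodes)
    replicas = cluster_state[collection]["shards"][shard]["replicas"]
    urls = []
    for replica in replicas.values():
        # Skip replicas that are down/recovering or whose node is not live.
        if replica.get("state") != "active":
            continue
        if replica.get("node_name") not in live:
            continue
        urls.append(replica["base_url"].rstrip("/") + "/" + replica["core"])
    return urls


def query_with_failover(urls: List[str], send: Callable[[str], Any]) -> Any:
    """Try each URL in order; return the first successful response."""
    last_error = None
    for url in urls:
        try:
            return send(url)  # `send` is supplied by the caller (assumption)
        except Exception as exc:  # remember the failure, try the next replica
            last_error = exc
    raise RuntimeError("no replica answered the query") from last_error
```

Filtering on replica state before sending avoids hitting the "down" core at 
all; the retry loop covers the case where the state read is stale.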

The steps to put a core into the "down" state on an active server are:
1. Delete the contents of the core's data folder.
2. Restart the Solr server that hosts the core.
The core then shows as "down" while the other cores on the same server remain 
active. See the attached picture.
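
Step 1 above can be sketched in Python as follows. This is a hedged 
illustration only: `empty_directory` is a hypothetical helper, and the restart 
in step 2 depends on how Solr is deployed (Tomcat, judging by the stack trace 
in this report), so it is left as a comment rather than guessed at. It 
deletes data irreversibly, so it should only be pointed at a disposable test 
core.

```python
import shutil
from pathlib import Path


def empty_directory(data_dir: Path) -> int:
    """Delete everything inside data_dir but keep data_dir itself.

    Returns the number of top-level entries removed.
    """
    removed = 0
    for entry in data_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # e.g. the index/ and tlog/ subdirectories
        else:
            entry.unlink()        # plain files and symlinks
        removed += 1
    return removed


# Step 2 (not shown): restart the servlet container hosting Solr using
# whatever mechanism the installation provides.
```

For the core in this report the argument would be the `data` directory under 
`/mnt/solrdata1/solr/home/collection5_shard1_replica2`.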

When we issue a query to the collection and send the request to the server 
hosting the "down" core, we receive the following error:
HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher,trace=org.apache.solr.common.SolrException: SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    ... 1 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:827)
    ... 11 more
Caused by: java.io.FileNotFoundException: /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No such file or directory)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193)
    at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
    at org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:404)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:400)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:741)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1484)
    ... 13 more
,code=500}
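
The nested causes in the trace above are easy to miss. As a small sketch (the 
helper name `cause_chain` is hypothetical), the chain of "Caused by" exception 
classes can be pulled out of the error text, whether or not the mail client 
has re-wrapped it, to show that the innermost cause is the missing index file:

```python
import re
from typing import List


def cause_chain(trace_text: str) -> List[str]:
    """Return the exception class names that follow each 'Caused by:'."""
    # Collapse all whitespace so line wrapping inside the trace is irrelevant.
    normalized = " ".join(trace_text.split())
    # A class name is made of word characters, dots and '$' (inner classes).
    return re.findall(r"Caused by: ([\w.$]+)", normalized)
```

Applied to the error above, the last element of the returned list is 
`java.io.FileNotFoundException`, which matches the deleted data folder from 
the reproduction steps.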

  was:When querying a collection with a core in "down" state, if we send the 
request to the server containing the "down" core, while the server is active, 
it cannot failover to the good replica of same shard on another server.


> A down core(shard replica) on an active node cannot failover the query to its 
> good peer
> ---------------------------------------------------------------------------------------
>
>                 Key: SOLR-7069
>                 URL: https://issues.apache.org/jira/browse/SOLR-7069
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.7
>            Reporter: Forest Soup
>


