[ https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Forest Soup updated SOLR-7069:
------------------------------
    Description:
When querying a collection that has a core in the "down" state, if the request is sent to the server hosting the "down" core while that server itself is active, the query does not fail over to the good replica of the same shard on another server.

The steps to put a core in the "down" state on an active server are:
1. Delete the contents of the core's data folder.
2. Restart the Solr server that hosts the core.

The core then shows as "down" while the other cores on the same server are still active. See attached picture.

When we issue a query to the collection and the request goes to the server containing the "down" core, we receive the errors below:

HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher,trace=org.apache.solr.common.SolrException: SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    ... 1 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:827)
    ... 11 more
Caused by: java.io.FileNotFoundException: /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No such file or directory)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193)
    at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
    at org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:404)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:400)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:741)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1484)
    ... 13 more
,code=500}

  was:
When querying a collection with a core in "down" state, if we send the request to the server containing the "down" core, while the server is active, it cannot failover to the good replica of same shard on another server.
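
For illustration only, the failover behavior this report expects could be sketched roughly as below. This is not Solr's actual routing code: the CLUSTER_STATE dictionary, the host URLs, the replica1 core name and the pick_replica helper are all hypothetical, chosen just to show a node skipping its own "down" core and routing to an active peer of the same shard.

```python
from typing import Optional

# Hypothetical, simplified view of one shard's replicas. This is NOT the
# real format of Solr's cluster state; hosts and replica1 are made up.
CLUSTER_STATE = {
    "shard1": [
        {"core": "collection5_shard1_replica1",
         "base_url": "http://host1:8080/solr", "state": "active"},
        {"core": "collection5_shard1_replica2",
         "base_url": "http://host2:8080/solr", "state": "down"},
    ],
}


def pick_replica(replicas: list, local_base_url: str) -> Optional[dict]:
    """Pick a replica to serve a query for one shard.

    Prefer an active replica on the local node; otherwise fall back to
    any active replica on another node; return None if none is active.
    """
    active = [r for r in replicas if r["state"] == "active"]
    for replica in active:
        if replica["base_url"] == local_base_url:
            return replica
    return active[0] if active else None


# The request arrives at host2, whose local core is "down".
chosen = pick_replica(CLUSTER_STATE["shard1"], "http://host2:8080/solr")
print(chosen["core"] if chosen else "no active replica")
```

With the sample data above, the node hosting the "down" replica2 would hand the query to replica1 on the other host instead of returning the HTTP 500 shown in the description.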
> A down core(shard replica) on an active node cannot failover the query to its good peer
> ---------------------------------------------------------------------------------------
>
>                 Key: SOLR-7069
>                 URL: https://issues.apache.org/jira/browse/SOLR-7069
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.7
>            Reporter: Forest Soup