Re: Meet CorruptIndexException while shutdown one node in Solr cloud
Hi Erick,

Thanks for your advice that having openSearcher set to true is unnecessary in my case. As for the CorruptIndexException issue, I think Solr should handle this quite well too, because I always shut down Tomcat gracefully. Recently I ran a couple of tests on this issue. When I keep posting update requests to Solr and stop one of the three Tomcat nodes in a single-shard cluster, it is easy to reproduce the CorruptIndexException, no matter whether the stopped node is the leader or a replica. So I think this is a bug in Solr. Any idea how I can avoid hitting this issue? For example, can I remove a node from ZooKeeper before stopping it?

Also please let me know whether rebooting the Tomcat nodes is the only way to resolve the memory issue. If I can control the field cache size, then the reboot is unnecessary.

Below is the trace from when I start Tomcat and first hit the CorruptIndexException:

2017-09-19 10:18:57,614 ERROR [RecoveryThread][RQ-Init] (SolrException.java:142) - SnapPull failed :org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
    at org.apache.solr.handler.SnapPuller.openNewSearcherAndUpdateCommitPoint(SnapPuller.java:673)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:493)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:337)
    at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:163)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:447)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Caused by: org.apache.lucene.index.CorruptIndexException: liveDocs.count()=10309577 info.docCount=15057819 info.getDelCount()=4748252 (filename=_4y65a_13g.del)
    at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:96)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
    at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:238)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:104)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:422)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:279)
    at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
    ... 7 more

Regards,
Geng, Wei

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
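[Editor's note: for the "remove one node before stopping it" question above, one option in the 4.x line is the SolrCloud Collections API: DELETEREPLICA removes a replica from the cluster state before the node is stopped, and ADDREPLICA (Solr 4.8+) can re-add it after the restart. A minimal sketch follows; the host, port, collection, shard, and core-node names are placeholders, not values from this thread.]

```python
from urllib.parse import urlencode

def delete_replica_url(base_url, collection, shard, replica):
    """Build the Collections API request that removes one replica
    from the SolrCloud cluster state (action=DELETEREPLICA)."""
    params = urlencode({
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
    })
    return "%s/admin/collections?%s" % (base_url.rstrip("/"), params)

# Placeholder values -- substitute your own node, collection, and replica name
# (the replica name is the core_nodeN entry shown in clusterstate.json).
url = delete_replica_url("http://solr-host:8080/solr",
                         "mycollection", "shard1", "core_node2")
print(url)
# Issue this URL (e.g. with urllib.request or curl) before "catalina.sh stop",
# then re-add the replica after restart with action=ADDREPLICA.
```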
Re: Meet CorruptIndexException while shutdown one node in Solr cloud
bq: "This means Solr may get update request during shutdown. I think that is the reason we get CorruptIndexException."

This is unlikely; Solr should handle this quite well. More likely you encountered some other issue. One possibility is that you had a disk-full situation and that was the root of your problem.

I'll add as an aside that having openSearcher set to true in your autoCommit setting _and_ setting autoSoftCommit is unnecessary; choose one or the other. See:
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Fri, Sep 15, 2017 at 3:55 AM, wg85907 wrote:
> Hi team,
>     Currently I am using Solr 4.10 in Tomcat. I have a one-shard Solr
> Cloud with 3 replicas, and I set the heap size to 15GB for each node.
> Because we have a big data volume and a large amount of query traffic,
> we frequently hit long full-GC pauses. We investigated and found that
> much of the memory was being used by Solr's field cache. To work around
> this, we began rebooting the Tomcat instances one by one on a schedule.
> We don't kill any process; we run "catalina.sh stop" to shut Tomcat down
> gracefully. To keep messages from piling up, we receive messages from
> users continuously and send an update request to Solr as soon as a new
> message arrives. This means Solr may receive update requests during
> shutdown, and I think that is why we get the CorruptIndexException.
> Since we started doing these reboots, we always get the
> CorruptIndexException. The trace is as below:
>
> 2017-09-14 04:25:49,241 ERROR [commitScheduler-15-thread-1][R31609] (CommitTracker) - auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher
>     at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
>     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:607)
>     at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.lucene.index.CorruptIndexException: liveDocs.count()=33574 info.docCount=34156 info.getDelCount()=584 (filename=_1uvck_k.del)
>     at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:96)
>     at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
>     at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
>     at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:282)
>     at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
>     at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
>     at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
>     at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:279)
>     at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
>     at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
>     ... 10 more
>
> As we shut down Solr gracefully, I think Solr should be strong enough
> to handle this case. Please give me some advice about why this happens
> and what we can do to avoid it. PS: below is some of our solrConfig
> content (the XML tags were stripped by the mail archive; only the
> values survive): 6 true 1000
>
> Regards,
> Geng, Wei
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
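[Editor's note: Erick's "choose one or the other" advice maps onto solrconfig.xml roughly as in the sketch below. This is a generic example, not the poster's actual configuration (the thread's XML was stripped by the archive), and the maxTime values are illustrative only.]

```xml
<!-- Hard commit: regularly flushes and fsyncs the index for durability,
     but does not open a new searcher. -->
<autoCommit>
  <maxTime>60000</maxTime>          <!-- illustrative value, in ms -->
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: controls when newly indexed documents become visible
     to searches. This is the one place to configure visibility. -->
<autoSoftCommit>
  <maxTime>1000</maxTime>           <!-- illustrative value, in ms -->
</autoSoftCommit>
```

With openSearcher=false on the hard commit, visibility is governed entirely by the soft commit, which is the combination the linked Lucidworks article recommends for SolrCloud.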
Meet CorruptIndexException while shutdown one node in Solr cloud
Hi team,

Currently I am using Solr 4.10 in Tomcat. I have a one-shard Solr Cloud with 3 replicas, and I set the heap size to 15GB for each node. Because we have a big data volume and a large amount of query traffic, we frequently hit long full-GC pauses. We investigated and found that much of the memory was being used by Solr's field cache. To work around this, we began rebooting the Tomcat instances one by one on a schedule. We don't kill any process; we run "catalina.sh stop" to shut Tomcat down gracefully. To keep messages from piling up, we receive messages from users continuously and send an update request to Solr as soon as a new message arrives. This means Solr may receive update requests during shutdown, and I think that is why we get the CorruptIndexException. Since we started doing these reboots, we always get the CorruptIndexException. The trace is as below:

2017-09-14 04:25:49,241 ERROR [commitScheduler-15-thread-1][R31609] (CommitTracker) - auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:607)
    at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.index.CorruptIndexException: liveDocs.count()=33574 info.docCount=34156 info.getDelCount()=584 (filename=_1uvck_k.del)
    at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:96)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
    at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:282)
    at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
    at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:279)
    at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
    ... 10 more

As we shut down Solr gracefully, I think Solr should be strong enough to handle this case. Please give me some advice about why this happens and what we can do to avoid it. PS: below is some of our solrConfig content (the XML tags were stripped by the mail archive; only the values survive): 6 true 1000

Regards,
Geng, Wei

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
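[Editor's note: the numbers in the CorruptIndexException message encode the consistency check that Lucene40LiveDocsFormat.readLiveDocs performs when opening a .del file: the live-doc count must equal the segment's docCount minus its delCount. Plugging in the values from the trace above shows the invariant fails by two documents, which is exactly why the reader refuses to open the segment:]

```python
# Values copied from the CorruptIndexException in the trace above.
live_docs_count = 33574   # liveDocs.count()
doc_count = 34156         # info.docCount
del_count = 584           # info.getDelCount()

# Lucene expects: liveDocs.count() == docCount - delCount
expected_live = doc_count - del_count
print("expected live docs:", expected_live)           # 33572
print("actual live docs:  ", live_docs_count)         # 33574
print("mismatch:", live_docs_count - expected_live)   # 2

# The on-disk .del file disagrees with the segment metadata by 2 documents,
# so readLiveDocs throws CorruptIndexException instead of opening the reader.
assert live_docs_count != expected_live
```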