Hi all,

It seems I have a corrupt index on disk on my Master, but the live IndexReader is still working. I don't want to restart Solr (1.4), because I'm pretty sure the corrupt index will be loaded on restart, forcing me to delete the index and rebuild it from source. Is there any way to restore the index to disk from a "live" IndexReader?
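One idea I've been toying with, in case it helps frame the question: Lucene's IndexWriter can copy the contents of an open reader into a fresh directory via addIndexes(IndexReader[]). This is only a sketch, untested against my setup — the destination path is a placeholder, and I'm not yet sure of the cleanest way to get a handle on the live reader from the running core (SolrIndexSearcher.getReader(), perhaps?):

```java
// Sketch (untested): write the live reader's current view of the index
// into a fresh on-disk directory. The destination path is a placeholder.
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class RescueIndex {
    public static void rescue(IndexReader live) throws Exception {
        // Fresh directory so we never touch the (possibly corrupt) original.
        Directory dest = FSDirectory.open(new File("/opt/solr/rescued-index"));
        IndexWriter writer = new IndexWriter(dest,
                new StandardAnalyzer(Version.LUCENE_29),
                true /* create */, IndexWriter.MaxFieldLength.UNLIMITED);
        // Copy everything the live reader can still see.
        writer.addIndexes(new IndexReader[] { live });
        writer.optimize(); // optional: merge down to a single clean segment
        writer.close();
    }
}
```

If that works, I'd then point the core's data dir at the rescued copy before restarting. Does anyone know whether this is safe to do against a reader whose underlying files are partially corrupt?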
Here's the behavior I'm seeing, in case anyone wants more details:

1. I have an active Master running 30 Solr cores. When I search on the Master, it successfully searches across all 30 cores. Several of the cores are replicating properly to the slaves, and several are not.

2. In the Master logs, I see the following:

Feb 24, 2010 4:48:57 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.io.IOException: read past EOF
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
--
Feb 24, 2010 4:49:08 AM org.apache.solr.update.DirectUpdateHandler2$CommitTracker run
SEVERE: auto commit error...
Feb 24, 2010 4:49:08 AM org.apache.solr.core.SolrDeletionPolicy onCommit
--
Feb 24, 2010 4:49:18 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _311: fieldsReader shows 16 but segmentInfo shows 211
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
--
Feb 24, 2010 4:49:18 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _311: fieldsReader shows 16 but segmentInfo shows 211
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)

3. In the failing slaves' logs, I see the following repeated over and over:

INFO: Skipping download for /opt/solr/cores/28/index/_3mo.fdt
[This line is repeated for every file in the index]
...
Feb 24, 2010 11:00:30 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Total time taken for download : 0 secs
Feb 24, 2010 11:00:30 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
Feb 24, 2010 11:00:30 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed org.apache.solr.common.SolrException: Index fetch failed :
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)

4. If I stop and restart Solr, I get the following error when I try to hit the Admin page:

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError>
in solr.xml
-------------------------------------------------------------
java.lang.RuntimeException: java.io.IOException: read past EOF
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:579)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:428)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:278)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at ...

5. If I delete all of the indexes on a slave and restart Solr there, the core loads up fine (no "Configuration Error" message) and starts trying to replicate from the Master again.

6. Back to step 1.
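Separately, before restarting the Master, I'm tempted to run Lucene's CheckIndex tool against a copy of the suspect core's on-disk index to see exactly which segments are damaged. I'd only ever run -fix on a copy, since it drops unreadable segments (and their documents). The jar name below is a guess for a Solr 1.4 install — adjust to whatever lucene-core jar ships with your distribution:

```shell
# Work on a copy so the original index is never modified.
cp -r /opt/solr/cores/28/index /opt/solr/cores/28/index.bak

# Diagnose: report per-segment status of the copied index.
java -cp lucene-core-2.9.1.jar \
     org.apache.lucene.index.CheckIndex /opt/solr/cores/28/index.bak

# If the report shows broken segments and losing those docs is acceptable:
# java -cp lucene-core-2.9.1.jar \
#      org.apache.lucene.index.CheckIndex /opt/solr/cores/28/index.bak -fix
```

Has anyone used CheckIndex this way on a 1.4 master while the corrupt generation is still the live one?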