Hi all,

It seems I have a corrupt index on disk on my Master, but the live IndexReader is still working. I don't want to restart Solr (1.4), because I'm pretty sure the corrupt index will be loaded on restart, forcing me to delete the index and rebuild it from source. Is there any way to restore the index to disk from a "live" IndexReader?
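One idea I've been toying with, in case it helps frame the question: Lucene's IndexWriter can copy the contents of an open reader into a fresh directory via addIndexes(IndexReader[]). This is only a sketch, untested against my setup — the destination path is a placeholder, and I'm not yet sure of the cleanest way to get a handle on the live reader from the running core (SolrIndexSearcher.getReader(), perhaps?):

```java
// Sketch (untested): write the live reader's current view of the index
// into a fresh on-disk directory. The destination path is a placeholder.
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class RescueIndex {
    public static void rescue(IndexReader live) throws Exception {
        // Fresh directory so we never touch the (possibly corrupt) original.
        Directory dest = FSDirectory.open(new File("/opt/solr/rescued-index"));
        IndexWriter writer = new IndexWriter(dest,
                new StandardAnalyzer(Version.LUCENE_29),
                true /* create */, IndexWriter.MaxFieldLength.UNLIMITED);
        // Copy everything the live reader can still see.
        writer.addIndexes(new IndexReader[] { live });
        writer.optimize(); // optional: merge down to a single clean segment
        writer.close();
    }
}
```

If that works, I'd then point the core's data dir at the rescued copy before restarting. Does anyone know whether this is safe to do against a reader whose underlying files are partially corrupt?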
Here's the behavior I'm seeing, in case anyone wants more details:

1. I have an active Master running 30 Solr cores. When I search on the Master, it successfully searches across all 30 cores. Several of the cores are replicating properly to the slaves, and several are not.

2. In the Master logs, I see the following:

Feb 24, 2010 4:48:57 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.io.IOException: read past EOF
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
--
Feb 24, 2010 4:49:08 AM org.apache.solr.update.DirectUpdateHandler2$CommitTracker run
SEVERE: auto commit error...
Feb 24, 2010 4:49:08 AM org.apache.solr.core.SolrDeletionPolicy onCommit
--
Feb 24, 2010 4:49:18 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _311: fieldsReader shows 16 but segmentInfo shows 211
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
--
Feb 24, 2010 4:49:18 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _311: fieldsReader shows 16 but segmentInfo shows 211
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)

3. In the failing slaves' logs, I see the following repeated over and over:

INFO: Skipping download for /opt/solr/cores/28/index/_3mo.fdt
[This line is repeated for every file in the index]
...
Feb 24, 2010 11:00:30 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Total time taken for download : 0 secs
Feb 24, 2010 11:00:30 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
Feb 24, 2010 11:00:30 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed org.apache.solr.common.SolrException: Index fetch failed :
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)

4. If I stop and restart Solr, I get the following error when I try to hit the Admin page:

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError>
in solr.xml
-------------------------------------------------------------
java.lang.RuntimeException: java.io.IOException: read past EOF
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:579)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:428)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:278)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at ...

5. If I delete all of the indexes on a slave and restart Solr there, the core loads up fine (no "Configuration Error" message) and starts trying to replicate from the Master again.

6. Back to step 1.
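Separately, before restarting the Master, I'm tempted to run Lucene's CheckIndex tool against a copy of the suspect core's on-disk index to see exactly which segments are damaged. I'd only ever run -fix on a copy, since it drops unreadable segments (and their documents). The jar name below is a guess for a Solr 1.4 install — adjust to whatever lucene-core jar ships with your distribution:

```shell
# Work on a copy so the original index is never modified.
cp -r /opt/solr/cores/28/index /opt/solr/cores/28/index.bak

# Diagnose: report per-segment status of the copied index.
java -cp lucene-core-2.9.1.jar \
     org.apache.lucene.index.CheckIndex /opt/solr/cores/28/index.bak

# If the report shows broken segments and losing those docs is acceptable:
# java -cp lucene-core-2.9.1.jar \
#      org.apache.lucene.index.CheckIndex /opt/solr/cores/28/index.bak -fix
```

Has anyone used CheckIndex this way on a 1.4 master while the corrupt generation is still the live one?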