Re: Corrupted Indexes Under Lucene 2.3 (and 2.3.1)

Michael McCandless Fri, 29 Feb 2008 14:47:21 -0800


Not good!  (I'm sorry).


That first exception is worrisome.  It's the root cause here.

Can you describe your documents? That exception, if I'm reading itright, seems to imply that you have documents with 4762 fields. Isthat right?

Are you using multiple threads? Is it possible that you caladdDocument with a given document, but another thread removing fieldsfrom that document (or, removing elements from the List returned byDocument.getFields())? That's the only way I can explain that firstexception.


Mike

Tyler V wrote:

After upgrading to Lucene 2.3 (and subsequently 2.3.1), our
application has experienced sporadic index corruptions on our larger
(and more frequently updated) indexes. These indexes experienced no
corruptions under any prior version of Lucene (which we have been
using for several years).

The pattern of failure seems to be consistent.  First, we receive an
exception like the following:

java.lang.IndexOutOfBoundsException: Index: 4788, Size: 4762
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.lucene.index.DocumentsWriter$ThreadState.init(DocumentsWriter.java:749)at org.apache.lucene.index.DocumentsWriter.getThreadState(DocumentsWriter.java:2391)at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2434)at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2422)at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:134)
        at java.lang.Thread.run(Thread.java:619)
When we experience this error, we run a writer.flush() then awriter.close().
Then, we get this exception when trying to re-open the index:

org.apache.lucene.index.CorruptIndexException: doc counts differ for
segment _c2z13: fieldsReader shows 2 but segmentInfo shows 3
at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)at org.apache.lucene.index.IndexReader.open(IndexReader.java:192)at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:107)
        at java.lang.Thread.run(Thread.java:619)

Running the check index application included with 2.3 enables us to
remove the bad documents from the index, but this workaround is less
than desirable.  It would be greatly appreciated if anyone could shed
some light on our issue.

Regards,

Tyler

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Corrupted Indexes Under Lucene 2.3 (and 2.3.1)

Reply via email to