Mike -- Thanks so much for the prompt reply. You are right, we are accessing these documents with multiple threads (and have always been). However, I am wondering if the increased indexing speed in 2.3 has revealed a hidden concurrency issue.
I am going to add in some additional concurrency checks, if I have questions I will post another question to the list. Thanks again for the reply. Tyler On Fri, Feb 29, 2008 at 2:46 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Not good! (I'm sorry). > > That first exception is worrisome. It's the root cause here. > > Can you describe your documents? That exception, if I'm reading it > right, seems to imply that you have documents with 4762 fields. Is > that right? > > Are you using multiple threads? Is it possible that you cal > addDocument with a given document, but another thread removing fields > from that document (or, removing elements from the List returned by > Document.getFields())? That's the only way I can explain that first > exception. > > Mike > > > > Tyler V wrote: > > > After upgrading to Lucene 2.3 (and subsequently 2.3.1), our > > application has experienced sporadic index corruptions on our larger > > (and more frequently updated) indexes. These indexes experienced no > > corruptions under any prior version of Lucene (which we have been > > using for several years). > > > > The pattern of failure seems to be consistent. First, we receive an > > exception like the following: > > > > java.lang.IndexOutOfBoundsException: Index: 4788, Size: 4762 > > at java.util.ArrayList.RangeCheck(ArrayList.java:547) > > at java.util.ArrayList.get(ArrayList.java:322) > > at org.apache.lucene.index.DocumentsWriter$ThreadState.init > > (DocumentsWriter.java:749) > > at org.apache.lucene.index.DocumentsWriter.getThreadState > > (DocumentsWriter.java:2391) > > at org.apache.lucene.index.DocumentsWriter.updateDocument > > (DocumentsWriter.java:2434) > > at org.apache.lucene.index.DocumentsWriter.addDocument > > (DocumentsWriter.java:2422) > > at org.apache.lucene.index.IndexWriter.addDocument > > (IndexWriter.java:1445) > > at org.apache.lucene.index.IndexWriter.addDocument > > (IndexWriter.java:1424) > > at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java: > > 134) > > at java.lang.Thread.run(Thread.java:619) > > > > When we experience this error, we run a writer.flush() then a > > writer.close(). > > > > Then, we get this exception when trying to re-open the index: > > > > org.apache.lucene.index.CorruptIndexException: doc counts differ for > > segment _c2z13: fieldsReader shows 2 but segmentInfo shows 3 > > at org.apache.lucene.index.SegmentReader.initialize > > (SegmentReader.java:313) > > at org.apache.lucene.index.SegmentReader.get > > (SegmentReader.java:262) > > at org.apache.lucene.index.SegmentReader.get > > (SegmentReader.java:197) > > at org.apache.lucene.index.MultiSegmentReader.<init> > > (MultiSegmentReader.java:55) > > at org.apache.lucene.index.DirectoryIndexReader$1.doBody > > (DirectoryIndexReader.java:75) > > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run > > (SegmentInfos.java:636) > > at org.apache.lucene.index.DirectoryIndexReader.open > > (DirectoryIndexReader.java:63) > > at org.apache.lucene.index.IndexReader.open > > (IndexReader.java:209) > > at org.apache.lucene.index.IndexReader.open > > (IndexReader.java:192) > > at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java: > > 107) > > at java.lang.Thread.run(Thread.java:619) > > > > Running the check index application included with 2.3 enables us to > > remove the bad documents from the index, but this workaround is less > > than desirable. It would be greatly appreciated if anyone could shed > > some light on our issue. > > > > Regards, > > > > Tyler > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]