[ https://issues.apache.org/jira/browse/LUCENE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981649#action_12981649 ]
Nick Pellow edited comment on LUCENE-2666 at 1/14/11 2:17 AM:
--------------------------------------------------------------

Hi,

I am getting this issue as well. We are doing quite a lot of updates during indexing. Could this be causing the problem?

This seems to have happened only when we deployed to our Linux test server; it didn't appear to occur on Mac OS X during development with the same data set.

Does this only affect Lucene 3.0.2? Would a rollback be a good workaround?

The exact stack trace:

{code}
java.lang.ArrayIndexOutOfBoundsException: 5475
    at org.apache.lucene.util.BitVector.get(BitVector.java:104)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:127)
    at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:102)
    at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:207)
    at org.apache.lucene.search.PhrasePositions.skipTo(PhrasePositions.java:52)
    at org.apache.lucene.search.PhraseScorer.advance(PhraseScorer.java:120)
    at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:249)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:218)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:199)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:177)
    at org.apache.lucene.search.MultiSearcher$MultiSearcherCallableWithSort.call(MultiSearcher.java:410)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:230)
    at org.apache.lucene.search.Searcher.search(Searcher.java:49)
{code}

      was (Author: npellow):
    Hi,

I am getting this issue as well. We are doing quite a lot of updates during indexing. Could this be causing the problem?

This seems to have happened only when we deployed to our Linux test server; it didn't appear to occur on Mac OS X during development with the same data set.

Does this only affect Lucene 3.0.2? Would a rollback be a good workaround?

> ArrayIndexOutOfBoundsException when iterating over TermDocs
> -----------------------------------------------------------
>
>                 Key: LUCENE-2666
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2666
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 3.0.2
>            Reporter: Shay Banon
>
> A user got this very strange exception, and I managed to get the index that
> it happens on. Basically, iterating over the TermDocs causes an AIOOB
> exception. I easily reproduced it using the FieldCache, which does exactly
> that (the field in question is indexed as numeric). Here is the exception:
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 114
>     at org.apache.lucene.util.BitVector.get(BitVector.java:104)
>     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:127)
>     at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:501)
>     at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:183)
>     at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:470)
>     at TestMe.main(TestMe.java:56)
> It happens on the following segment: _26t docCount: 914 delCount: 1
> delFileName: _26t_1.del
> And as you can see, it smells like a corner case (it fails for document
> number 912; the AIOOB comes from the deleted docs). The code to recreate it
> is simple:
>     FSDirectory dir = FSDirectory.open(new File("index"));
>     IndexReader reader = IndexReader.open(dir, true);
>     IndexReader[] subReaders = reader.getSequentialSubReaders();
>     for (IndexReader subReader : subReaders) {
>         Field field = subReader.getClass().getSuperclass().getDeclaredField("si");
>         field.setAccessible(true);
>         SegmentInfo si = (SegmentInfo) field.get(subReader);
>         System.out.println("--> " + si);
>         if (si.getDocStoreSegment().contains("_26t")) {
>             // this is the problematic one...
>             System.out.println("problematic one...");
>             FieldCache.DEFAULT.getLongs(subReader, "__documentdate",
>                 FieldCache.NUMERIC_UTILS_LONG_PARSER);
>         }
>     }
> Here is the result of a check index on that segment:
>   8 of 10: name=_26t docCount=914
>     compound=true
>     hasProx=true
>     numFiles=2
>     size (MB)=1.641
>     diagnostics = {optimize=false, mergeFactor=10,
> os.version=2.6.18-194.11.1.el5.centos.plus, os=Linux, mergeDocStores=true,
> lucene.version=3.0.2 953716 - 2010-06-11 17:13:53, source=merge,
> os.arch=amd64, java.version=1.6.0, java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_26t_1.del]
>     test: open reader.........OK [1 deleted docs]
>     test: fields..............OK [32 fields]
>     test: field norms.........OK [32 fields]
>     test: terms, freq, prox...ERROR [114]
> java.lang.ArrayIndexOutOfBoundsException: 114
>     at org.apache.lucene.util.BitVector.get(BitVector.java:104)
>     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:127)
>     at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:102)
>     at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:616)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:509)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:299)
>     at TestMe.main(TestMe.java:47)
>     test: stored fields.......ERROR [114]
> java.lang.ArrayIndexOutOfBoundsException: 114
>     at org.apache.lucene.util.BitVector.get(BitVector.java:104)
>     at org.apache.lucene.index.ReadOnlySegmentReader.isDeleted(ReadOnlySegmentReader.java:34)
>     at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:684)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:512)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:299)
>     at TestMe.main(TestMe.java:47)
>     test: term vectors........ERROR [114]
> java.lang.ArrayIndexOutOfBoundsException: 114
>     at org.apache.lucene.util.BitVector.get(BitVector.java:104)
>     at org.apache.lucene.index.ReadOnlySegmentReader.isDeleted(ReadOnlySegmentReader.java:34)
>     at org.apache.lucene.index.CheckIndex.testTermVectors(CheckIndex.java:721)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:515)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:299)
>     at TestMe.main(TestMe.java:47)
> The creation of the index does not do anything fancy (all defaults), though
> it does use the near-real-time reader (IndexWriter#getReader), which
> complicates deleted docs handling. It seems the deleted docs were written
> with a size that does not match the number of docs. Sadly, I don't have
> something that recreates it from scratch, but I do have the index if someone
> wants to have a look at it (mail me directly and I will provide a download
> link).
> I will continue to investigate why this might happen; I am just wondering
> whether someone has stumbled on this exception before. Lucene 3.0.2 is used.
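
For anyone following along, the numbers in the report are consistent with a deleted-docs bit set that is shorter than the segment's docCount: if one bit is stored per document, doc 912 lives in byte 912 >> 3 = 114, which is exactly the index in the exception. The sketch below is only an illustration of that arithmetic, not Lucene code. `MiniBitVector`, `firstFailingDoc`, the `(size >> 3) + 1` sizing, and the stale size of 910 are all assumptions made up for the example; only docCount 914, doc 912, and index 114 come from the report.

```java
public class DeletedDocsSizeMismatch {

    /**
     * Simplified stand-in for a byte-backed bit set.
     * Hypothetical class for illustration; this is NOT Lucene's BitVector.
     * Assumes one bit per document, eight documents per byte.
     */
    static final class MiniBitVector {
        private final byte[] bits;

        MiniBitVector(int size) {
            // Assumed sizing rule: enough bytes to hold `size` bits.
            bits = new byte[(size >> 3) + 1];
        }

        void set(int bit) {
            bits[bit >> 3] |= (byte) (1 << (bit & 7));
        }

        boolean get(int bit) {
            // Throws ArrayIndexOutOfBoundsException once bit >> 3 >= bits.length.
            return (bits[bit >> 3] & (1 << (bit & 7))) != 0;
        }

        int byteLength() {
            return bits.length;
        }
    }

    /**
     * Probes every doc ID in [0, docCount) the way a term-docs iteration
     * would consult the deleted-docs bits.
     * Returns the first doc ID whose lookup fails, or -1 if none fails.
     */
    static int firstFailingDoc(MiniBitVector deleted, int docCount) {
        for (int doc = 0; doc < docCount; doc++) {
            try {
                deleted.get(doc);
            } catch (ArrayIndexOutOfBoundsException e) {
                return doc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int docCount = 914; // docCount of segment _26t in the report

        // Hypothetical stale bit set, sized for fewer docs than the segment holds.
        MiniBitVector stale = new MiniBitVector(910);
        // Bit set sized to match the segment.
        MiniBitVector matching = new MiniBitVector(docCount);

        System.out.println("stale bytes:    " + stale.byteLength());
        System.out.println("first failure:  " + firstFailingDoc(stale, docCount));
        System.out.println("matching bytes: " + matching.byteLength());
        System.out.println("first failure:  " + firstFailingDoc(matching, docCount));
    }
}
```

Under these assumptions the stale bit set has 114 bytes and the first failing lookup is doc 912, while the matching one has 115 bytes and never fails. By the same arithmetic, index 5475 in the stack trace from the comment would correspond to a doc ID between 43800 and 43807. This does not explain how the sizes got out of sync; it only shows why a mismatch would surface at these indexes.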