Got it. I don't know whether this corruption was caused by a hardware failure, but it is possible, since we have frequent power failures here. Still, I've been using Lucene for a long time and have never seen this kind of exception before.
I'd like to delete this document, but now I get another exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 106577
    at org.apache.lucene.util.BitVector.set(BitVector.java:53)
    at org.apache.lucene.index.SegmentReader.doDelete(SegmentReader.java:301)
    at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:674)
    at org.apache.lucene.index.MultiReader.doDelete(MultiReader.java:125)
    at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:674)
    at teste.DeleteError.main(DeleteError.java:9)

Is there a way to fix my index without rebuilding it from scratch? Re-indexing my whole collection takes many hours.

On 7/24/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 7/24/07, Rafael Rossini <[EMAIL PROTECTED]> wrote:
> I did a little debugging and found that in the TermScorer, the byte[] norms has
> size = 1.119.933, which is the number of docs in my index, and there is a
> docID = 1226511, that is, if the "doc" variable in the method is the docID.
>
> I tried to access this document with reader.document() and got a
> java.io.IOException: read past EOF.
>
> Any ideas how to fix or delete this document?

That document does not exist (docids are just the index into the array of documents, which only goes up to 1.119.933, if that's maxDoc()). So the big question is how the "doc" variable got set to 1226511.

It sounds like perhaps index corruption to me. The question is whether it's due to a bug or a hardware fault.

-Yonik
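To make the point about docids concrete, here is a minimal, self-contained sketch of why an id past maxDoc() cannot be read or deleted. It does not use Lucene at all: the class DocIdCheck and its methods are hypothetical names made up for illustration, and the bit array is only a simplified stand-in for a per-segment deletion bit vector, not Lucene's actual BitVector code. The numbers are the ones quoted in this thread.

```java
// Hypothetical illustration only; not part of Lucene.
final class DocIdCheck {
    private DocIdCheck() {}

    // A docid is an index into the document array, so it must lie in [0, maxDoc).
    static boolean isValidDocId(int docId, int maxDoc) {
        return docId >= 0 && docId < maxDoc;
    }

    // Simplified stand-in for a deletion bit vector: one bit per document.
    static byte[] newDeletionBits(int maxDoc) {
        return new byte[(maxDoc >> 3) + 1];
    }

    // Sets the bit for docId. With no bounds check, an id beyond the array
    // fails with ArrayIndexOutOfBoundsException, as in the trace above.
    static void markDeleted(byte[] bits, int docId) {
        bits[docId >> 3] |= (byte) (1 << (docId & 7));
    }

    public static void main(String[] args) {
        int maxDoc = 1119933;   // number of docs reported in the thread
        int badDoc = 1226511;   // docid seen in TermScorer

        System.out.println("valid docid? " + isValidDocId(badDoc, maxDoc));

        byte[] deleted = newDeletionBits(maxDoc);
        try {
            markDeleted(deleted, badDoc);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("delete failed: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the bogus id fails the range check and the delete attempt throws, which matches the symptoms described: the id is outside the index's document range, so there is no document behind it to remove.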