Got it,

   I don´t have a clue if this corruption was caused by hardware failure,
but that is possible because we suffer with a lot of power failures from
time to time. But the thing is that I´ve been using lucene for a long time
and I never got this kind of exception.

   The thing is that I´d like to delete this document, but I get now
another exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array
index out of range: 106577
  at org.apache.lucene.util.BitVector.set(BitVector.java:53)
  at org.apache.lucene.index.SegmentReader.doDelete(SegmentReader.java:301)
  at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java
:674)
  at org.apache.lucene.index.MultiReader.doDelete(MultiReader.java:125)
  at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java
:674)
  at teste.DeleteError.main(DeleteError.java:9)

Is there a way of fixing my index without having to rebuild it all from the
ground? It takes lots of hours to re-index my whole collection.

On 7/24/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:

On 7/24/07, Rafael Rossini <[EMAIL PROTECTED]> wrote:
> I did a litle debug and found that in the TermScorer, the byte[] norms
has
> size = 1.119.933, wich is the number of docs on my index, and there is a
> docID = 1226511, that is if the "doc" variable in the method is the
docID.
>
> I tried to access this document with reader.document() and got a *
> java.io.IOException*: read past EOF.
>
> Any ideias how to fix or delete this document?

That document does not exist (docids are just the index into the array
of documents, which only goes up to 1.119.933 (if that's maxDoc()).
So the big question is how the "doc" variable got set to 1226511.

It sounds like perhaps index corruption to me.  The question is if
it's due to a bug or a hardware fault.

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to