By "does not help" do you mean CheckIndex never detects this
corruption, yet you then hit that exception when searching?
By "reopening fails" what do you mean? I thought reopen works fine,
but then it's only the search that fails?
Mike
Sascha Fahl wrote:
Checking the index after adding documents and before reopening the
IndexReader does not help. After adding documents nothing bad
happens and CheckIndex says the index is all right. When I check
the index just before the reopen, CheckIndex does not detect any
corruption and says the index is OK, and yet reopening fails.
Sascha
On 30.06.2008 at 18:34, Michael McCandless wrote:
This is spooky: that exception means you have some sort of index
corruption. The TermScorer thinks it found a doc ID 37389, which
is out of bounds.
Reopening IndexReader while IndexWriter is writing should be
completely fine.
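For readers following along, here is a minimal sketch of the "reopen every N requests if the reader is not current" pattern discussed in this thread. It is not Lucene code: ReopenableReader, ReaderHolder and FakeReader are hypothetical names, loosely modelled on the isCurrent/reopen behaviour mentioned in the thread, and the in-memory fake exists only so the sketch runs on its own.

```java
/** Hypothetical stand-in for the reader operations used below; this is not Lucene's API. */
interface ReopenableReader {
    boolean isCurrent();          // false once the underlying index has changed
    ReopenableReader reopen();    // returns this if nothing changed, else a new reader
    void close();
}

/** Hands out a reader and refreshes it on every N-th request if it is stale. */
final class ReaderHolder {
    private final int checkEvery;
    private ReopenableReader reader;
    private int requestsSinceCheck;

    ReaderHolder(ReopenableReader initial, int checkEvery) {
        this.reader = initial;
        this.checkEvery = checkEvery;
    }

    synchronized ReopenableReader acquire() {
        requestsSinceCheck++;
        if (requestsSinceCheck >= checkEvery) {
            requestsSinceCheck = 0;
            if (!reader.isCurrent()) {
                ReopenableReader fresh = reader.reopen();
                if (fresh != reader) {
                    // Assumption: no search is still running on the old reader.
                    reader.close();
                    reader = fresh;
                }
            }
        }
        return reader;
    }
}

/** In-memory fake used only to exercise ReaderHolder. */
final class FakeReader implements ReopenableReader {
    static long indexVersion = 0;        // bumped to simulate a writer commit
    final long openedAt = indexVersion;  // version this reader was opened against
    boolean closed = false;

    public boolean isCurrent() { return openedAt == indexVersion; }
    public ReopenableReader reopen() { return isCurrent() ? this : new FakeReader(); }
    public void close() { closed = true; }
}
```

The one design point worth noting is the close of the old reader: the sketch simply assumes that nothing is still searching on it, which a multi-threaded application would have to guarantee by some other means.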
Is this easily reproduced? If so, if you could narrow it down to a
sequence of added documents, that'd be awesome.
It's very strange that you see the corruption go away. Can you run
CheckIndex (java org.apache.lucene.index.CheckIndex <indexDir>) to
see if it detects any corruption? In fact, if you could run
CheckIndex after each session of IndexWriter to isolate which batch
of added documents causes the corruption, that could help us narrow
it down.
Are you changing any of the settings in IndexWriter? Are you using
multiple threads? Which exact JRE version and OS are you using?
Are you creating a new index at the start of each run?
Mike
Sascha Fahl wrote:
Hi,
I see some strange behaviour in Lucene. The scenario is as follows.
While adding documents to my index (every doc is pretty small, the
doc count is about 12,000) I have implemented custom behaviour for
flushing and committing documents to the index. Before adding
documents to the index I check whether the ramDocCount has
reached a certain number or whether the last commit was a while ago.
If so, I flush the buffered documents and reopen the IndexWriter. So
far, so good. Indexing works very well. The problem appears when I
send requests with the IndexReader while writing documents with
the IndexWriter (I send around 10,000 requests to Lucene). I reopen
the IndexReader every 100 requests (only for testing) if the
IndexReader is not current. The first 4,000 or so requests work
very well, but afterwards I always get the following exception:
java.lang.ArrayIndexOutOfBoundsException: 37389
at org.apache.lucene.search.TermScorer.score(TermScorer.java:126)
at org.apache.lucene.util.ScorerDocQueue.topScore(ScorerDocQueue.java:112)
at org.apache.lucene.search.DisjunctionSumScorer.advanceAfterCurrent(DisjunctionSumScorer.java:172)
at org.apache.lucene.search.DisjunctionSumScorer.next(DisjunctionSumScorer.java:146)
at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:319)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:113)
at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:100)
at org.apache.lucene.search.Hits.<init>(Hits.java:67)
at org.apache.lucene.search.Searcher.search(Searcher.java:46)
at org.apache.lucene.search.Searcher.search(Searcher.java:38)
This seems to be a temporary problem, because after opening a new
IndexReader once all documents have been added, everything is OK
again and the 10,000 requests all succeed.
So what could be the problem here?
Regards,
sascha
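The flush rule described in the message above (flush when the buffered document count reaches a threshold, or when the last commit is too old) could be sketched roughly as follows. This is only an illustration of that rule: the class name FlushPolicy, its method names and both limits are hypothetical and do not come from the thread or from Lucene.

```java
/** Hypothetical helper: decides when buffered documents should be flushed. */
class FlushPolicy {
    private final int maxBufferedDocs;      // assumed threshold for ramDocCount
    private final long maxCommitAgeMillis;  // assumed maximum time between commits
    private long lastCommitMillis;

    FlushPolicy(int maxBufferedDocs, long maxCommitAgeMillis, long nowMillis) {
        this.maxBufferedDocs = maxBufferedDocs;
        this.maxCommitAgeMillis = maxCommitAgeMillis;
        this.lastCommitMillis = nowMillis;
    }

    /** True if the buffer is full or the last commit is too old. */
    boolean shouldFlush(int ramDocCount, long nowMillis) {
        return ramDocCount >= maxBufferedDocs
            || nowMillis - lastCommitMillis >= maxCommitAgeMillis;
    }

    /** Records that a flush/commit has just happened. */
    void markCommitted(long nowMillis) {
        lastCommitMillis = nowMillis;
    }
}
```

A caller would ask shouldFlush(...) before each added document and call markCommitted(...) after flushing. Passing the clock value in explicitly keeps the rule easy to test in isolation.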
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]