I don't see an attachment here -- maybe the mailing list software
stripped it off. If so can you send directly to me? Thanks.
Mike
Ian Lea wrote:
Documents are biblio records. All have title, author etc. stored,
some have a few extra fields as well. Typically around 25 fields per
doc. The index is created with compound format, everything else as
default.
I've rerun the job until failure. Different numbers this time, but
basically the same exception. In the failing pass it was trying to
load 100K docs.
Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
As mentioned, the index is loaded in chunks. The infostream from the
failing pass is attached. The infostreams from all 19 or so runs leading
up to the failure are available as well if that would help.
Running with -ea doesn't seem to have made any difference.
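One way to confirm that the -ea flag actually reached the JVM running the load is a small check like the one below. This is a minimal sketch: the class name AssertionCheck is hypothetical and not part of Lucene, and it relies only on the standard behaviour of the Java assert statement.

```java
// Hypothetical helper (not part of Lucene): reports whether the JVM
// was started with assertions enabled for this class.
class AssertionCheck {
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment below only executes when assertions are on (java -ea).
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Calling assertionsEnabled() from the loader at startup would show whether the flag was dropped somewhere, for example by a wrapper script.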
--
Ian.
On Tue, Mar 18, 2008 at 12:09 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
Can you call IndexWriter.setInfoStream(...) and get the error to
happen and post back the resulting output? And, turn on assertions
(java -ea) since that may catch the issue sooner.
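A rough sketch of one way to capture that output in a file, using only the JDK. The class name InfoStreamCapture, the method openLog and the file name are assumptions for illustration. The resulting PrintStream is what would be passed to IndexWriter.setInfoStream(...); that call is left as a comment so the example has no Lucene dependency.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical helper (not part of Lucene): opens a log file suitable
// for receiving IndexWriter's infoStream output.
class InfoStreamCapture {
    // Returns an auto-flushing PrintStream that appends to the given file,
    // so output written just before a crash is not lost in a buffer.
    static PrintStream openLog(String path) throws IOException {
        return new PrintStream(new FileOutputStream(path, true), true);
    }

    public static void main(String[] args) throws IOException {
        PrintStream log = openLog("infostream.log"); // file name is an assumption
        // In the real loader: writer.setInfoStream(log);
        log.println("infoStream capture started");
        log.close();
    }
}
```

Appending rather than truncating keeps the output of earlier passes when the index is loaded in several runs.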
Can you describe how you are setting up IndexWriter (autoCommit,
compound, etc.), and what your documents are like? Do your documents
have a fixed schema (same fields every time), or does it vary such that
some documents have no stored fields and some do?
Mike
Ian Lea wrote:
Hi
When bulk loading into a new index I'm seeing this exception
Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3093)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
when I use Java version 1.6.0_05-b13 or 1.6.0_04-b12 on Linux, with
Lucene 2.3.0, 2.3.1, or lucene-core-2.3-SNAPSHOT from yesterday.
With Java version 1.6.0_03-b05 things work fine.
The exception happens a few hundred thousand documents into the load.
A different program updating a different index with different data on
a different server gave a similar error on version 1.6.0_05-b13 and
lucene 2.3.0.
Any ideas? Is this maybe a known issue or am I missing something
obvious?
--
Ian.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]