I don't see an attachment here -- maybe the mailing list software stripped it off. If so, can you send it directly to me? Thanks.

Mike

Ian Lea wrote:

Documents are biblio records.  All have title, author etc. stored,
some have a few extra fields as well.  Typically around 25 fields per
doc.  The index is created with compound format, everything else as
default.

I've rerun the job until failure.  Different numbers this time, but
basically the same exception. In the failing pass it was trying to
load 100K docs.

Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
      at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
      at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
      at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
      at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
      at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)

As mentioned, the index is loaded in chunks - the infostream from the
failing pass is attached. The infostreams from all 19-odd runs leading
up to the failure are available as well if that would help.


Running with -ea doesn't seem to have made any difference.
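
One way to confirm that the -ea flag actually reached the JVM running the load, and to record exactly which JVM build that was, is a small standalone check like the sketch below. It uses only the standard library; the class name JvmDiagnostics and its method names are made up for illustration and are not part of Lucene.

```java
public class JvmDiagnostics {

    /** Returns true only if assertions are enabled for this class (e.g. via -ea). */
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert is executed only when assertions are on.
        assert enabled = true;
        return enabled;
    }

    /** Builds a one-line description of the running JVM from standard system properties. */
    public static String describeJvm() {
        return "java.version=" + System.getProperty("java.version")
            + " java.vm.version=" + System.getProperty("java.vm.version")
            + " os.name=" + System.getProperty("os.name");
    }

    public static void main(String[] args) {
        System.out.println(describeJvm());
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Run it with the same launcher and flags as the indexing job, for example `java -ea JvmDiagnostics`. If it prints `assertions enabled: false`, the flag is not being passed through to the JVM.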


--
Ian.


On Tue, Mar 18, 2008 at 12:09 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:

 Can you call IndexWriter.setInfoStream(...) and get the error to
 happen and post back the resulting output?  And, turn on assertions
 (java -ea) since that may catch the issue sooner.

 Can you describe how you are setting up IndexWriter (autoCommit,
 compound, etc.), and what your documents are like? Do your documents
 have a fixed schema (same fields every time), or does it vary such that
 some documents have no stored fields and some do?

 Mike



 Ian Lea wrote:

Hi


When bulk loading into a new index I'm seeing this exception

Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
      at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
      at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
      at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
      at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3093)
      at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)

This happens with Java version 1.6.0_05-b13 or 1.6.0_04-b12 on Linux, with
Lucene 2.3.0 or 2.3.1 or lucene-core-2.3-SNAPSHOT from yesterday.

With Java version 1.6.0_03-b05 things work fine.

The exception happens a few hundred thousand documents into the load.

A different program updating a different index with different data on
a different server gave a similar error on version 1.6.0_05-b13 and
lucene 2.3.0.


Any ideas?  Is this maybe a known issue or am I missing something
obvious?



--
Ian.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


