Just to bring closure here: as best we can tell, this looks like a JVM HotSpot compiler issue.

Running java with -Xbatch (which disables background compilation, so methods are compiled in the foreground) works around the issue.
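
To confirm the workaround is actually in effect for a given process, something like the following could be run at startup. This is only a sketch, not part of the Lucene change: the class name JvmFlagCheck and the helper hasFlag are hypothetical, and it assumes the flag shows up verbatim in the arguments the JVM reports about itself.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not part of Lucene): reports whether the running JVM
// was started with a given command-line flag such as -Xbatch.
final class JvmFlagCheck {

    /** Returns true if the flag appears verbatim in the given JVM arguments. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // These are the arguments passed to the JVM itself,
        // not the arguments passed to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("java.version    = " + System.getProperty("java.version"));
        System.out.println("-Xbatch present = " + hasFlag(jvmArgs, "-Xbatch"));
    }
}
```

Running it as "java -Xbatch JvmFlagCheck" should report the flag as present; without the flag it should report false.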

I've committed some additional assertions to the Lucene code that was tripping this issue (the merging of stored fields); they will be in 2.3.2 and 2.4.

Mike

Ian Lea wrote:
Mike


The latest patch produces a similar exception:

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after mergeFields: fdx size mismatch: 65184 docs vs 521464 length in bytes of _c9.fdx
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:320)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:297)
Caused by: java.lang.AssertionError: after mergeFields: fdx size mismatch: 65184 docs vs 521464 length in bytes of _c9.fdx
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:347)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3852)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3504)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:211)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:266)

Latest infostream attached.


--
Ian.


On Tue, Mar 18, 2008 at 6:05 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:

 Hi Ian,

 Sheesh that's odd.  The SegmentMerger produced an .fdx file that is
 one document too short.
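
 The "one document too short" reading can be checked against the numbers in the assertion message. The sketch below assumes the .fdx file holds exactly 8 bytes per document with no header, which is an assumption that happens to match the reported figures; the class and method names are hypothetical.

```java
// Hypothetical arithmetic check, not Lucene code.
// Assumption: the .fdx file stores one 8-byte pointer per document
// and nothing else (no header).
final class FdxSizeCheck {

    static final long BYTES_PER_DOC = 8L;

    /** File length in bytes expected for the given number of documents. */
    static long expectedLength(long docCount) {
        return docCount * BYTES_PER_DOC;
    }

    /** Number of whole document entries that fit in a file of this length. */
    static long docsInFile(long fileLength) {
        return fileLength / BYTES_PER_DOC;
    }

    public static void main(String[] args) {
        long docCount = 72357L;    // "72357 docs" from the assertion message
        long fileLength = 578848L; // "578848 length in bytes of _3o.fdx"

        System.out.println("expected bytes: " + expectedLength(docCount));               // 578856
        System.out.println("docs in file:   " + docsInFile(fileLength));                 // 72356
        System.out.println("docs missing:   " + (docCount - docsInFile(fileLength)));    // 1
    }
}
```

 Under that assumption the file is exactly 8 bytes, i.e. one entry, shorter than the doc count requires.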

 Can you run with this patch now, again applied to head of 2.3
 branch?  I just added another assert inside the loop that does the
 field merging.

 I will scrutinize this code...

 Mike




 Ian Lea wrote:
Mike


Patch applied and test re-run and picked up an assertion error this
time:

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after mergeFields: fdx size mismatch: 72357 docs vs 578848 length in bytes of _3o.fdx
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:320)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:297)
Caused by: java.lang.AssertionError: after mergeFields: fdx size mismatch: 72357 docs vs 578848 length in bytes of _3o.fdx
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:342)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3852)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3504)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:211)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:266)

The infostream output is attached. Since this email is to you and the
list it should make it to you.



Yonik: I haven't been able to make TestStressIndexing2 fail.


--
Ian.


On Tue, Mar 18, 2008 at 4:19 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:

 Ian,

 Could you apply the attached patch to the head of the 2.3
 branch?

 It only adds more asserts, to try to pinpoint where exactly this
 corruption starts.

Then, re-run the test with asserts enabled and infoStream turned on
 and post back.  Thanks.
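
 Since the added asserts do nothing unless the JVM runs with assertions enabled (-ea), it may be worth verifying that first. A minimal sketch, with a hypothetical class name; it relies only on the standard Java idiom of an assignment inside an assert, and it does not cover turning on infoStream, which is set on the IndexWriter.

```java
// Hypothetical helper, not part of Lucene or the patch.
final class AssertionCheck {

    /** Returns true only when this class is running with assertions enabled. */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment below is evaluated only when assertions are on (-ea);
        // with assertions off, the whole statement is skipped.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

 Run with "java -ea AssertionCheck" it should print true; without -ea it should print false, in which case the new asserts in the patch would never fire.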

 Mike




 Ian Lea wrote:

It's failed on servers running SuSE 10.0 and 8.2 (ancient!)

$ uname -a shows
Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux

and

Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 unknown unknown GNU/Linux

The first one has a 2.8GHz Intel CPU; I don't know about the second.


I'll try and run the stress test.


--
Ian.



On Tue, Mar 18, 2008 at 2:17 PM, Yonik Seeley <[EMAIL PROTECTED]>
wrote:

On Tue, Mar 18, 2008 at 7:38 AM, Ian Lea <[EMAIL PROTECTED]> wrote:
Hi


 When bulk loading into a new index I'm seeing this exception

 Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
 Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3093)
         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)

 when using Java version 1.6.0_05-b13 or 1.6.0_04-b12 on Linux, with Lucene 2.3.0, 2.3.1, or lucene-core-2.3-SNAPSHOT from yesterday.

 With java version 1.6.0_03-b05 things work fine.

 The exception happens a few hundred thousand documents into the load.

 A different program updating a different index with different data on
 a different server gave a similar error on version 1.6.0_05-b13 and
 lucene 2.3.0.

 Any ideas?  Is this maybe a known issue or am I missing something obvious?

 My guess is perhaps a thread safety bug, more likely in Lucene
 indexing code (less likely in the JVM or specific libc).

 What Linux version are you using?
 What hardware are you running on (specifically, the CPU)?

 If possible, it would be great if you could check out Lucene trunk,
 crank up the iterations by modifying the TestStressIndexing2 and maybe
 fiddle with some of the other parameters in
 TestStressIndexing2.testMultiConfig(), and see if you can get it to
 fail.


 -Yonik


---------------------------------------------------------------------


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: java-user- [EMAIL PROTECTED]







<infostream.zip>



<infostream.zip>

