Hi,

  We ran into the same issue (a corrupted index) using Lucene 2.4.0.
There was no outage or system reboot, so I'm not sure how the index
could have become corrupted. Here is the exception:

Caused by: java.io.IOException: background merge hit exception:
_io5:c66777491 _nh9:c10656736 _taq:c2021563 _s8m:c1421051
_uh5:c2065961 _r0y:c1124653 _s4s:c2477731 _t6w:c4340938 _ucx:c8018451
_xkb:c13842776 _xkd:c3394 _xke:c3379 _xkg:c1231 _xkh:c1252 _xkj:c1680
_xkk:c1689 into _xkl [optimize]
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2258)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2194)
        at com.packetmotion.manager.query.fulltext.index.Indexer.optimizeMasterIndex(Indexer.java:887)
        at com.packetmotion.manager.query.fulltext.index.Indexer.createNewIndexPartition(Indexer.java:783)
        at com.packetmotion.manager.query.fulltext.index.Indexer.indexByDevice(Indexer.java:582)
        at com.packetmotion.manager.query.fulltext.index.Indexer.indexData(Indexer.java:440)
        at com.packetmotion.manager.query.fulltext.index.Indexer$$FastClassByCGLIB$$97fb7e9b.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
        at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:628)
        at com.packetmotion.manager.query.fulltext.index.Indexer$$EnhancerByCGLIB$$ebbe3914.indexData(<generated>)
        at com.packetmotion.manager.query.fulltext.index.IndexerJob$1.run(IndexerJob.java:38)
        ... 8 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 51, Size: 26
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:265)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:185)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:729)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:359)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4226)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3877)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
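
For context, the optimize() at the top of that trace is just the plain
Lucene 2.4 IndexWriter.optimize() API called from our Indexer. A minimal,
purely illustrative sketch of that kind of call site (class name, directory
path and analyzer below are placeholders, not our real setup) looks like:

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    // Illustrative only: open the master index, optimize it, close it.
    public class OptimizeSketch {
        public static void main(String[] args) throws IOException {
            IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("/opt/ps_manager/apps/conf/index/MasterIndex"),
                new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);
            writer.optimize();  // the call that hits the background merge exception
            writer.close();
        }
    }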


I ran the CheckIndex tool with -fix to repair the corrupted index; the
invocation and its output are below.
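
For reference, the invocation is essentially the stock CheckIndex command
line (the jar name below is illustrative for a 2.4.0 setup; -fix is what
actually rewrites the segments file):

    java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex \
        /opt/ps_manager/apps/conf/index/MasterIndex -fix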


Opening index @ /opt/ps_manager/apps/conf/index/MasterIndex

Segments file=segments_1587 numSegments=18 version=FORMAT_HAS_PROX [Lucene 2.4]
  1 of 18: name=_io5 docCount=66777491
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=6,680.574
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [368415 terms; 1761057989 terms/docs pairs; 2095636359 tokens]
    test: stored fields.......OK [66777491 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  2 of 18: name=_nh9 docCount=10656736
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=1,058.869
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [118964 terms; 278176718 terms/docs pairs; 329825350 tokens]
    test: stored fields.......OK [10656736 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  3 of 18: name=_taq docCount=2021563
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=208.544
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [145494 terms; 54131697 terms/docs pairs; 65411701 tokens]
    test: stored fields.......OK [2021563 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  4 of 18: name=_s8m docCount=1421051
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=162.443
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [210276 terms; 42491363 terms/docs pairs; 53054214 tokens]
    test: stored fields.......OK [1421051 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  5 of 18: name=_uh5 docCount=2065961
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=229.394
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [126789 terms; 60416663 terms/docs pairs; 75120265 tokens]
    test: stored fields.......OK [2065964 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  6 of 18: name=_r0y docCount=1124653
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=115.792
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: term desthost:wir docFreq=6 != num docs seen 0 + num docs deleted 0
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:475)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:676)

  7 of 18: name=_s4s docCount=2477731
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=230.698
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [90327 terms; 60212096 terms/docs pairs; 72740383 tokens]
    test: stored fields.......OK [2477731 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  8 of 18: name=_t6w docCount=4340938
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=451.389
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [157817 terms; 121753990 terms/docs pairs; 151141568 tokens]
    test: stored fields.......OK [4340938 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  9 of 18: name=_ucx docCount=8018451
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=845.968
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [410057 terms; 227617455 terms/docs pairs; 283398975 tokens]
    test: stored fields.......OK [8018451 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  10 of 18: name=_xl9 docCount=13891933
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=1,408.881
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [535226 terms; 376394493 terms/docs pairs; 465003060 tokens]
    test: stored fields.......OK [13891933 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  11 of 18: name=_xlb docCount=3521
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.411
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [2070 terms; 108496 terms/docs pairs; 136365 tokens]
    test: stored fields.......OK [3521 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  12 of 18: name=_xlc docCount=3529
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.412
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [22 fields]
    test: terms, freq, prox...OK [2214 terms; 108633 terms/docs pairs; 136384 tokens]
    test: stored fields.......OK [3529 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  13 of 18: name=_xle docCount=1401
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.188
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [1854 terms; 48703 terms/docs pairs; 62221 tokens]
    test: stored fields.......OK [1401 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  14 of 18: name=_xlf docCount=1399
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.188
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [1763 terms; 48910 terms/docs pairs; 63035 tokens]
    test: stored fields.......OK [1399 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  15 of 18: name=_xlh docCount=1727
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.24
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2318 terms; 62596 terms/docs pairs; 80688 tokens]
    test: stored fields.......OK [1727 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  16 of 18: name=_xli docCount=1716
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.237
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2264 terms; 61867 terms/docs pairs; 79497 tokens]
    test: stored fields.......OK [1716 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  17 of 18: name=_xlk docCount=2921
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.364
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [2077 terms; 96536 terms/docs pairs; 123166 tokens]
    test: stored fields.......OK [2921 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  18 of 18: name=_xll docCount=3876
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.476
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2261 terms; 130104 terms/docs pairs; 166867 tokens]
    test: stored fields.......OK [3876 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

WARNING: 1 broken segments (containing 1124653 documents) detected
WARNING: 1124653 documents will be lost

NOTE: will write new segments file in 5 seconds; this will remove 1124653 docs from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
  5...
  4...
  3...
  2...
  1...
Writing...
OK
Wrote new segments file "segments_1588"


Any ideas on how the index could have gotten corrupted?

Thanks,
-vivek


On Sat, Jan 3, 2009 at 8:04 AM, Brian Whitman <br...@echonest.com> wrote:
>>
>>
>> It's very strange that CheckIndex -fix did not resolve the issue.  After
>> fixing it, if you re-run CheckIndex on the index do you still see that
>> original one broken segment present?  CheckIndex should have removed
>> reference to that one segment.
>>
>
> I just ran it again, and it detected the same error and claimed to fix it. I
> then shut down the solr server (I wasn't sure if this would be an issue),
> ran it a third time (where it again found and claimed to fix the error),
> then a fourth where it did not find any problems, and now the optimize()
> call on the running server does not throw the merge exception.
>
>> Did this corruption happen only once?  (You mentioned hitting dups in
>> the past... but did you also see corruption too?)
>
>
> Not that we know of, but it's very likely we never noticed. (The only reason
> I discovered this was our commits were taking 20-40x longer on this index
> than others)
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
