The exception you hit during optimize indicates corruption in the
stored fields file (_X.fdt).  The exception hit during CheckIndex was
different -- the postings entry (_r0y.frq) for the term "desthost:wir"
claimed a docFreq of 6 but yielded 0 documents.

Did "CheckIndex -fix" allow you to optimize successfully?  In which
case, both of these different corruptions were in the same segment
(_r0y) that CheckIndex removed.
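(For anyone following along, this is roughly how the tool is invoked; the
jar name and index path below are placeholders, not taken from this thread.
Run without -fix first, since -fix permanently drops broken segments:)

```shell
# Dry run: report corruption without modifying the index.
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /path/to/index

# Only after reviewing the report: rewrite the segments file, removing
# any broken segments (the documents they contain are lost for good).
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /path/to/index -fix
```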

Was this index fully created with 2.4.0 or a prior release?

Which exact JRE was used to create/write to this index?  And which one
was used for searching it?

How often do you hit this?  Is it repeatable?

It's remotely possible you have bad hardware (RAM, hard drive).

Most likely it's gone... but if somehow I could get access to that bad
segment, I could try to dig.

vivek sar wrote:

Hi,

 We ran into the same issue (corrupted index) using Lucene 2.4.0.
There was no outage or system reboot - not sure how could it get
corrupted. Here is the exception,

Caused by: java.io.IOException: background merge hit exception:
_io5:c66777491 _nh9:c10656736 _taq:c2021563 _s8m:c1421051
_uh5:c2065961 _r0y:c1124653 _s4s:c2477731 _t6w:c4340938 _ucx:c8018451
_xkb:c13842776 _xkd:c3394 _xke:c3379 _xkg:c1231 _xkh:c1252 _xkj:c1680
_xkk:c1689 into _xkl [optimize]
       at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2258)
       at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2194)
       at com.packetmotion.manager.query.fulltext.index.Indexer.optimizeMasterIndex(Indexer.java:887)
       at com.packetmotion.manager.query.fulltext.index.Indexer.createNewIndexPartition(Indexer.java:783)
       at com.packetmotion.manager.query.fulltext.index.Indexer.indexByDevice(Indexer.java:582)
       at com.packetmotion.manager.query.fulltext.index.Indexer.indexData(Indexer.java:440)
       at com.packetmotion.manager.query.fulltext.index.Indexer$$FastClassByCGLIB$$97fb7e9b.invoke(<generated>)
       at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
       at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:628)
       at com.packetmotion.manager.query.fulltext.index.Indexer$$EnhancerByCGLIB$$ebbe3914.indexData(<generated>)
       at com.packetmotion.manager.query.fulltext.index.IndexerJob$1.run(IndexerJob.java:38)
       ... 8 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 51, Size: 26
       at java.util.ArrayList.RangeCheck(ArrayList.java:547)
       at java.util.ArrayList.get(ArrayList.java:322)
       at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:265)
       at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:185)
       at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:729)
       at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:359)
       at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
       at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4226)
       at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3877)
       at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
       at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)


I ran the CheckIndex tool to fix the corrupted index. Here is its output:


Opening index @ /opt/ps_manager/apps/conf/index/MasterIndex

Segments file=segments_1587 numSegments=18 version=FORMAT_HAS_PROX [Lucene 2.4]
 1 of 18: name=_io5 docCount=66777491
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=6,680.574
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [30 fields]
   test: terms, freq, prox...OK [368415 terms; 1761057989 terms/docs
pairs; 2095636359 tokens]
   test: stored fields.......OK [66777491 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 2 of 18: name=_nh9 docCount=10656736
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=1,058.869
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [30 fields]
   test: terms, freq, prox...OK [118964 terms; 278176718 terms/docs
pairs; 329825350 tokens]
   test: stored fields.......OK [10656736 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 3 of 18: name=_taq docCount=2021563
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=208.544
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [30 fields]
   test: terms, freq, prox...OK [145494 terms; 54131697 terms/docs
pairs; 65411701 tokens]
   test: stored fields.......OK [2021563 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 4 of 18: name=_s8m docCount=1421051
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=162.443
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [30 fields]
   test: terms, freq, prox...OK [210276 terms; 42491363 terms/docs
pairs; 53054214 tokens]
   test: stored fields.......OK [1421051 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 5 of 18: name=_uh5 docCount=2065961
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=229.394
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [30 fields]
   test: terms, freq, prox...OK [126789 terms; 60416663 terms/docs
pairs; 75120265 tokens]
   test: stored fields.......OK [2065964 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 6 of 18: name=_r0y docCount=1124653
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=115.792
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [26 fields]
   test: terms, freq, prox...FAILED
WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: term desthost:wir docFreq=6 != num docs
seen 0 + num docs deleted 0
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:475)
       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:676)

 7 of 18: name=_s4s docCount=2477731
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=230.698
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [25 fields]
   test: terms, freq, prox...OK [90327 terms; 60212096 terms/docs
pairs; 72740383 tokens]
   test: stored fields.......OK [2477731 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 8 of 18: name=_t6w docCount=4340938
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=451.389
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [26 fields]
   test: terms, freq, prox...OK [157817 terms; 121753990 terms/docs
pairs; 151141568 tokens]
   test: stored fields.......OK [4340938 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 9 of 18: name=_ucx docCount=8018451
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=845.968
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [26 fields]
   test: terms, freq, prox...OK [410057 terms; 227617455 terms/docs
pairs; 283398975 tokens]
   test: stored fields.......OK [8018451 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 10 of 18: name=_xl9 docCount=13891933
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=1,408.881
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [26 fields]
   test: terms, freq, prox...OK [535226 terms; 376394493 terms/docs
pairs; 465003060 tokens]
   test: stored fields.......OK [13891933 total field count; avg 1
fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 11 of 18: name=_xlb docCount=3521
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.411
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [25 fields]
   test: terms, freq, prox...OK [2070 terms; 108496 terms/docs pairs;
136365 tokens]
test: stored fields.......OK [3521 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 12 of 18: name=_xlc docCount=3529
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.412
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [22 fields]
   test: terms, freq, prox...OK [2214 terms; 108633 terms/docs pairs;
136384 tokens]
test: stored fields.......OK [3529 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 13 of 18: name=_xle docCount=1401
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.188
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [21 fields]
   test: terms, freq, prox...OK [1854 terms; 48703 terms/docs pairs;
62221 tokens]
test: stored fields.......OK [1401 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 14 of 18: name=_xlf docCount=1399
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.188
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [21 fields]
   test: terms, freq, prox...OK [1763 terms; 48910 terms/docs pairs;
63035 tokens]
test: stored fields.......OK [1399 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 15 of 18: name=_xlh docCount=1727
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.24
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [21 fields]
   test: terms, freq, prox...OK [2318 terms; 62596 terms/docs pairs;
80688 tokens]
test: stored fields.......OK [1727 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 16 of 18: name=_xli docCount=1716
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.237
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [21 fields]
   test: terms, freq, prox...OK [2264 terms; 61867 terms/docs pairs;
79497 tokens]
test: stored fields.......OK [1716 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 17 of 18: name=_xlk docCount=2921
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.364
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [25 fields]
   test: terms, freq, prox...OK [2077 terms; 96536 terms/docs pairs;
123166 tokens]
test: stored fields.......OK [2921 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

 18 of 18: name=_xll docCount=3876
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=0.476
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [21 fields]
   test: terms, freq, prox...OK [2261 terms; 130104 terms/docs pairs;
166867 tokens]
test: stored fields.......OK [3876 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]

WARNING: 1 broken segments (containing 1124653 documents) detected
WARNING: 1124653 documents will be lost

NOTE: will write new segments file in 5 seconds; this will remove
1124653 docs from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
 5...
 4...
 3...
 2...
 1...
Writing...
OK
Wrote new segments file "segments_1588"


Any ideas how would the index get corrupted?

Thanks,
-vivek


On Sat, Jan 3, 2009 at 8:04 AM, Brian Whitman <br...@echonest.com> wrote:


It's very strange that CheckIndex -fix did not resolve the issue. After
fixing it, if you re-run CheckIndex on the index, do you still see that
original broken segment?  CheckIndex should have removed the reference
to that segment.


I just ran it again, and it detected the same error and claimed to fix it. I
then shut down the Solr server (I wasn't sure if this would be an issue) and
ran it a third time, where it again found and claimed to fix the error. On a
fourth run it did not find any problems, and now the optimize()
call on the running server does not throw the merge exception.
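(For context, the failing call boils down to something like the sketch
below -- the directory path and analyzer are placeholders, not the actual
application code from this thread. The notable detail is that the root
cause of a "background merge hit exception" only surfaces via getCause()
on the IOException that optimize() throws:)

```java
import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class OptimizeCheck {
    public static void main(String[] args) throws IOException {
        // Placeholder path -- substitute your own index directory.
        FSDirectory dir = FSDirectory.getDirectory(new File("/path/to/index"));
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            // Merges all segments into one; latent corruption in any
            // segment tends to surface here, during the merge.
            writer.optimize();
        } catch (IOException e) {
            // The wrapped cause (e.g. IndexOutOfBoundsException from
            // FieldsReader) identifies what actually went wrong.
            System.err.println("optimize failed, root cause: " + e.getCause());
            throw e;
        } finally {
            writer.close();
        }
    }
}
```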

Did this corruption happen only once?  (You mentioned hitting dups in
the past... but did you also see corruption then?)


Not that we know of, but it's very likely we never noticed. (The only reason
I discovered this was that our commits were taking 20-40x longer on this
index than on others.)


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


