OK, given that you're seeing non-deterministic results on the same index... I think this is likely a hardware issue or a JRE bug?
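For readers following along: the two failure shapes in this thread ("doc X <= lastDoc X" and "docFreq=N != num docs seen S + num docs deleted D") are violations of two simple postings invariants. The sketch below is a hypothetical, self-contained illustration of those invariants, not Lucene's actual CheckIndex code:

```java
// Hypothetical sketch of the two invariants behind the CheckIndex errors
// in this thread. This is illustrative only, not Lucene's implementation.
public class TermPostingsCheck {

    // Invariant 1: doc IDs in a term's postings list must be strictly
    // increasing. A violation produces "doc X <= lastDoc Y".
    public static boolean docsStrictlyIncreasing(int[] postings) {
        int lastDoc = -1;
        for (int doc : postings) {
            if (doc <= lastDoc) {
                return false; // corrupt: doc <= lastDoc
            }
            lastDoc = doc;
        }
        return true;
    }

    // Invariant 2: the term's stored docFreq must equal the live docs
    // actually seen plus the docs skipped because they are deleted.
    // A violation produces "docFreq=N != num docs seen S + num docs deleted D".
    public static boolean docFreqConsistent(int docFreq, int docsSeen, int docsDeleted) {
        return docFreq == docsSeen + docsDeleted;
    }

    public static void main(String[] args) {
        // Healthy postings list.
        System.out.println(docsStrictlyIncreasing(new int[] {3, 17, 42}));
        // The corruption pattern from the thread: a repeated doc ID.
        System.out.println(docsStrictlyIncreasing(new int[] {324697, 324697}));
        // The thread's other pattern: docFreq=1 but 0 seen + 0 deleted.
        System.out.println(docFreqConsistent(1, 0, 0));
    }
}
```

A random bit flip in the postings or in the .del file can break either invariant, and a different flip breaks it in a different place, which is consistent with the same index failing differently on each run.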
If you move that index over to another env and run CheckIndex, is it
consistent?

Mike

On Fri, Jan 14, 2011 at 9:00 AM, Stéphane Delprat
<stephane.delp...@blogspirit.com> wrote:
> So I ran CheckIndex (without -fix) 5 times in a row:
>
> Solr was running, but no client was connected to it (just the slave, which
> was synchronizing every 5 minutes).
>
> Summary:
>
> 1: all good
> 2: 2 errors: (seg 1 & 2) terms, freq, prox...ERROR [term blog_id:104150: doc
> 324697 <= lastDoc 324697] & terms, freq, prox...ERROR [term
> blog_keywords:SPORT: doc 174808 <= lastDoc 174808]
> 3: 1 error: (seg 2) terms, freq, prox...ERROR [Index: 105, Size: 51]
> 4: all good
> 5: 1 error: (seg 7) terms, freq, prox...ERROR [term blog_comments: %X
> docFreq=1 != num docs seen 0 + num docs deleted 0]
>
> Seems to me that some random things are happening here.
>
> The file system is ext3, on a physical server.
>
> Here are the logs of the interesting segments:
>
> ** 1 **
>
>   1 of 17: name=_nqt docCount=431889
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=1,671.375
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_nqt_1y2.del]
>     test: open reader.........OK [41918 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs pairs;
> 59357374 tokens]
>     test: stored fields.......OK [11505678 total field count; avg 29.504
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   2 of 17: name=_ol7 docCount=913886
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=3,567.739
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true,
> lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_ol7_1mc.del]
>     test: open reader.........OK [74076 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs pairs;
> 132337348 tokens]
>     test: stored fields.......OK [26933113 total field count; avg 32.07
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
> ** 2 **
>
>   1 of 17: name=_nqt docCount=431889
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=1,671.375
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_nqt_1y2.del]
>     test: open reader.........OK [41918 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...ERROR [term blog_id:104150: doc 324697 <=
> lastDoc 324697]
> java.lang.RuntimeException: term blog_id:104150: doc 324697 <= lastDoc
> 324697
>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>     test: stored fields.......OK [11505678 total field count; avg 29.504
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>     FAILED
>     WARNING: fixIndex() would remove reference to this segment; full
> exception:
> java.lang.RuntimeException: Term Index test failed
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>
>   2 of 17: name=_ol7 docCount=913886
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=3,567.739
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_ol7_1mc.del]
>     test: open reader.........OK [74076 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...ERROR [term blog_keywords:SPORT: doc 174808 <=
> lastDoc 174808]
> java.lang.RuntimeException: term blog_keywords:SPORT: doc 174808 <= lastDoc
> 174808
>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>     test: stored fields.......OK [26933113 total field count; avg 32.07
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>     FAILED
>     WARNING: fixIndex() would remove reference to this segment; full
> exception:
> java.lang.RuntimeException: Term Index test failed
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>
> ** 3 **
>
>   1 of 17: name=_nqt docCount=431889
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=1,671.375
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_nqt_1y2.del]
>     test: open reader.........OK [41918 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [5211271 terms;
> 39824029 terms/docs pairs;
> 59357374 tokens]
>     test: stored fields.......OK [11505678 total field count; avg 29.504
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   2 of 17: name=_ol7 docCount=913886
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=3,567.739
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_ol7_1mc.del]
>     test: open reader.........OK [74076 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...ERROR [Index: 105, Size: 51]
> java.lang.IndexOutOfBoundsException: Index: 105, Size: 51
>       at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>       at java.util.ArrayList.get(ArrayList.java:322)
>       at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:288)
>       at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:277)
>       at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:86)
>       at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:631)
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>     test: stored fields.......OK [26933113 total field count; avg 32.07
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>     FAILED
>     WARNING: fixIndex() would remove reference to this segment; full
> exception:
> java.lang.RuntimeException: Term Index test failed
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>
> ** 4 **
>
>   1 of 17: name=_nqt
> docCount=431889
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=1,671.375
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_nqt_1y2.del]
>     test: open reader.........OK [41918 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs pairs;
> 59357374 tokens]
>     test: stored fields.......OK [11505678 total field count; avg 29.504
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   2 of 17: name=_ol7 docCount=913886
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=3,567.739
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_ol7_1mc.del]
>     test: open reader.........OK [74076 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs pairs;
> 132337349 tokens]
>     test: stored fields.......OK [26933113 total field count; avg 32.07
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   7 of 17: name=_qqy docCount=69109
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=468.861
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions
> [delFileName=_qqy_18.del]
>     test: open reader.........OK [22672 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [1431363 terms; 10684603 terms/docs pairs;
> 13097892 tokens]
>     test: stored fields.......OK [1688599 total field count; avg 36.363
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
> ** 5 **
>
>   1 of 17: name=_nqt docCount=431889
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=1,671.375
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_nqt_1y2.del]
>     test: open reader.........OK [41918 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs pairs;
> 59357374 tokens]
>     test: stored fields.......OK [11505678 total field count; avg 29.504
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   2 of 17: name=_ol7 docCount=913886
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=3,567.739
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_ol7_1mc.del]
>     test: open reader.........OK [74076 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs pairs;
> 132337348 tokens]
>     test: stored fields.......OK [26933113 total field count; avg 32.07
> fields per doc]
>     test: term
> vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>
>   7 of 17: name=_qqy docCount=69109
>     compound=false
>     hasProx=true
>     numFiles=9
>     size (MB)=468.861
>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
> java.vendor=Sun Microsystems Inc.}
>     has deletions [delFileName=_qqy_18.del]
>     test: open reader.........OK [22672 deleted docs]
>     test: fields..............OK [51 fields]
>     test: field norms.........OK [51 fields]
>     test: terms, freq, prox...ERROR [term blog_comments: %X docFreq=1 != num
> docs seen 0 + num docs deleted 0]
> java.lang.RuntimeException: term blog_comments: %X docFreq=1 != num docs
> seen 0 + num docs deleted 0
>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>     test: stored fields.......OK [1688599 total field count; avg 36.363
> fields per doc]
>     test: term vectors........OK [0 total vector count; avg 0 term/freq
> vector fields per doc]
>     FAILED
>     WARNING: fixIndex() would remove reference to this segment; full
> exception:
> java.lang.RuntimeException: Term Index test failed
>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>
>
> Thanks,
>
> On 14/01/2011 12:59, Michael McCandless wrote:
>>
>> Right, but removing a segment out from under a live IW (when you run
>> CheckIndex with -fix) is deadly, because that other IW doesn't know
>> you've removed the segment, and will later commit a new segment infos
>> still referencing that segment.
>>
>> The nature of this particular exception from CheckIndex is very
>> strange...
>> I think it can only be a bug in Lucene, a bug in the JRE, or
>> a hardware issue (bits are flipping somewhere).
>>
>> I don't think an error in the IO system can cause this particular
>> exception (it would cause others), because the deleted docs are loaded
>> up front when SegmentReader is init'd...
>>
>> This is why I'd really like to see if a given corrupt index always
>> hits precisely the same exception if you run CheckIndex more than
>> once.
>>
>> Mike
>>
>> On Thu, Jan 13, 2011 at 10:56 PM, Lance Norskog <goks...@gmail.com> wrote:
>>>
>>> 1) CheckIndex is not supposed to change a corrupt segment, only remove
>>> it.
>>> 2) Are you using local hard disks, or do you run on a common SAN or remote
>>> file server? I have seen corruption errors on SANs, where existing
>>> files have random changes.
>>>
>>> On Thu, Jan 13, 2011 at 11:06 AM, Michael McCandless
>>> <luc...@mikemccandless.com> wrote:
>>>>
>>>> Generally it's not safe to run CheckIndex if a writer is also open on
>>>> the index.
>>>>
>>>> It's not safe because CheckIndex could hit FNFEs on opening files,
>>>> or, if you use -fix, CheckIndex will change the index out from under
>>>> your other IndexWriter (which will then cause other kinds of
>>>> corruption).
>>>>
>>>> That said, I don't think the corruption that CheckIndex is detecting
>>>> in your index would be caused by having a writer open on the index.
>>>> Your first CheckIndex has a different deletes file (_phe_p3.del, with
>>>> 44824 deleted docs) than the 2nd time you ran it (_phe_p4.del, with
>>>> 44828 deleted docs), so it must somehow have to do with that change.
>>>>
>>>> One question: if you have a corrupt index, and run CheckIndex on it
>>>> several times in a row, does it always fail in the same way? (Ie, the
>>>> same term hits the below exception.)
>>>>
>>>> Is there any way I could get a copy of one of your corrupt cases? I
>>>> can then dig...
>>>>
>>>> Mike
>>>>
>>>> On Thu, Jan 13, 2011 at 10:52 AM, Stéphane Delprat
>>>> <stephane.delp...@blogspirit.com> wrote:
>>>>>
>>>>> I understand less and less what is happening to my Solr.
>>>>>
>>>>> I did a CheckIndex (without -fix) and there was an error...
>>>>>
>>>>> So I did another CheckIndex with -fix, and then the error was gone. The
>>>>> segment was all right.
>>>>>
>>>>> During CheckIndex I do not shut down the Solr server, I just make sure
>>>>> no client connects to the server.
>>>>>
>>>>> Should I shut down the Solr server during CheckIndex?
>>>>>
>>>>>
>>>>> First CheckIndex:
>>>>>
>>>>>   4 of 17: name=_phe docCount=264148
>>>>>     compound=false
>>>>>     hasProx=true
>>>>>     numFiles=9
>>>>>     size (MB)=928.977
>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>> os.version=2.6.26-2-amd64,
>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
>>>>> java.vendor=Sun Microsystems Inc.}
>>>>>     has deletions [delFileName=_phe_p3.del]
>>>>>     test: open reader.........OK [44824 deleted docs]
>>>>>     test: fields..............OK [51 fields]
>>>>>     test: field norms.........OK [51 fields]
>>>>>     test: terms, freq, prox...ERROR [term post_id:562 docFreq=1 != num
>>>>> docs seen 0 + num docs deleted 0]
>>>>> java.lang.RuntimeException: term post_id:562 docFreq=1 != num docs seen
>>>>> 0 + num docs deleted 0
>>>>>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
>>>>>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>>>>>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>     test: stored fields.......OK [7206878 total field count; avg 32.86
>>>>> fields per doc]
>>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>>> vector fields per doc]
>>>>>     FAILED
>>>>>     WARNING: fixIndex() would remove reference to this segment; full
>>>>> exception:
>>>>> java.lang.RuntimeException: Term Index test failed
>>>>>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>>>>>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>
>>>>>
>>>>> A few minutes later:
>>>>>
>>>>>   4 of 18: name=_phe docCount=264148
>>>>>     compound=false
>>>>>     hasProx=true
>>>>>     numFiles=9
>>>>>     size (MB)=928.977
>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>> os.version=2.6.26-2-amd64,
>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
>>>>> java.vendor=Sun Microsystems Inc.}
>>>>>     has deletions [delFileName=_phe_p4.del]
>>>>>     test: open reader.........OK [44828 deleted docs]
>>>>>     test: fields..............OK [51 fields]
>>>>>     test: field norms.........OK [51 fields]
>>>>>     test: terms, freq, prox...OK [3200899 terms; 26804334 terms/docs
>>>>> pairs; 28919124 tokens]
>>>>>     test: stored fields.......OK [7206764 total field count; avg 32.86
>>>>> fields per doc]
>>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>>> vector fields per doc]
>>>>>
>>>>>
>>>>> On 12/01/2011 16:50, Michael McCandless wrote:
>>>>>>
>>>>>> Curious... is it always a docFreq=1 != num docs seen 0 + num docs
>>>>>> deleted 0?
>>>>>>
>>>>>> It looks like new deletions were flushed against the segment (del file
>>>>>> changed from _ncc_22s.del to _ncc_24f.del).
>>>>>>
>>>>>> Are you hitting any exceptions during indexing?
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Wed, Jan 12, 2011 at 10:33 AM, Stéphane Delprat
>>>>>> <stephane.delp...@blogspirit.com> wrote:
>>>>>>>
>>>>>>> I got another corruption.
>>>>>>>
>>>>>>> It sure looks like it's the same type of error (on a different field).
>>>>>>>
>>>>>>> It's also not linked to a merge, since the segment size did not
>>>>>>> change.
>>>>>>>
>>>>>>> *** good segment:
>>>>>>>
>>>>>>>   1 of 9: name=_ncc docCount=1841685
>>>>>>>     compound=false
>>>>>>>     hasProx=true
>>>>>>>     numFiles=9
>>>>>>>     size (MB)=6,683.447
>>>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>>>> os.version=2.6.26-2-amd64,
>>>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 -
>>>>>>> 2010-06-06
>>>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
>>>>>>> java.vendor=Sun Microsystems Inc.}
>>>>>>>     has deletions [delFileName=_ncc_22s.del]
>>>>>>>     test: open reader.........OK [275881 deleted docs]
>>>>>>>     test: fields..............OK [51 fields]
>>>>>>>     test: field norms.........OK [51 fields]
>>>>>>>     test: terms, freq, prox...OK [17952652 terms; 174113812 terms/docs
>>>>>>> pairs; 204561440 tokens]
>>>>>>>     test: stored fields.......OK [45511958 total field count; avg
>>>>>>> 29.066 fields per doc]
>>>>>>>     test: term vectors........OK [0 total vector count; avg 0
>>>>>>> term/freq vector fields per doc]
>>>>>>>
>>>>>>>
>>>>>>> A few hours later:
>>>>>>>
>>>>>>> *** broken segment:
>>>>>>>
>>>>>>>   1 of 17: name=_ncc docCount=1841685
>>>>>>>     compound=false
>>>>>>>     hasProx=true
>>>>>>>     numFiles=9
>>>>>>>     size (MB)=6,683.447
>>>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>>>> os.version=2.6.26-2-amd64,
>>>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 -
>>>>>>> 2010-06-06
>>>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
>>>>>>> java.vendor=Sun Microsystems Inc.}
>>>>>>>     has deletions [delFileName=_ncc_24f.del]
>>>>>>>     test: open reader.........OK [278167 deleted docs]
>>>>>>>     test: fields..............OK [51 fields]
>>>>>>>     test: field norms.........OK [51 fields]
>>>>>>>     test: terms, freq, prox...ERROR [term post_id:1599104 docFreq=1 !=
>>>>>>> num docs seen 0 + num docs deleted 0]
>>>>>>> java.lang.RuntimeException: term post_id:1599104 docFreq=1 != num docs
>>>>>>> seen 0 + num docs
>>>>>>> deleted 0
>>>>>>>       at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
>>>>>>>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>>>>>>>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>>>     test: stored fields.......OK [45429565 total field count; avg
>>>>>>> 29.056 fields per doc]
>>>>>>>     test: term vectors........OK [0 total vector count; avg 0
>>>>>>> term/freq vector fields per doc]
>>>>>>>     FAILED
>>>>>>>     WARNING: fixIndex() would remove reference to this segment; full
>>>>>>> exception:
>>>>>>> java.lang.RuntimeException: Term Index test failed
>>>>>>>       at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>>>>>>>       at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>>>
>>>>>>>
>>>>>>> I'll activate infoStream for next time.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>> On 12/01/2011 00:49, Michael McCandless wrote:
>>>>>>>>
>>>>>>>> When you hit corruption, is it always this same problem?
>>>>>>>>
>>>>>>>> java.lang.RuntimeException: term source:margolisphil docFreq=1 !=
>>>>>>>> num docs seen 0 + num docs deleted 0
>>>>>>>>
>>>>>>>> Can you run with Lucene's IndexWriter infoStream turned on, and
>>>>>>>> catch the output leading to the corruption? If something is somehow
>>>>>>>> messing up the bits in the deletes file, that could cause this.
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>> On Mon, Jan 10, 2011 at 5:52 AM, Stéphane Delprat
>>>>>>>> <stephane.delp...@blogspirit.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We are using:
>>>>>>>>> Solr Specification Version: 1.4.1
>>>>>>>>> Solr Implementation Version: 1.4.1 955763M - mark - 2010-06-17 18:06:42
>>>>>>>>> Lucene Specification Version: 2.9.3
>>>>>>>>> Lucene Implementation Version: 2.9.3 951790 - 2010-06-06 01:30:55
>>>>>>>>>
>>>>>>>>> # java -version
>>>>>>>>> java version "1.6.0_20"
>>>>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
>>>>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
>>>>>>>>>
>>>>>>>>> We want to index 4M docs in one core (and when it works fine, we will
>>>>>>>>> add other cores with 2M on the same server). (1 doc ~= 1 kB)
>>>>>>>>>
>>>>>>>>> We use Solr replication every 5 minutes to update the slave server
>>>>>>>>> (queries are executed on the slave only).
>>>>>>>>>
>>>>>>>>> Documents are changing very quickly; during a normal day we will
>>>>>>>>> have approximately:
>>>>>>>>> * 200,000 updated docs
>>>>>>>>> * 1,000 new docs
>>>>>>>>> * 200 deleted docs
>>>>>>>>>
>>>>>>>>> I attached the last good CheckIndex output: solr20110107.txt
>>>>>>>>> And the corrupted one: solr20110110.txt
>>>>>>>>>
>>>>>>>>> This is not the first time a segment has gotten corrupted on this
>>>>>>>>> server; that's why I ran frequent CheckIndex runs. (But as you can
>>>>>>>>> see, the first segment is 1,800,000 docs and it works fine!)
>>>>>>>>>
>>>>>>>>> I can't find any "SEVERE", "FATAL" or "exception" in the Solr logs.
>>>>>>>>>
>>>>>>>>> I also attached my schema.xml and solrconfig.xml.
>>>>>>>>>
>>>>>>>>> Is there something wrong with what we are doing? Do you need other
>>>>>>>>> info?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Lance Norskog
>>> goks...@gmail.com
>>>
>>
>
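For anyone wanting to follow Mike's infoStream suggestion from this thread: in Solr 1.4 the example solrconfig.xml ships with a commented-out infoStream element under the indexDefaults section. Enabling it looks roughly like the sketch below (the file path is illustrative, and the exact element placement should be checked against your own solrconfig.xml):

```
<!-- Sketch: turn on IndexWriter's low-level debug log in Solr 1.4's
     solrconfig.xml, inside the <indexDefaults> section. The file path
     here is illustrative. -->
<indexDefaults>
  <infoStream file="INFOSTREAM.txt">true</infoStream>
</indexDefaults>
```

With this enabled, IndexWriter logs every flush, merge, and deletion commit, which is the output Mike asked for to trace what happens right before a segment goes corrupt.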