Without the optimize, it looks like there are errors on all segments except the first:
Opening index @ D:\mnsavs\lresumes1\lresumes1.luc\lresumes1.search.main.2 Segments file=segments_2 numSegments=3 version=FORMAT_DIAGNOSTICS [Lucene 2.9] 1 of 3: name=_0 docCount=413557 compound=false hasProx=true numFiles=8 size (MB)=1,148.795 diagnostics = {os.version=5.2, os=Windows 2003, lucene.version=2.9 exported - 2009-10-27 15:31:52, source=flush, os.arch=amd64, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.} docStoreOffset=0 docStoreSegment=_0 docStoreIsCompoundFile=false no deletions test: open reader.........OK test: fields..............OK [33 fields] test: field norms.........OK [33 fields] test: terms, freq, prox...OK [7704599 terms; 180318249 terms/docs pairs; 340258711 tokens] test: stored fields.......OK [1240671 total field count; avg 3 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 2 of 3: name=_1 docCount=359203 compound=false hasProx=true numFiles=8 size (MB)=1,125.103 diagnostics = {os.version=5.2, os=Windows 2003, lucene.version=2.9 exported - 2009-10-27 15:31:52, source=flush, os.arch=amd64, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.} docStoreOffset=413557 docStoreSegment=_0 docStoreIsCompoundFile=false no deletions test: open reader.........OK test: fields..............OK [33 fields] test: field norms.........OK [33 fields] test: terms, freq, prox...ERROR [term literals:cfid196$ docFreq=43 != num docs seen 4 + num docs deleted 0] java.lang.RuntimeException: term literals:cfid196$ docFreq=43 != num docs seen 4 + num docs deleted 0 at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) test: stored fields.......OK [1077609 total field count; avg 3 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) 3 of 3: name=_2 docCount=304659 compound=false hasProx=true numFiles=8 size (MB)=961.764 diagnostics = {os.version=5.2, os=Windows 2003, lucene.version=2.9 exported - 2009-10-27 15:31:52, source=flush, os.arch=amd64, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.} docStoreOffset=772760 docStoreSegment=_0 docStoreIsCompoundFile=false no deletions test: open reader.........OK test: fields..............OK [33 fields] test: field norms.........OK [33 fields] test: terms, freq, prox...ERROR [term contents:? docFreq=1 != num docs seen 245 + num docs deleted 0] java.lang.RuntimeException: term contents:? docFreq=1 != num docs seen 245 + num docs deleted 0 at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) test: stored fields.......OK [913977 total field count; avg 3 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) WARNING: 2 broken segments (containing 663862 documents) detected WARNING: would write new segments file, and 663862 documents would be lost, if -fix were specified Do the unit tests create multi-segment indexes? Peter On Tue, Oct 27, 2009 at 3:08 PM, Peter Keegan <peterlkee...@gmail.com>wrote: > It's reproducible with a large no. of docs (>1 million), but not with 100K > docs. > I got same error with jvm 1.6.0_16. > The index was optimized after all docs are added. I'll try removing the > optimize. > > Peter > > > On Tue, Oct 27, 2009 at 2:57 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: > >> This is odd -- is it reproducible? >> >> Can you narrow it down to a small set of docs that when indexed >> produce a corrupted index? >> >> If you attempt to optimize the index, does it fail? >> >> Mike >> >> On Tue, Oct 27, 2009 at 1:40 PM, Peter Keegan <peterlkee...@gmail.com> >> wrote: >> > It seems the index is corrupted immediately after the initial build >> (ample >> > disk space was provided): >> > >> > Output from CheckIndex: >> > >> > Opening index @ >> D:\mnsavs\lresumes1\lresumes1.luc\lresumes1.search.main.2 >> > >> > Segments file=segments_3 numSegments=1 version=FORMAT_DIAGNOSTICS >> [Lucene >> > 2.9] >> > 1 of 1: name=_7 docCount=1077025 >> > compound=false >> > hasProx=true >> > numFiles=8 >> > size (MB)=3,201.196 >> > diagnostics = {optimize=true, mergeFactor=7, os.version=5.2, >> os=Windows >> > 2003, mergeDocStores=false, lucene.version=2.9 exported - 2009-10-26 >> > 07:58:55, source=merge, os.arch=amd64, java.version=1.6.0_03, >> > java.vendor=Sun Microsystems Inc.} >> > docStoreOffset=0 >> > docStoreSegment=_0 >> > docStoreIsCompoundFile=false >> > no deletions >> > test: open reader.........OK >> > test: fields..............OK [33 fields] >> > test: field norms.........OK [33 fields] >> > test: terms, freq, prox...ERROR [term contents:? docFreq=1 != num >> docs >> > seen 482 + num docs deleted 0] >> > java.lang.RuntimeException: term contents:? docFreq=1 != num docs seen >> 482 + >> > num docs deleted 0 >> > at >> org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675) >> > at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530) >> > at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) >> > test: stored fields.......OK [3231075 total field count; avg 3 fields >> > per doc] >> > test: term vectors........OK [0 total vector count; avg 0 term/freq >> > vector fields per doc] >> > FAILED >> > WARNING: fixIndex() would remove reference to this segment; full >> > exception: >> > java.lang.RuntimeException: Term Index test failed >> > at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543) >> > at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903) >> > >> > WARNING: 1 broken segments (containing 1077025 documents) detected >> > WARNING: would write new segments file, and 1077025 documents would be >> lost, >> > if -fix were specified >> > >> > Searching on this index seems to be fine, though. >> > >> > Here is the IndexWriter log from the build: >> > >> > IFD [Indexer]: setInfoStream >> > >> deletionpolicy=org.apache.lucene.index.keeponlylastcommitdeletionpol...@2a9cfec1 >> > IW 0 [Indexer]: setInfoStream: >> > dir=org.apache.lucene.store.simplefsdirect...@d >> :\mnsavs\lresumes1\lresumes1.luc\lresumes1.update.main.2 >> > autoCommit=false >> > >> mergepolicy=org.apache.lucene.index.logbytesizemergepol...@291946c2mergescheduler >> =org.apache.lucene.index.concurrentmergeschedu...@3a747fa2rambuffersizemb >> =16.0 >> > maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 >> > maxFieldLength=2147483647 index= >> > IW 0 [Indexer]: setRAMBufferSizeMB 910.25 >> > IW 0 [Indexer]: setMaxBufferedDocs 1000000 >> > IW 0 [Indexer]: flush at getReader >> > IW 0 [Indexer]: flush: segment=null docStoreSegment=null >> docStoreOffset=0 >> > flushDocs=false flushDeletes=true flushDocStores=false numDocs=0 >> > numBufDelTerms=0 >> > IW 0 [Indexer]: index before flush >> > IW 0 [UpdWriterBuild]: DW: RAM: now flush @ usedMB=886.463 >> allocMB=886.463 >> > deletesMB=23.803 triggerMB=910.25 >> > IW 0 [UpdWriterBuild]: flush: segment=_0 docStoreSegment=_0 >> > docStoreOffset=0 flushDocs=true flushDeletes=false flushDocStores=false >> > numDocs=171638 numBufDelTerms=171638 >> > IW 0 [UpdWriterBuild]: index before flush >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _0 numDocs=171638 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=929523712 >> newFlushedSize=573198529 >> > docs/MB=313.985 new/old=61.666% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [1 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [1 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 1 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 1 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [UpdWriterBuild]: DW: RAM: now flush @ usedMB=857.977 >> allocMB=901.32 >> > deletesMB=52.274 triggerMB=910.25 >> > IW 0 [UpdWriterBuild]: flush: segment=_1 docStoreSegment=_0 >> > docStoreOffset=171638 flushDocs=true flushDeletes=false >> flushDocStores=false >> > numDocs=204995 numBufDelTerms=204995 >> > IW 0 [UpdWriterBuild]: index before flush _0:C171638->_0 >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _1 numDocs=204995 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=899653632 >> newFlushedSize=544283851 >> > docs/MB=394.928 new/old=60.499% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [2 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [2 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 2 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 2 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [UpdWriterBuild]: DW: RAM: now balance allocations: >> usedMB=834.645 vs >> > trigger=910.25 allocMB=901.32 deletesMB=75.627 vs trigger=955.762 >> > byteBlockFree=35.938 charBlockFree=8.938 >> > IW 0 [UpdWriterBuild]: DW: nothing to free >> > IW 0 [UpdWriterBuild]: DW: after free: freedMB=66.613 usedMB=910.272 >> > allocMB=834.707 >> > IW 0 [UpdWriterBuild]: flush: segment=_2 docStoreSegment=_0 >> > docStoreOffset=376633 flushDocs=true flushDeletes=false >> flushDocStores=false >> > numDocs=168236 numBufDelTerms=168236 >> > IW 0 [UpdWriterBuild]: index before flush _0:C171638->_0 >> _1:C204995->_0 >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _2 numDocs=168236 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=875188224 >> newFlushedSize=530720464 >> > docs/MB=332.394 new/old=60.641% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [3 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [3 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 3 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 3 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [UpdWriterBuild]: DW: RAM: now flush @ usedMB=814.282 >> allocMB=835.832 >> > deletesMB=95.997 triggerMB=910.25 >> > IW 0 [UpdWriterBuild]: flush: segment=_3 docStoreSegment=_0 >> > docStoreOffset=544869 flushDocs=true flushDeletes=false >> flushDocStores=false >> > numDocs=146894 numBufDelTerms=146894 >> > IW 0 [UpdWriterBuild]: index before flush _0:C171638->_0 >> _1:C204995->_0 >> > _2:C168236->_0 >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _3 numDocs=146894 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=853836800 >> newFlushedSize=522388771 >> > docs/MB=294.856 new/old=61.181% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [4 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [4 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 4 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 4 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [UpdWriterBuild]: DW: RAM: now flush @ usedMB=791.724 >> allocMB=835.832 >> > deletesMB=118.535 triggerMB=910.25 >> > IW 0 [UpdWriterBuild]: flush: segment=_4 docStoreSegment=_0 >> > docStoreOffset=691763 flushDocs=true flushDeletes=false >> flushDocStores=false >> > numDocs=162034 numBufDelTerms=162034 >> > IW 0 [UpdWriterBuild]: index before flush _0:C171638->_0 >> _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _4 numDocs=162034 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=830182400 >> newFlushedSize=498741034 >> > docs/MB=340.668 new/old=60.076% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [5 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [5 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 5 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 5 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [UpdWriterBuild]: DW: RAM: now balance allocations: >> usedMB=771.396 vs >> > trigger=910.25 allocMB=835.832 deletesMB=138.875 vs trigger=955.762 >> > byteBlockFree=39.688 charBlockFree=7.188 >> > IW 0 [UpdWriterBuild]: DW: nothing to free >> > IW 0 [UpdWriterBuild]: DW: after free: freedMB=64.374 usedMB=910.271 >> > allocMB=771.458 >> > IW 0 [UpdWriterBuild]: flush: segment=_5 docStoreSegment=_0 >> > docStoreOffset=853797 flushDocs=true flushDeletes=false >> flushDocStores=false >> > numDocs=146250 numBufDelTerms=146250 >> > IW 0 [UpdWriterBuild]: index before flush _0:C171638->_0 >> _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 >> > IW 0 [UpdWriterBuild]: DW: flush postings as segment _5 numDocs=146250 >> > IW 0 [UpdWriterBuild]: DW: oldRAMSize=808866816 >> newFlushedSize=485212402 >> > docs/MB=316.056 new/old=59.987% >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [6 segments ; isCommit >> = >> > false] >> > IFD [UpdWriterBuild]: now checkpoint "segments_1" [6 segments ; isCommit >> = >> > false] >> > IW 0 [UpdWriterBuild]: LMP: findMerges: 6 segments >> > IW 0 [UpdWriterBuild]: LMP: level 8.008305 to 8.758305: 6 segments >> > IW 0 [UpdWriterBuild]: CMS: now merge >> > IW 0 [UpdWriterBuild]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> > IW 0 [UpdWriterBuild]: CMS: no more merges pending; now return >> > IW 0 [Indexer]: commit: start >> > IW 0 [Indexer]: commit: now prepare >> > IW 0 [Indexer]: prepareCommit: flush >> > IW 0 [Indexer]: flush: segment=_6 docStoreSegment=_0 >> > docStoreOffset=1000047 flushDocs=true flushDeletes=true >> flushDocStores=true >> > numDocs=76978 numBufDelTerms=76978 >> > IW 0 [Indexer]: index before flush _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> > IW 0 [Indexer]: flush shared docStore segment _0 >> > IW 0 [Indexer]: DW: closeDocStore: 2 files to flush to segment _0 >> > numDocs=1077025 >> > IW 0 [Indexer]: DW: flush postings as segment _6 numDocs=76978 >> > IW 0 [Indexer]: DW: oldRAMSize=486968320 newFlushedSize=273168136 >> > docs/MB=295.486 new/old=56.096% >> > IFD [Indexer]: now checkpoint "segments_1" [7 segments ; isCommit = >> false] >> > IW 0 [Indexer]: DW: apply 1077025 buffered deleted terms and 0 deleted >> > docIDs and 0 deleted queries on 7 segments. >> > IFD [Indexer]: now checkpoint "segments_1" [7 segments ; isCommit = >> false] >> > IW 0 [Indexer]: LMP: findMerges: 7 segments >> > IW 0 [Indexer]: LMP: level 8.008305 to 8.758305: 7 segments >> > IW 0 [Indexer]: CMS: now merge >> > IW 0 [Indexer]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> _2:C168236->_0 >> > _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 _6:C76978->_0 >> > IW 0 [Indexer]: CMS: no more merges pending; now return >> > IW 0 [Indexer]: startCommit(): start sizeInBytes=0 >> > IW 0 [Indexer]: startCommit index=_0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > changeCount=21 >> > IW 0 [Indexer]: now sync _0.tis >> > IW 0 [Indexer]: now sync _5.prx >> > IW 0 [Indexer]: now sync _3.frq >> > IW 0 [Indexer]: now sync _3.tii >> > IW 0 [Indexer]: now sync _1.frq >> > IW 0 [Indexer]: now sync _6.frq >> > IW 0 [Indexer]: now sync _4.prx >> > IW 0 [Indexer]: now sync _4.fnm >> > IW 0 [Indexer]: now sync _2.tii >> > IW 0 [Indexer]: now sync _3.fnm >> > IW 0 [Indexer]: now sync _1.fnm >> > IW 0 [Indexer]: now sync _6.tis >> > IW 0 [Indexer]: now sync _4.frq >> > IW 0 [Indexer]: now sync _5.nrm >> > IW 0 [Indexer]: now sync _5.tis >> > IW 0 [Indexer]: now sync _1.tii >> > IW 0 [Indexer]: now sync _4.tis >> > IW 0 [Indexer]: now sync _0.prx >> > IW 0 [Indexer]: now sync _3.nrm >> > IW 0 [Indexer]: now sync _4.tii >> > IW 0 [Indexer]: now sync _0.nrm >> > IW 0 [Indexer]: now sync _5.fnm >> > IW 0 [Indexer]: now sync _1.tis >> > IW 0 [Indexer]: now sync _0.fnm >> > IW 0 [Indexer]: now sync _2.prx >> > IW 0 [Indexer]: now sync _6.tii >> > IW 0 [Indexer]: now sync _4.nrm >> > IW 0 [Indexer]: now sync _2.frq >> > IW 0 [Indexer]: now sync _5.frq >> > IW 0 [Indexer]: now sync _3.prx >> > IW 0 [Indexer]: now sync _5.tii >> > IW 0 [Indexer]: now sync _2.fnm >> > IW 0 [Indexer]: now sync _1.prx >> > IW 0 [Indexer]: now sync _2.tis >> > IW 0 [Indexer]: now sync _0.tii >> > IW 0 [Indexer]: now sync _6.prx >> > IW 0 [Indexer]: now sync _0.frq >> > IW 0 [Indexer]: now sync _6.fnm >> > IW 0 [Indexer]: now sync _0.fdx >> > IW 0 [Indexer]: now sync _6.nrm >> > IW 0 [Indexer]: now sync _0.fdt >> > IW 0 [Indexer]: now sync _1.nrm >> > IW 0 [Indexer]: now sync _2.nrm >> > IW 0 [Indexer]: now sync _3.tis >> > IW 0 [Indexer]: done all syncs >> > IW 0 [Indexer]: commit: pendingCommit != null >> > IW 0 [Indexer]: commit: wrote segments file "segments_2" >> > IFD [Indexer]: now checkpoint "segments_2" [7 segments ; isCommit = >> true] >> > IFD [Indexer]: deleteCommits: now decRef commit "segments_1" >> > IFD [Indexer]: delete "segments_1" >> > IW 0 [Indexer]: commit: done >> > IW 0 [Indexer]: optimize: index now _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > IW 0 [Indexer]: flush: segment=null docStoreSegment=_6 >> docStoreOffset=0 >> > flushDocs=false flushDeletes=true flushDocStores=false numDocs=0 >> > numBufDelTerms=0 >> > IW 0 [Indexer]: index before flush _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > IW 0 [Indexer]: add merge to pendingMerges: _0:C171638->_0 >> _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > [optimize] [total 1 pending] >> > IW 0 [Indexer]: CMS: now merge >> > IW 0 [Indexer]: CMS: index: _0:C171638->_0 _1:C204995->_0 >> _2:C168236->_0 >> > _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 _6:C76978->_0 >> > IW 0 [Indexer]: CMS: consider merge _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > into _7 [optimize] >> > IW 0 [Indexer]: CMS: launch new thread [Lucene Merge Thread #0] >> > IW 0 [Indexer]: CMS: no more merges pending; now return >> > IW 0 [Lucene Merge Thread #0]: CMS: merge thread: start >> > IW 0 [Lucene Merge Thread #0]: now merge >> > merge=_0:C171638->_0 _1:C204995->_0 _2:C168236->_0 _3:C146894->_0 >> > _4:C162034->_0 _5:C146250->_0 _6:C76978->_0 into _7 [optimize] >> > merge=org.apache.lucene.index.mergepolicy$oneme...@78688954 >> > index=_0:C171638->_0 _1:C204995->_0 _2:C168236->_0 _3:C146894->_0 >> > _4:C162034->_0 _5:C146250->_0 _6:C76978->_0 >> > IW 0 [Lucene Merge Thread #0]: merging _0:C171638->_0 _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > into _7 [optimize] >> > IW 0 [Lucene Merge Thread #0]: merge: total 1077025 docs >> > IW 0 [Lucene Merge Thread #0]: commitMerge: _0:C171638->_0 >> _1:C204995->_0 >> > _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 >> _6:C76978->_0 >> > into _7 [optimize] index=_0:C171638->_0 _1:C204995->_0 _2:C168236->_0 >> > _3:C146894->_0 _4:C162034->_0 _5:C146250->_0 _6:C76978->_0 >> > IW 0 [Lucene Merge Thread #0]: commitMergeDeletes _0:C171638->_0 >> > _1:C204995->_0 _2:C168236->_0 _3:C146894->_0 _4:C162034->_0 >> _5:C146250->_0 >> > _6:C76978->_0 into _7 [optimize] >> > IFD [Lucene Merge Thread #0]: now checkpoint "segments_2" [1 segments ; >> > isCommit = false] >> > IW 0 [Lucene Merge Thread #0]: CMS: merge thread: done >> > IW 0 [Indexer]: now flush at close >> > IW 0 [Indexer]: flush: segment=null docStoreSegment=_6 >> docStoreOffset=0 >> > flushDocs=false flushDeletes=true flushDocStores=true numDocs=0 >> > numBufDelTerms=0 >> > IW 0 [Indexer]: index before flush _7:C1077025->_0 >> > IW 0 [Indexer]: flush shared docStore segment _6 >> > IW 0 [Indexer]: DW: closeDocStore: 0 files to flush to segment _6 >> numDocs=0 >> > IW 0 [Indexer]: CMS: now merge >> > IW 0 [Indexer]: CMS: index: _7:C1077025->_0 >> > IW 0 [Indexer]: CMS: no more merges pending; now return >> > IW 0 [Indexer]: now call final commit() >> > IW 0 [Indexer]: startCommit(): start sizeInBytes=0 >> > IW 0 [Indexer]: startCommit index=_7:C1077025->_0 changeCount=23 >> > IW 0 [Indexer]: now sync _7.prx >> > IW 0 [Indexer]: now sync _7.fnm >> > IW 0 [Indexer]: now sync _7.tis >> > IW 0 [Indexer]: now sync _7.nrm >> > IW 0 [Indexer]: now sync _7.tii >> > IW 0 [Indexer]: now sync _7.frq >> > IW 0 [Indexer]: done all syncs >> > IW 0 [Indexer]: commit: pendingCommit != null >> > IW 0 [Indexer]: commit: wrote segments file "segments_3" >> > IFD [Indexer]: now checkpoint "segments_3" [1 segments ; isCommit = >> true] >> > IFD [Indexer]: deleteCommits: now decRef commit "segments_2" >> > IFD [Indexer]: delete "_0.tis" >> > IFD [Indexer]: delete "_5.prx" >> > IFD [Indexer]: delete "_3.tii" >> > IFD [Indexer]: delete "_3.frq" >> > IFD [Indexer]: delete "_1.frq" >> > IFD [Indexer]: delete "_6.frq" >> > IFD [Indexer]: delete "_4.prx" >> > IFD [Indexer]: delete "_4.fnm" >> > IFD [Indexer]: delete "_2.tii" >> > IFD [Indexer]: delete "_3.fnm" >> > IFD [Indexer]: delete "_1.fnm" >> > IFD [Indexer]: delete "_6.tis" >> > IFD [Indexer]: delete "_4.frq" >> > IFD [Indexer]: delete "_5.nrm" >> > IFD [Indexer]: delete "_5.tis" >> > IFD [Indexer]: delete "_1.tii" >> > IFD [Indexer]: delete "_4.tis" >> > IFD [Indexer]: delete "_0.prx" >> > IFD [Indexer]: delete "_3.nrm" >> > IFD [Indexer]: delete "_4.tii" >> > IFD [Indexer]: delete "_0.nrm" >> > IFD [Indexer]: delete "_5.fnm" >> > IFD [Indexer]: delete "_1.tis" >> > IFD [Indexer]: delete "_0.fnm" >> > IFD [Indexer]: delete "_2.prx" >> > IFD [Indexer]: delete "_6.tii" >> > IFD [Indexer]: delete "_4.nrm" >> > IFD [Indexer]: delete "_2.frq" >> > IFD [Indexer]: delete "_5.frq" >> > IFD [Indexer]: delete "_3.prx" >> > IFD [Indexer]: delete "_5.tii" >> > IFD [Indexer]: delete "_2.fnm" >> > IFD [Indexer]: delete "_1.prx" >> > IFD [Indexer]: delete "_2.tis" >> > IFD [Indexer]: delete "_0.tii" >> > IFD [Indexer]: delete "_6.prx" >> > IFD [Indexer]: delete "_0.frq" >> > IFD [Indexer]: delete "segments_2" >> > IFD [Indexer]: delete "_6.fnm" >> > IFD [Indexer]: delete "_6.nrm" >> > IFD [Indexer]: delete "_1.nrm" >> > IFD [Indexer]: delete "_2.nrm" >> > IFD [Indexer]: delete "_3.tis" >> > IW 0 [Indexer]: commit: done >> > IW 0 [Indexer]: at close: _7:C1077025->_0 >> > >> > I see no errors. >> > Peter >> > >> > >> > On Tue, Oct 27, 2009 at 10:44 AM, Peter Keegan <peterlkee...@gmail.com >> >wrote: >> > >> >> >> >> >> >> On Tue, Oct 27, 2009 at 10:37 AM, Michael McCandless < >> >> luc...@mikemccandless.com> wrote: >> >> >> >>> OK that exception looks more reasonable, for a disk full event. >> >>> >> >>> But, I can't tell from your followon emails: did this lead to index >> >>> corruption? >> >>> >> >> >> >> Yes, but this may be caused by the application ignoring a Lucene >> exception >> >> somewhere else. I will chase this down. >> >> >> >>> >> >>> Also, I noticed you're using a rather old 1.6.0 JRE (1.6.0_03) -- you >> >>> really should upgrade that to the latest 1.6.0 -- there's at least one >> >>> known problem with Lucene and early 1.6.0 JREs. >> >>> >> >> >> >> Yes, I remember this problem - that's why we stayed at _03 >> >> Thanks. >> >> >> >>> >> >>> Mike >> >>> >> >>> On Tue, Oct 27, 2009 at 10:00 AM, Peter Keegan < >> peterlkee...@gmail.com> >> >>> wrote: >> >>> > After rebuilding the corrupted indexes, the low disk space exception >> is >> >>> now >> >>> > occurring as expected. Sorry for the distraction. >> >>> > >> >>> > fyi, here are the details: >> >>> > >> >>> > java.io.IOException: There is not enough space on the disk >> >>> > at java.io.RandomAccessFile.writeBytes(Native Method) >> >>> > at java.io.RandomAccessFile.write(Unknown Source) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.close(SimpleFSDirectory.java:199) >> >>> > at >> org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:144) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:357) >> >>> > at >> >>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153) >> >>> > at >> >>> > >> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5011) >> >>> > at >> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4596) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.index.IndexWriter.resolveExternalSegments(IndexWriter.java:3786) >> >>> > at >> >>> > >> >>> >> org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:3695) >> >>> > >> >>> > >> >>> > And the corresponding index info log: >> >>> > >> >>> > IFD [Indexer]: setInfoStream >> >>> > >> >>> >> deletionpolicy=org.apache.lucene.index.keeponlylastcommitdeletionpol...@256ef705 >> >>> > IW 1 [Indexer]: setInfoStream: >> >>> > dir=org.apache.lucene.store.simplefsdirect...@d >> >>> :\mnsavs\lresumes3\lresumes3.luc\lresumes3.update.main.4 >> >>> > autoCommit=false >> >>> > >> >>> >> mergepolicy=org.apache.lucene.index.logbytesizemergepol...@181b7c76mergescheduler >> >>> >> =org.apache.lucene.index.concurrentmergeschedu...@34883357rambuffersizemb >> >>> =16.0 >> >>> > maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 >> >>> > maxFieldLength=2147483647 index= >> >>> > IW 1 [Indexer]: flush at addIndexesNoOptimize >> >>> > IW 1 [Indexer]: flush: segment=null docStoreSegment=null >> >>> docStoreOffset=0 >> >>> > flushDocs=false flushDeletes=true flushDocStores=false numDocs=0 >> >>> > numBufDelTerms=0 >> >>> > IW 1 [Indexer]: index before flush >> >>> > IW 1 [Indexer]: now start transaction >> >>> > IW 1 [Indexer]: LMP: findMerges: 2 segments >> >>> > IW 1 [Indexer]: LMP: level 8.774518 to 9.524518: 1 segments >> >>> > IW 1 [Indexer]: LMP: level 6.2973914 to 7.0473914: 1 segments >> >>> > IW 1 [Indexer]: CMS: now merge >> >>> > IW 1 [Indexer]: CMS: index: _7:Cx1075533->_0** _8:Cx2795** >> >>> > IW 1 [Indexer]: CMS: no more merges pending; now return >> >>> > IW 1 [Indexer]: add merge to pendingMerges: _7:Cx1075533->_0 [total >> 1 >> >>> > pending] >> >>> > IW 1 [Indexer]: now merge >> >>> > merge=_7:Cx1075533->_0 into _0 [mergeDocStores] >> >>> > merge=org.apache.lucene.index.mergepolicy$oneme...@4d480ea >> >>> > index=_7:Cx1075533->_0** _8:Cx2795** >> >>> > IW 1 [Indexer]: merging _7:Cx1075533->_0 into _0 [mergeDocStores] >> >>> > IW 1 [Indexer]: merge: total 1074388 docs >> >>> > IW 1 [Indexer]: commitMerge: _7:Cx1075533->_0 into _0 >> [mergeDocStores] >> >>> > index=_7:Cx1075533->_0** _8:Cx2795** >> >>> > IW 1 [Indexer]: commitMergeDeletes _7:Cx1075533->_0 into _0 >> >>> [mergeDocStores] >> >>> > IFD [Indexer]: now checkpoint "segments_1" [2 segments ; isCommit = >> >>> false] >> >>> > IW 1 [Indexer]: LMP: findMerges: 2 segments >> >>> > IW 1 [Indexer]: LMP: level 8.864886 to 9.614886: 1 segments >> >>> > IW 1 [Indexer]: LMP: level 6.2973914 to 7.0473914: 1 segments >> >>> > IW 1 [Indexer]: add merge to pendingMerges: _8:Cx2795 [total 1 >> pending] >> >>> > IW 1 [Indexer]: now merge >> >>> > merge=_8:Cx2795 into _1 [mergeDocStores] >> >>> > merge=org.apache.lucene.index.mergepolicy$oneme...@606f8b2b >> >>> > index=_0:C1074388 _8:Cx2795** >> >>> > IW 1 [Indexer]: merging _8:Cx2795 into _1 [mergeDocStores] >> >>> > IW 1 [Indexer]: merge: total 2795 docs >> >>> > IW 1 [Indexer]: handleMergeException: merge=_8:Cx2795 into _1 >> >>> > [mergeDocStores] exc=java.io.IOException: There is not enough space >> on >> >>> the >> >>> > disk >> >>> > IW 1 [Indexer]: hit exception during merge >> >>> > IFD [Indexer]: refresh [prefix=_1]: removing newly created >> unreferenced >> >>> file >> >>> > "_1.fdt" >> >>> > IFD [Indexer]: delete "_1.fdt" >> >>> > IFD [Indexer]: refresh [prefix=_1]: removing newly created >> unreferenced >> >>> file >> >>> > "_1.fdx" >> >>> > IFD [Indexer]: delete "_1.fdx" >> >>> > IFD [Indexer]: refresh [prefix=_1]: removing newly created >> unreferenced >> >>> file >> >>> > "_1.fnm" >> >>> > IFD [Indexer]: delete "_1.fnm" >> >>> > IW 1 [Indexer]: now rollback transaction >> >>> > IW 1 [Indexer]: all running merges have aborted >> >>> > IFD [Indexer]: now checkpoint "segments_1" [0 segments ; isCommit = >> >>> false] >> >>> > IFD [Indexer]: delete "_0.nrm" >> >>> > IFD [Indexer]: delete "_0.tis" >> >>> > IFD [Indexer]: delete "_0.fnm" >> >>> > IFD [Indexer]: delete "_0.tii" >> >>> > IFD [Indexer]: delete "_0.frq" >> >>> > IFD [Indexer]: delete "_0.fdx" >> >>> > IFD [Indexer]: delete "_0.prx" >> >>> > IFD [Indexer]: delete "_0.fdt" >> >>> > >> >>> > >> >>> > Peter >> >>> > >> >>> > On Mon, Oct 26, 2009 at 3:59 PM, Peter Keegan < >> peterlkee...@gmail.com >> >>> >wrote: >> >>> > >> >>> >> >> >>> >> >> >>> >> On Mon, Oct 26, 2009 at 3:00 PM, Michael McCandless < >> >>> >> luc...@mikemccandless.com> wrote: >> >>> >> >> >>> >>> On Mon, Oct 26, 2009 at 2:55 PM, Peter Keegan < >> peterlkee...@gmail.com >> >>> > >> >>> >>> wrote: >> >>> >>> > On Mon, Oct 26, 2009 at 2:50 PM, Michael McCandless < >> >>> >>> > luc...@mikemccandless.com> wrote: >> >>> >>> > >> >>> >>> >> On Mon, Oct 26, 2009 at 10:44 AM, Peter Keegan < >> >>> peterlkee...@gmail.com >> >>> >>> > >> >>> >>> >> wrote: >> >>> >>> >> > Even running in console mode, the exception is difficult to >> >>> >>> interpret. >> >>> >>> >> > Here's an exception that I think occurred during an add >> document, >> >>> >>> commit >> >>> >>> >> or >> >>> >>> >> > close: >> >>> >>> >> > doc counts differ for segment _g: field Reader shows 137 but >> >>> >>> segmentInfo >> >>> >>> >> > shows 5777 >> >>> >>> >> >> >>> >>> >> That's spooky. Do you have the full exception for this one? >> What >> >>> IO >> >>> >>> >> system are you running on? (Is it just a local drive on your >> >>> windows >> >>> >>> >> computer?) It's almost as if the IO system is not generating an >> >>> >>> >> IOException to Java when disk fills up. >> >>> >>> >> >> >>> >>> > >> >>> >>> > Index and code are all on a local drive. There is no other >> exception >> >>> >>> coming >> >>> >>> > back - just what I reported. >> >>> >>> >> >>> >>> But, you didn't report a traceback for this first one? >> >>> >>> >> >>> >> >> >>> >> Yes, I need to add some more printStackTrace calls. >> >>> >> >> >>> >> >> >>> >>> >> >>> >>> >> > I ensured that the disk space was low before updating the >> index. >> >>> >>> >> >> >>> >>> >> You mean, to intentionally test the disk-full case? >> >>> >>> >> >> >>> >>> > >> >>> >>> > Yes, that's right. >> >>> >>> >> >>> >>> OK. Can you turn on IndexWriter's infoStream, get this disk full >> / >> >>> >>> corruption to happen again, and post back the resulting output? >> Make >> >>> >>> sure your index first passes CheckIndex before starting (so we >> don't >> >>> >>> begin the test w/ any pre-existing index corruption). >> >>> >>> >> >>> >> >> >>> >> Good point about CheckIndex. I've already found 2 bad ones. I will >> >>> build >> >>> >> new indexes from scratch. This will take a while. >> >>> >> >> >>> >> >> >>> >>> >> > On another occasion, the exception was: >> >>> >>> >> > background merge hit exception: _0:C1080260 _1:C139 _2:C123 >> >>> _3:C107 >> >>> >>> >> _4:C126 >> >>> >>> >> > _5:C121 _6:C126 _7:C711 _8:C116 into _9 [optimize] >> >>> [mergeDocStores] >> >>> >>> >> >> >>> >>> >> In this case, the SegmentMerger was trying to open this >> segment, >> >>> but >> >>> >>> >> on attempting to read the first int from the fdx (fields index) >> >>> file >> >>> >>> >> for one of the segments, it hit EOF. >> >>> >>> >> >> >>> >>> >> This is also spooky -- this looks like index corruption, which >> >>> should >> >>> >>> >> never happen on hitting disk full. >> >>> >>> >> >> >>> >>> > >> >>> >>> > That's what I thought, too. Could Lucene be catching the >> IOException >> >>> and >> >>> >>> > turning it into a different exception? >> >>> >>> >> >>> >>> I think that's unlikely, but I guess possible. We have "disk >> full" >> >>> >>> tests in the unit tests, that throw an IOException at different >> times. >> >>> >>> >> >>> >>> What exact windows version are you using? The local drive is >> NTFS? >> >>> >>> >> >>> >> >> >>> >> Windows Server 2003 Enterprise x64 SP2. Local drive is NTFS >> >>> >> >> >>> >> >> >>> >>> >> >>> >>> Mike >> >>> >>> >> >>> >>> >> --------------------------------------------------------------------- >> >>> >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >>> >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >>> >>> >> >>> >>> >> >>> >> >> >>> > >> >>> >> >>> --------------------------------------------------------------------- >> >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >>> >> >>> >> >> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >