Hmm, which OS/filesystem is the index on? Which Directory impl are you using...?
Indeed CheckIndex sees the 5111 segments, so something seems amiss. The log was edited right? EG I can't see the IW open log entries... and IW 51 is missing (it goes straight from 50 to 52). Can you post the full log? Also, can you attach it (instead of inlining it), maybe gzip'd, so it doesn't get line-wrapped? Finally, can you run things a big longer, so we can see the pathological behavior emerge? The only odd thing I see in the log is that IW seems to be attempting to delete files (eg _1a.cfs) that are still "live", which is odd. That the delete did not succeed could just be that the file is open from a reader (do you have a reader open on the index?)... though, I would have expected to see "unable to remove file..." in the InfoStream output. Mike McCandless http://blog.mikemccandless.com On Thu, Mar 15, 2012 at 10:30 AM, Tim Bogaert <boggu...@gmail.com> wrote: > Hi Mike, > > Segment format: FORMAT_3_1 [Lucene 3.1+] > NumBadSegments: 0 > NumSegments: 5111 > Clean?: true > TotLoseDocCount: 0 > Partial: false > > > I reindexed and let the feed come in again...The logs of the result are > below (I just copied a few) after 10 minutes : > Segment format: FORMAT_3_1 [Lucene 3.1+] > NumBadSegments: 0 > NumSegments: 11 > Clean?: true > TotLoseDocCount: 0 > Partial: false > > What I noticed in the log was that _1a.cfs is getting deleted, but it seems > to be still in the directory : > > ls -al > -rw-r--r-- 1 web web 1444172 Mar 15 15:13 _17.fdt > -rw-r--r-- 1 web web 88804 Mar 15 15:13 _17.fdx > -rw-r--r-- 1 web web 142 Mar 15 15:13 _17.fnm > -rw-r--r-- 1 web web 4352933 Mar 15 15:13 _17.frq > -rw-r--r-- 1 web web 177604 Mar 15 15:13 _17.nrm > -rw-r--r-- 1 web web 5931105 Mar 15 15:13 _17.prx > -rw-r--r-- 1 web web 18159 Mar 15 15:13 _17.tii > -rw-r--r-- 1 web web 1421338 Mar 15 15:13 _17.tis > -rw-r--r-- 1 web web 36 Mar 15 15:14 _17_1.del > -rw-r--r-- 1 web web 379400 Mar 15 15:13 _18.cfs > -rw-r--r-- 1 web web 424376 Mar 15 15:13 _19.cfs > -rw-r--r-- 1 web web 463177 Mar 15 15:13 _1a.cfs > -rw-r--r-- 1 web web 108875 Mar 15 15:13 _1b.cfs > -rw-r--r-- 1 web web 4481 Mar 15 15:14 _1c.cfs > -rw-r--r-- 1 web web 4465 Mar 15 15:15 _1d.cfs > -rw-r--r-- 1 web web 4534 Mar 15 15:17 _1e.cfs > -rw-r--r-- 1 web web 3252 Mar 15 15:18 _1f.cfs > -rw-r--r-- 1 web web 3314 Mar 15 15:21 _1g.cfs > -rw-r--r-- 1 web web 2340 Mar 15 15:25 _1h.cfs > -rw-r--r-- 1 web web 3902 Mar 15 15:27 _1i.cfs > -rw-r--r-- 1 web web 1687 Mar 15 15:28 _1j.cfs > -rw-r--r-- 1 web web 20 Mar 15 15:28 segments.gen > -rw-r--r-- 1 web web 2735 Mar 15 15:28 segments_1h > > > Here is (part of) the log : > > 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_h(3.5):c1 size=0.007 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_b(3.5):c1 size=0.007 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_g(3.5):c1 size=0.006 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_c(3.5):c1 size=0.006 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_11(3.5):c1 size=0.005 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_12(3.5):c1 size=0.005 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > seg=_6(3.5):c1 size=0.005 MB [merging] [floored] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: TMP: > allowedSegmentCount=1 vs count=15 (eligible count=5) tooBigCount=0 > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: merge time 46 > msec for 10 docs > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now flush at close > waitForMerges=true > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: CMS: merge > thread: done > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now trigger flush > reason=explicit flush > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #2]: CMS: set > priority of merge thread Lucene Merge Thread #3 to 6 > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: start flush: > applyAllDeletes=true > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: index before flush > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1 > _18(3.5):c10 _17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1 > _h(3.5):c1 _11(3.5):c1 _12(3.5):c1 > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: DW: flush: no docs; > skipping > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: apply all deletes > during flush > BD 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: applyDeletes: no > deletes; skipping > BD 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: prune > sis=org.apache.lucene.index.SegmentInfos@1cad1e6 minGen=0 packetCount=0 > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: clearFlushPending > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS: now merge > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS: index: > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1 > _18(3.5):c10 _17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1 > _h(3.5):c1 _11(3.5):c1 _12(3.5):c1 > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: CMS: no more > merges pending; now return > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: waitForMerges > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merge store > matchedCount=10 vs 10 > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merged > segment size=0.076 MB vs estimate=0.103 MB > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: commitMerge: > _2(3.5):C23 _3(3.5):c7 _d(3.5):c1 _h(3.5):c1 _b(3.5):c1 _g(3.5):c1 > _c(3.5):c1 _11(3.5):c1 _12(3.5):c1 _6(3.5):c1 into _19 index=_0(3.5):C73 > _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _16(3.5):c10 _6(3.5):c1 _18(3.5):c10 > _17(3.5):c10 _b(3.5):c1 _c(3.5):c1 _d(3.5):c1 _g(3.5):c1 _h(3.5):c1 > _11(3.5):c1 _12(3.5):c1 > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: > commitMergeDeletes _2(3.5):C23 _3(3.5):c7 _d(3.5):c1 _h(3.5):c1 _b(3.5):c1 > _g(3.5):c1 _c(3.5):c1 _11(3.5):c1 _12(3.5):c1 _6(3.5):c1 into _19 > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: after commit: > _0(3.5):C73 _1(3.5):C70 _19(3.5):C38 _16(3.5):c10 _18(3.5):c10 _17(3.5):c10 > IFD [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: now checkpoint > "segments_15" [6 segments ; isCommit = false] > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: merge time 61 > msec for 38 docs > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: waitForMerges done > IW 50 [Thu Mar 15 15:25:23 MET 2012; Lucene Merge Thread #3]: CMS: merge > thread: done > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now call final > commit() > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: start > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: enter lock > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: already > prepared > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: > pendingCommit != null > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: wrote > segments file "segments_16" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: now checkpoint > "segments_16" [42 segments ; isCommit = true] > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: deleteCommits: now > decRef commit "segments_15" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "segments_15" > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: commit: done > IW 50 [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: at close: > _0(3.5):C73 _1(3.5):C70 _19(3.5):C38 _16(3.5):c10 _18(3.5):c10 _17(3.5):c10 > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.tii" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.tis" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_18.cfs" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.nrm" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fnm" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fdt" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_17.cfs" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.prx" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_16.cfs" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.fdx" > IFD [Thu Mar 15 15:25:23 MET 2012; pool-2-thread-1]: delete "_19.frq" > Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: applyDeletes: no deletes; > skipping > BD 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: prune > sis=org.apache.lucene.index.SegmentInfos@59d9fe minGen=0 packetCount=0 > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: clearFlushPending > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS: now merge > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS: index: > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _1a(3.5):C16 _17(3.5):c10 _19(3.5):c10 > _18(3.5):c10 > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: CMS: no more > merges pending; now return > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: waitForMerges > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: waitForMerges done > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: now call final > commit() > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: start > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: enter lock > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: already > prepared > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: > pendingCommit != null > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: wrote > segments file "segments_17" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: now checkpoint > "segments_17" [43 segments ; isCommit = true] > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: deleteCommits: now > decRef commit "segments_16" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "segments_16" > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: commit: done > IW 52 [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: at close: > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _1a(3.5):C16 _17(3.5):c10 _19(3.5):c10 > _18(3.5):c10 > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fdx" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.tis" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.frq" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.tii" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fnm" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.fdt" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_18.cfs" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.nrm" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_19.cfs" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_17.cfs" > IFD [Thu Mar 15 15:25:27 MET 2012; pool-2-thread-1]: delete "_1a.prx" > > IW 53 [Thu Mar 15 15:25:38 MET 2012; Lucene Merge Thread #3]: CMS: merge > thread: done > IW 53 [Thu Mar 15 15:25:38 MET 2012; Lucene Merge Thread #0]: CMS: set > priority of merge thread Lucene Merge Thread #3 to 6 > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now flush at close > waitForMerges=true > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now trigger flush > reason=explicit flush > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: start flush: > applyAllDeletes=true > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: index before flush > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10 > _1a(3.5):c10 _19(3.5):c10 > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: DW: flush: no docs; > skipping > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: apply all deletes > during flush > BD 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: applyDeletes: no > deletes; skipping > BD 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: prune > sis=org.apache.lucene.index.SegmentInfos@e91b11 minGen=0 packetCount=0 > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: clearFlushPending > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS: now merge > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS: index: > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10 > _1a(3.5):c10 _19(3.5):c10 > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: CMS: no more > merges pending; now return > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: waitForMerges > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: waitForMerges done > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now call final > commit() > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: start > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: enter lock > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: already > prepared > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: > pendingCommit != null > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: wrote > segments file "segments_18" > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: now checkpoint > "segments_18" [44 segments ; isCommit = true] > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: deleteCommits: now > decRef commit "segments_17" > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "segments_17" > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: commit: done > IW 53 [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: at close: > _0(3.5):C73 _1(3.5):C70 _2(3.5):C23 _3(3.5):c7 _18(3.5):c10 _1b(3.5):c10 > _1a(3.5):c10 _19(3.5):c10 > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_1b.cfs" > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_18.cfs" > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_19.cfs" > IFD [Thu Mar 15 15:25:38 MET 2012; pool-2-thread-1]: delete "_1a.cfs" > > Kind regards, > Bogguard > > On Thu, Mar 15, 2012 at 1:21 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: > >> Hmm, that's odd... >> >> Can you set IndexWriter's infoStream and then capture the output while >> doing the small writes every few seconds and post back? >> >> If you run CheckIndex on the index does it also report ~3000 segments? >> >> Mike McCandless >> >> http://blog.mikemccandless.com >> >> On Thu, Mar 15, 2012 at 7:14 AM, Tim Bogaert <boggu...@gmail.com> wrote: >> > Hi, >> > >> > We migrated our project to Lucene 3.5 and used the TieredMergePolicy with >> > all its defaults. >> > Our project contains a relatively small index (+-90MB) and gets small >> > writes every few seconds. >> > The behaviour we see is that when we reindex everything works great and >> > performant, >> > but when we keep the index running for some days we see a lot (+-3000) of >> > very small .cfs files (between 1KB and 10KB) are being generated and the >> > performance drops significantly. >> > Reïndexing solves the problem, but this shouldn't be the solution. >> > >> > Is there a way to change the behaviour of the .cfs files. I noticed you >> can >> > fully disable it, but I read that this isn't adviced. >> > >> > Kind regards, >> > Bogguard >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org