Hmm, curious. I looked at the [large] infoStream output and I see segment _3ou7 present on init of IW, a few getReader calls referencing it, then a forceMerge that indeed merges it away, yet I do NOT see IW attempting deletion of its files.
And indeed I see plenty (too many: many times per second?) of commits after that, so the index itself is no longer referencing _3ou7.

If you are failing to close all NRT readers then I would expect _3ou7 to be in the lsof output, but it's not. The NRT reader's close method has logic that notifies IndexWriter when it's done "needing" the files, to emulate "delete on last close" semantics for filesystems like HDFS that don't do that ... it's possible something is wrong here.

Can you set the (public, static) boolean IndexFileDeleter.VERBOSE_REF_COUNTS to true, and then re-generate this log? This causes IW to log the ref count of each file it's tracking ... I'll also add a bit more verbosity to IW when NRT readers are opened and closed, for 5.4.0.

Mike McCandless

http://blog.mikemccandless.com

On Wed, Nov 11, 2015 at 6:09 AM, Rob Audenaerde <rob.audenae...@gmail.com> wrote:

> Hi all,
>
> I'm still debugging the growing-index size. I think closing index readers
> might help (work in progress), but I can't really see them holding on to
> files (at least, using lsof). Restarting the application sheds some light:
> I see logging on files that are no longer referenced.
>
> What I see is that there are files in the index directory that seem to
> be no longer referenced.
>
> I put the output of the infoStream online, because it is rather big (30MB
> gzipped): http://www.audenaerde.org/lucene/merges.log.gz
>
> Output of lsof: (executed 'sudo lsof *' in the index directory). This is
> on a CentOS box (maybe that influences stuff as well?)
>
> COMMAND   PID USER   FD   TYPE DEVICE   SIZE/OFF     NODE NAME
> java    30581 apache mem  REG  253,0  3176094924 18880508 _4gs5_Lucene50_0.dvd
> java    30581 apache mem  REG  253,0   505758610 18880546 _4gs5.fdt
> java    30581 apache mem  REG  253,0   369563337 18880631 _4gs5_Lucene50_0.tim
> java    30581 apache mem  REG  253,0   176344058 18880623 _4gs5_Lucene50_0.pos
> java    30581 apache mem  REG  253,0   378055201 18880606 _4gs5_Lucene50_0.doc
> java    30581 apache mem  REG  253,0   372579599 18880400 _4i5a_Lucene50_0.dvd
> java    30581 apache mem  REG  253,0    82017447 18880748 _4g37.cfs
> java    30581 apache mem  REG  253,0    85376507 18880721 _4fb3.cfs
> java    30581 apache mem  REG  253,0   363493917 18880533 _4ct1_Lucene50_0.dvd
> java    30581 apache mem  REG  253,0     9421892 18880806 _4gjc.cfs
> java    30581 apache mem  REG  253,0    76877461 18880553 _4ct1.fdt
> java    30581 apache mem  REG  253,0    46271330 18880661 _4ct1_Lucene50_0.tim
> java    30581 apache mem  REG  253,0    26911387 18880653 _4ct1_Lucene50_0.pos
> java    30581 apache mem  REG  253,0    54678249 18880568 _4ct1_Lucene50_0.doc
> java    30581 apache mem  REG  253,0    76556587 18880328 _4i5a.fdt
> java    30581 apache mem  REG  253,0    45032159 18880389 _4i5a_Lucene50_0.tim
> java    30581 apache mem  REG  253,0    26486772 18880388 _4i5a_Lucene50_0.pos
> java    30581 apache mem  REG  253,0    55411002 18880362 _4i5a_Lucene50_0.doc
> java    30581 apache mem  REG  253,0    70484185 18880340 _4hkn.cfs
> java    30581 apache mem  REG  253,0    10873921 18880324 _4gpz.cfs
> java    30581 apache mem  REG  253,0    17230506 18880524 _4i11.cfs
> java    30581 apache mem  REG  253,0     6706969 18880575 _4i0t.cfs
> java    30581 apache mem  REG  253,0    15135578 18880624 _4i0i.cfs
> java    30581 apache mem  REG  253,0    15368310 18880717 _4hzp.cfs
> java    30581 apache mem  REG  253,0     5146140 18880583 _4hze.cfs
> java    30581 apache mem  REG  253,0     2917380 18880411 _4gs5.nvd
> java    30581 apache mem  REG  253,0     6871469 18880732 _4hod.cfs
> java    30581 apache mem  REG  253,0     2860341 18880495 _4i84.cfs
> java    30581 apache mem  REG  253,0      835726 18880660 _4i7z.cfs
> java    30581 apache mem  REG  253,0     1005595 18880648 _4i7w.cfs
> java    30581 apache mem  REG  253,0     5639672 18880401 _4i4o.cfs
> java    30581 apache mem  REG  253,0     4388371 18880440 _4i4a.cfs
> java    30581 apache mem  REG  253,0     1151845 18880512 _4i7v.cfs
> java    30581 apache mem  REG  253,0      941773 18880613 _4i7x.cfs
> java    30581 apache mem  REG  253,0      984023 18880588 _4i7o.cfs
> java    30581 apache mem  REG  253,0     1790005 18880619 _4i7y.cfs
> java    30581 apache mem  REG  253,0      466371 18880515 _4ct1.nvd
> java    30581 apache mem  REG  253,0      723280 18880573 _4i7q.cfs
> java    30581 apache mem  REG  253,0      806289 18880517 _4i7h.cfs
> java    30581 apache mem  REG  253,0       17362 18880520 _4i9s.cfs
> java    30581 apache mem  REG  253,0      698362 18880531 _4i9r.cfs
> java    30581 apache mem  REG  253,0      483215 18880406 _4i5a.nvd
> java    30581 apache mem  REG  253,0       14110 18880416 _4i9v.cfs
> java    30581 apache mem  REG  253,0        6121 18880412 _4i9t.cfs
> java    30581 apache 30wW REG  253,0           0 18877901 write.lock
>
> Output of some of the biggest files in the index directory:
>
> -rw-r--r--. 1 apache apache  358684577 Nov 11 08:04 _4fjn.cfs
> -rw-r--r--. 1 apache apache  363493917 Nov 11 07:54 _4ct1_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  369563337 Nov 11 08:06 _4gs5_Lucene50_0.tim
> -rw-r--r--. 1 apache apache  372579599 Nov 11 08:09 _4i5a_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  378055201 Nov 11 08:06 _4gs5_Lucene50_0.doc
> -rw-r--r--. 1 apache apache  427401813 Nov 10 08:14 _3ou7.cfs
> -rw-r--r--. 1 apache apache  505758610 Nov 11 08:04 _4gs5.fdt
> -rw-r--r--. 1 apache apache 1107391579 Nov 10 07:55 _3k3a_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 3176094924 Nov 11 08:10 _4gs5_Lucene50_0.dvd
>
> Note that the _3ou7 and _3k3a segments no longer appear to be in use?
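
The by-eye comparison between the directory listing and the lsof output can be automated. Below is a minimal sketch, not part of Lucene: the class OrphanSegments and its methods are hypothetical names, and it assumes per-segment files start with "_" followed by the segment name, which ends at the next "_" or "." (as in _4gs5_Lucene50_0.dvd and _3ou7.cfs). It only reports segments that have files on disk but no open handle or mapping; that alone does not prove the files are safe to delete, since a commit point may still reference them.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class OrphanSegments {

    /** Returns the segment prefix (e.g. "_4gs5"), or null for non-segment files. */
    static String segmentName(String fileName) {
        if (!fileName.startsWith("_")) {
            return null; // e.g. write.lock
        }
        int end = fileName.length();
        for (int i = 1; i < fileName.length(); i++) {
            char c = fileName.charAt(i);
            if (c == '_' || c == '.') {
                end = i; // segment name stops at the first '_' or '.' after the leading '_'
                break;
            }
        }
        return fileName.substring(0, end);
    }

    /** Collects the distinct segment names found in a list of file names. */
    static Set<String> segmentsOf(Collection<String> fileNames) {
        Set<String> segments = new TreeSet<>();
        for (String f : fileNames) {
            String s = segmentName(f);
            if (s != null) {
                segments.add(s);
            }
        }
        return segments;
    }

    /** Segments that have files on disk but none among the open files. */
    static Set<String> notOpen(Collection<String> onDisk, Collection<String> openFiles) {
        Set<String> result = segmentsOf(onDisk);
        result.removeAll(segmentsOf(openFiles));
        return result;
    }

    public static void main(String[] args) {
        // A few names taken from the listings in this thread.
        List<String> onDisk = List.of(
            "_3ou7.cfs", "_3k3a_Lucene50_0.dvd", "_4gs5.fdt", "_4ct1_Lucene50_0.dvd");
        List<String> open = List.of(
            "_4gs5.fdt", "_4ct1_Lucene50_0.dvd", "write.lock");
        System.out.println(notOpen(onDisk, open)); // prints [_3k3a, _3ou7]
    }
}
```

In practice the two input lists would come from the NAME column of lsof and from a plain listing of the index directory, taken as close together in time as possible, because merges keep creating and removing segments.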