Thank you all, I will further fix and investigate! On Nov 14, 2015 10:00, "Uwe Schindler" <u...@thetaphi.de> wrote:
> I agree. On Linux it is impossible that MMapDirectory is the reason! Only > on windows you cannot delete still open/mapped files. > > ----- > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > -----Original Message----- > > From: Michael McCandless [mailto:luc...@mikemccandless.com] > > Sent: Friday, November 13, 2015 8:30 PM > > To: Lucene Users <java-user@lucene.apache.org> > > Subject: Re: debugging growing index size > > > > So with MMapDir at defaults (unmap is enabled) you see old files, with > > no open file handles as reported by lsof, still existing in your index > > directory, taking lots of space. > > > > But with NIOFSDirectory the issue doesn't happen? Are you sure? > > > > I'll look at the 6.6 GB infoStream to see what it says about the ref > counts. > > > > Did you fix the issue in your app where you're not closing all opened > > NRT readers? > > > > Mike McCandless > > > > http://blog.mikemccandless.com > > > > > > On Fri, Nov 13, 2015 at 12:22 PM, Rob Audenaerde > > <rob.audenae...@gmail.com> wrote: > > > I haven't disabled unmapping, and I am running out-of-the-box > > > FSDirectory.open(). As I can see it tries to pick MMap. For the test I > > > explicitly constructed a NIOFSDIrectoryReader > > > > > > OS is (from the top of my head) CentOS 6.x, Java 1.8.0u33. I can check > > > later for more details. > > > On Nov 13, 2015 18:07, "Uwe Schindler" <u...@thetaphi.de> wrote: > > > > > >> Hi, > > >> > > >> Lucene has the workaround, so it should not happen, UNLESS you > > explicitly > > >> disable the hack using MMapDirectory#setEnableUnmap(false). > > >> > > >> Uwe > > >> > > >> ----- > > >> Uwe Schindler > > >> H.-H.-Meier-Allee 63, D-28213 Bremen > > >> http://www.thetaphi.de > > >> eMail: u...@thetaphi.de > > >> > > >> > -----Original Message----- > > >> > From: will martin [mailto:wmartin...@gmail.com] > > >> > Sent: Friday, November 13, 2015 6:04 PM > > >> > To: java-user@lucene.apache.org > > >> > Subject: Re: debugging growing index size > > >> > > > >> > Hi Rob: > > >> > > > >> > > > >> > Doesn’t this look like known SE issue JDK-4724038 and discussed by > > Peter > > >> > Levart and Uwe Schindler on a lucene-dev thread 9/9/2015? > > >> > > > >> > MappedByteBuffer …. what OS are you on Rob? What JVM? > > >> > > > >> > http://bugs.java.com/view_bug.do?bug_id=4724038 > > >> > > > >> > http://mail-archives.apache.org/mod_mbox/lucene- > > >> > dev/201509.mbox/%3c55f0461a.2070...@gmail.com%3E > > >> > > > >> > hth > > >> > -will > > >> > > > >> > > > >> > > > >> > > On Nov 13, 2015, at 11:23 AM, Rob Audenaerde > > >> > <rob.audenae...@gmail.com> wrote: > > >> > > > > >> > > I'm currently running using NIOFS. It seems to prevent the issue > from > > >> > > appearing. > > >> > > > > >> > > This is a second run (with applied deletes etc) > > >> > > > > >> > > raudenaerd@:/<6>index/index$sudo ls -lSra *.dvd > > >> > > -rw-r--r--. 1 apache apache 7993 Nov 13 16:09 > _y_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 39048886 Nov 13 17:12 > > _xod_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 53699972 Nov 13 17:17 > > _110e_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 > > _12r5_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 > > _y0s_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 > > _z20_Lucene50_0.dvd > > >> > > > > >> > > raudenaerde:/<6>index/index$sudo ls -lSaa *.dvd > > >> > > -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 > > _z20_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 > > _y0s_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 > > _12r5_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 53699972 Nov 13 17:17 > > _110e_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 39048886 Nov 13 17:12 > > _xod_Lucene50_0.dvd > > >> > > -rw-r--r--. 1 apache apache 7993 Nov 13 16:09 > _y_Lucene50_0.dvd > > >> > > > > >> > > > > >> > > > > >> > > On Thu, Nov 12, 2015 at 3:40 PM, Michael McCandless < > > >> > > luc...@mikemccandless.com> wrote: > > >> > > > > >> > >> Hi Rob, > > >> > >> > > >> > >> A couple more things: > > >> > >> > > >> > >> Can you print the value of MMapDirectory.UNMAP_SUPPORTED? > > >> > >> > > >> > >> Also, can you try your test using NIOFSDirectory instead? > Curious if > > >> > >> that changes things... > > >> > >> > > >> > >> Mike McCandless > > >> > >> > > >> > >> http://blog.mikemccandless.com > > >> > >> > > >> > >> > > >> > >> On Thu, Nov 12, 2015 at 7:28 AM, Rob Audenaerde > > >> > >> <rob.audenae...@gmail.com> wrote: > > >> > >>> Curious indeed! > > >> > >>> > > >> > >>> I will turn on the IndexFileDeleter.VERBOSE_REF_COUNTS and > > recreate > > >> > the > > >> > >>> logs. Will get back with them in a day hopefully. > > >> > >>> > > >> > >>> Thanks for the extra logging! > > >> > >>> > > >> > >>> -Rob > > >> > >>> > > >> > >>> On Thu, Nov 12, 2015 at 11:34 AM, Michael McCandless < > > >> > >>> luc...@mikemccandless.com> wrote: > > >> > >>> > > >> > >>>> Hmm, curious. > > >> > >>>> > > >> > >>>> I looked at the [large] infoStream output and I see segment > _3ou7 > > >> > >>>> present on init of IW, a few getReader calls referencing it, > then a > > >> > >>>> forceMerge that indeed merges it away, yet I do NOT see IW > > >> > attempting > > >> > >>>> deletion of its files. > > >> > >>>> > > >> > >>>> And indeed I see plenty (too many: many times per second?) of > > >> > commits > > >> > >>>> after that, so the index itself is no longer referencing _3ou7. > > >> > >>>> > > >> > >>>> If you are failing to close all NRT readers then I would expect > > >> _3ou7 > > >> > >>>> to be in the lsof output, but it's not. > > >> > >>>> > > >> > >>>> The NRT readers close method has logic that notifies > IndexWriter > > >> when > > >> > >>>> it's done "needing" the files, to emulate "delete on last > close" > > >> > >>>> semantics for filesystems like HDFS that don't do that ... it's > > >> > >>>> possible something is wrong here. > > >> > >>>> > > >> > >>>> Can you set the (public, static) boolean > > >> > >>>> IndexFileDeleter.VERBOSE_REF_COUNTS to true, and then re- > > >> > generate this > > >> > >>>> log? This causes IW to log the ref count of each file it's > tracking > > >> > >>>> ... > > >> > >>>> > > >> > >>>> I'll also add a bit more verbosity to IW when NRT readers are > > opened > > >> > >>>> and close, for 5.4.0. > > >> > >>>> > > >> > >>>> Mike McCandless > > >> > >>>> > > >> > >>>> http://blog.mikemccandless.com > > >> > >>>> > > >> > >>>> > > >> > >>>> On Wed, Nov 11, 2015 at 6:09 AM, Rob Audenaerde > > >> > >>>> <rob.audenae...@gmail.com> wrote: > > >> > >>>>> Hi all, > > >> > >>>>> > > >> > >>>>> I'm still debugging the growing-index size. I think closing > index > > >> > >> readers > > >> > >>>>> might help (work in progress), but I can't really see them > holding > > >> on > > >> > >> to > > >> > >>>>> files (at least, using lsof ). Restarting the application > sheds > > >> some > > >> > >>>> light, > > >> > >>>>> I see logging on files that are no longer referenced. > > >> > >>>>> > > >> > >>>>> What I see is that there are files in the index-directory, > that > > >> seem > > >> > >> to > > >> > >>>>> longer referenced.. > > >> > >>>>> > > >> > >>>>> I put the output of the infoStream online, because is it > rather big > > >> > >> (30MB > > >> > >>>>> gzipped): http://www.audenaerde.org/lucene/merges.log.gz > > >> > >>>>> > > >> > >>>>> Output of lsof: (executed 'sudo lsof *' in the index > directory ). > > >> > >> This > > >> > >>>> is > > >> > >>>>> on an CentOS box (maybe that influences stuff as well?) > > >> > >>>>> > > >> > >>>>> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE > > NAME > > >> > >>>>> java 30581 apache mem REG 253,0 3176094924 18880508 > > >> > >>>>> _4gs5_Lucene50_0.dvd > > >> > >>>>> java 30581 apache mem REG 253,0 505758610 18880546 > > >> _4gs5.fdt > > >> > >>>>> java 30581 apache mem REG 253,0 369563337 18880631 > > >> > >>>>> _4gs5_Lucene50_0.tim > > >> > >>>>> java 30581 apache mem REG 253,0 176344058 18880623 > > >> > >>>>> _4gs5_Lucene50_0.pos > > >> > >>>>> java 30581 apache mem REG 253,0 378055201 18880606 > > >> > >>>>> _4gs5_Lucene50_0.doc > > >> > >>>>> java 30581 apache mem REG 253,0 372579599 18880400 > > >> > >>>>> _4i5a_Lucene50_0.dvd > > >> > >>>>> java 30581 apache mem REG 253,0 82017447 18880748 > > >> _4g37.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 85376507 18880721 > > >> _4fb3.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 363493917 18880533 > > >> > >>>>> _4ct1_Lucene50_0.dvd > > >> > >>>>> java 30581 apache mem REG 253,0 9421892 18880806 > > >> _4gjc.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 76877461 18880553 > > >> _4ct1.fdt > > >> > >>>>> java 30581 apache mem REG 253,0 46271330 18880661 > > >> > >>>>> _4ct1_Lucene50_0.tim > > >> > >>>>> java 30581 apache mem REG 253,0 26911387 18880653 > > >> > >>>>> _4ct1_Lucene50_0.pos > > >> > >>>>> java 30581 apache mem REG 253,0 54678249 18880568 > > >> > >>>>> _4ct1_Lucene50_0.doc > > >> > >>>>> java 30581 apache mem REG 253,0 76556587 18880328 > > >> _4i5a.fdt > > >> > >>>>> java 30581 apache mem REG 253,0 45032159 18880389 > > >> > >>>>> _4i5a_Lucene50_0.tim > > >> > >>>>> java 30581 apache mem REG 253,0 26486772 18880388 > > >> > >>>>> _4i5a_Lucene50_0.pos > > >> > >>>>> java 30581 apache mem REG 253,0 55411002 18880362 > > >> > >>>>> _4i5a_Lucene50_0.doc > > >> > >>>>> java 30581 apache mem REG 253,0 70484185 18880340 > > >> _4hkn.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 10873921 18880324 > > >> _4gpz.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 17230506 18880524 > > >> _4i11.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 6706969 18880575 > > >> _4i0t.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 15135578 18880624 > > >> _4i0i.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 15368310 18880717 > > >> _4hzp.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 5146140 18880583 > > >> _4hze.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 2917380 18880411 > > >> _4gs5.nvd > > >> > >>>>> java 30581 apache mem REG 253,0 6871469 18880732 > > >> _4hod.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 2860341 18880495 > > >> _4i84.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 835726 18880660 > > >> _4i7z.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 1005595 18880648 > > >> _4i7w.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 5639672 18880401 > > >> _4i4o.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 4388371 18880440 > > >> _4i4a.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 1151845 18880512 > > >> _4i7v.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 941773 18880613 > > >> _4i7x.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 984023 18880588 > > >> _4i7o.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 1790005 18880619 > > >> _4i7y.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 466371 18880515 > > >> _4ct1.nvd > > >> > >>>>> java 30581 apache mem REG 253,0 723280 18880573 > > >> _4i7q.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 806289 18880517 > > >> _4i7h.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 17362 18880520 > > >> _4i9s.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 698362 18880531 > > >> _4i9r.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 483215 18880406 > > >> _4i5a.nvd > > >> > >>>>> java 30581 apache mem REG 253,0 14110 18880416 > > >> _4i9v.cfs > > >> > >>>>> java 30581 apache mem REG 253,0 6121 18880412 > > >> _4i9t.cfs > > >> > >>>>> java 30581 apache 30wW REG 253,0 0 18877901 > > >> write.lock > > >> > >>>>> > > >> > >>>>> Output of some of the biggest files in the index directory: > > >> > >>>>> > > >> > >>>>> -rw-r--r--. 1 apache apache 358684577 Nov 11 08:04 _4fjn.cfs > > >> > >>>>> -rw-r--r--. 1 apache apache 363493917 Nov 11 07:54 > > >> > >> _4ct1_Lucene50_0.dvd > > >> > >>>>> -rw-r--r--. 1 apache apache 369563337 Nov 11 08:06 > > >> > >> _4gs5_Lucene50_0.tim > > >> > >>>>> -rw-r--r--. 1 apache apache 372579599 Nov 11 08:09 > > >> > >> _4i5a_Lucene50_0.dvd > > >> > >>>>> -rw-r--r--. 1 apache apache 378055201 Nov 11 08:06 > > >> > >> _4gs5_Lucene50_0.doc > > >> > >>>>> -rw-r--r--. 1 apache apache 427401813 Nov 10 08:14 _3ou7.cfs > > >> > >>>>> -rw-r--r--. 1 apache apache 505758610 Nov 11 08:04 _4gs5.fdt > > >> > >>>>> -rw-r--r--. 1 apache apache 1107391579 Nov 10 07:55 > > >> > >> _3k3a_Lucene50_0.dvd > > >> > >>>>> -rw-r--r--. 1 apache apache 3176094924 Nov 11 08:10 > > >> > >> _4gs5_Lucene50_0.dvd > > >> > >>>>> > > >> > >>>>> Note that the 3ou7 and 3k3a segments no longer appear to be in > > use? > > >> > >>>> > > >> > >>>> > > >> --------------------------------------------------------------------- > > >> > >>>> To unsubscribe, e-mail: java-user- > > unsubscr...@lucene.apache.org > > >> > >>>> For additional commands, e-mail: java-user- > > h...@lucene.apache.org > > >> > >>>> > > >> > >>>> > > >> > >> > > >> > >> > --------------------------------------------------------------------- > > >> > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > >> > >> For additional commands, e-mail: > java-user-h...@lucene.apache.org > > >> > >> > > >> > >> > > >> > > > >> > > > >> > > --------------------------------------------------------------------- > > >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > >> > For additional commands, e-mail: java-user-h...@lucene.apache.org > > >> > > >> > > >> --------------------------------------------------------------------- > > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > >> For additional commands, e-mail: java-user-h...@lucene.apache.org > > >> > > >> > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >