[ 
https://issues.apache.org/jira/browse/LUCENE-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796624#action_12796624
 ] 

Michael McCandless commented on LUCENE-2192:
--------------------------------------------

Given that you're creating a new IW to index the 2nd batch, and that this IW 
indeed discards the segment from the last one (via deleteAll), I think it's 
very unlikely that IW is incorrectly retaining your past docs.

deleteAll is indeed supposed to discard everything and give you a fresh, empty 
index.  It does exactly the same thing as opening a new IW with create=true, 
except that deleteAll can be done without opening a new IW.

I think it's more likely that you're somehow accidentally creating docs with 2X 
the content.  Are you sharing Document or Field instances when you create your 
docs?  Can you post the actual code?

Or, alternatively, could you make this problem happen with a simplified 
standalone test case?  I.e., a code fragment that creates docs with random 
content instead of pulling from your DB?  That would help us isolate where the 
extra 2X content is coming from...
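
As a rough idea of the shape such a fragment could take, here is a plain-Java 
stand-in that uses no Lucene classes at all.  A List<String> plays the role 
of a Document and each String the role of a Field value; the names 
(SharedDocRepro, rebuildWithFreshDocs, rebuildWithCachedDocs) and the sizes 
are hypothetical.  It only imitates the suspected mistake, on the assumption 
that docs are cached between rebuilds: Document.add appends a field rather 
than replacing one, so a reused Document grows on every pass.  A real test 
case would build Lucene Documents and add them to the IW instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Plain-Java stand-in for a repro: a List<String> stands in for a Document. */
final class SharedDocRepro {
    static final int ROWS = 100;          // hypothetical number of rows per rebuild
    static final int WORDS_PER_ROW = 20;  // hypothetical amount of content per row

    /** Random content in place of a database row. */
    static String randomText(Random random, int words) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toString(random.nextInt(Integer.MAX_VALUE), 36));
        }
        return sb.toString();
    }

    /** Total number of field values held across all docs. */
    static int totalFields(List<List<String>> docs) {
        int total = 0;
        for (List<String> doc : docs) {
            total += doc.size();
        }
        return total;
    }

    /** Intended behaviour: a brand-new doc per row on every rebuild. */
    static int rebuildWithFreshDocs(Random random) {
        List<List<String>> docs = new ArrayList<>();
        for (int row = 0; row < ROWS; row++) {
            List<String> doc = new ArrayList<>();
            doc.add(randomText(random, WORDS_PER_ROW));
            docs.add(doc);
        }
        return totalFields(docs);
    }

    /** Suspected mistake: cached docs are reused and a field is appended each time. */
    static int rebuildWithCachedDocs(List<List<String>> cache, Random random) {
        for (int row = 0; row < ROWS; row++) {
            if (cache.size() <= row) {
                cache.add(new ArrayList<>());
            }
            // Appended, never cleared: the doc count stays the same, the content grows.
            cache.get(row).add(randomText(random, WORDS_PER_ROW));
        }
        return totalFields(cache);
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        List<List<String>> cache = new ArrayList<>();
        for (int run = 1; run <= 3; run++) {
            System.out.println("run " + run
                    + ": fresh=" + rebuildWithFreshDocs(random)
                    + " cached=" + rebuildWithCachedDocs(cache, random)
                    + " docs=" + cache.size());
        }
    }
}
```

With fresh docs the field total stays at 100 on every run; with cached docs 
it goes 100, 200, 300 while the doc count stays at 100, the same pattern as a 
constant numDocs with a growing segment size.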

> Memory Leak 
> ------------
>
>                 Key: LUCENE-2192
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2192
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9
>            Reporter: Ramazan VARLIKLI
>
> Hi All,
> I have been working on a problem with Lucene and have now given up after 
> trying many different possibilities, which gives me the feeling that there 
> is a bug here.
> The scenario is that we have a CMS application to which we add new content 
> every week. Instead of updating the index, which is a bit tricky, I prefer to 
> delete all index documents and add them again, which is straightforward. The 
> problem is that Lucene somehow doesn't delete the old data, and the index 
> size increases with every update. I also profiled it with Java tools and saw 
> that even if I close the IndexWriter and leave it to the garbage collector, 
> it holds all the docs in memory.
> Here is the code I use:
> Directory directory = new SimpleFSDirectory(new File(path));
> writer = new IndexWriter(directory, analyzer, false,
>     IndexWriter.MaxFieldLength.LIMITED);
> writer.deleteAll();
> // after adding docs, close the IndexWriter
> writer.close();
> The above code is invoked every time we need to update the index. I tried 
> many different approaches to overcome the problem, including physically 
> removing the index directory (see how desperate I am), optimizing, 
> flushing, committing the IndexWriter, the create=true parameter, and so on.
> Here are the index file sizes after each rebuild. If I shut down the 
> application and restart it, the index size starts at 2,458 KB again, which 
> is the correct size.
> Any help will be appreciated.
> _17.cfs   2,458 KB
> _18.cfs   3,990 KB
> _19.cfs   5,149 KB
> Here are the Lucene logs from creating the index files 3 times in a row:
> IFD [http-8080-1]: setInfoStream 
> deletionpolicy=org.apache.lucene.index.keeponlylastcommitdeletionpol...@6649
> IW 0 [http-8080-1]: setInfoStream: 
> dir=org.apache.lucene.store.simplefsdirect...@c:\Documents and 
> Settings\rvarlikli\workspace\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\wtpwebapps\Clipbank3.5\lucene
>  autoCommit=false 
> mergepolicy=org.apache.lucene.index.logbytesizemergepol...@3b626c 
> mergescheduler=org.apache.lucene.index.concurrentmergeschedu...@baa6ba 
> ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 
> maxFieldLength=10000 index=
> IW 0 [http-8080-1]: now flush at close
> IW 0 [http-8080-1]:   flush: segment=_17 docStoreSegment=_17 docStoreOffset=0 
> flushDocs=true flushDeletes=true flushDocStores=true numDocs=2765 
> numBufDelTerms=0
> IW 0 [http-8080-1]:   index before flush 
> IW 0 [http-8080-1]: DW: flush postings as segment _17 numDocs=2765
> IW 0 [http-8080-1]: DW: closeDocStore: 2 files to flush to segment _17 
> numDocs=2765
> IW 0 [http-8080-1]: DW:   oldRAMSize=7485440 newFlushedSize=2472818 
> docs/MB=1,172.473 new/old=33.035%
> IFD [http-8080-1]: now checkpoint "segments_1j" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: now checkpoint "segments_1j" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: delete "_17.fdx"
> IFD [http-8080-1]: delete "_17.tis"
> IFD [http-8080-1]: delete "_17.frq"
> IFD [http-8080-1]: delete "_17.nrm"
> IFD [http-8080-1]: delete "_17.fdt"
> IFD [http-8080-1]: delete "_17.fnm"
> IFD [http-8080-1]: delete "_17.tii"
> IFD [http-8080-1]: delete "_17.prx"
> IFD [http-8080-1]: now checkpoint "segments_1j" [1 segments ; isCommit = 
> false]
> IW 0 [http-8080-1]: LMP: findMerges: 1 segments
> IW 0 [http-8080-1]: LMP:   level 6.2247195 to 6.400742: 1 segments
> IW 0 [http-8080-1]: CMS: now merge
> IW 0 [http-8080-1]: CMS:   index: _17:c2765
> IW 0 [http-8080-1]: CMS:   no more merges pending; now return
> IW 0 [http-8080-1]: CMS: now merge
> IW 0 [http-8080-1]: CMS:   index: _17:c2765
> IW 0 [http-8080-1]: CMS:   no more merges pending; now return
> IW 0 [http-8080-1]: now call final commit()
> IW 0 [http-8080-1]: startCommit(): start sizeInBytes=0
> IW 0 [http-8080-1]: startCommit index=_17:c2765 changeCount=5
> IW 0 [http-8080-1]: now sync _17.cfs
> IW 0 [http-8080-1]: done all syncs
> IW 0 [http-8080-1]: commit: pendingCommit != null
> IW 0 [http-8080-1]: commit: wrote segments file "segments_1k"
> IFD [http-8080-1]: now checkpoint "segments_1k" [1 segments ; isCommit = true]
> IFD [http-8080-1]: deleteCommits: now decRef commit "segments_1j"
> IFD [http-8080-1]: delete "_16.cfs"
> IFD [http-8080-1]: delete "segments_1j"
> IW 0 [http-8080-1]: commit: done
> IW 0 [http-8080-1]: at close: _17:c2765
> IFD [http-8080-1]: setInfoStream 
> deletionpolicy=org.apache.lucene.index.keeponlylastcommitdeletionpol...@fb1ba7
> IW 1 [http-8080-1]: setInfoStream: 
> dir=org.apache.lucene.store.simplefsdirect...@c:\Documents and 
> Settings\rvarlikli\workspace\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\wtpwebapps\Clipbank3.5\lucene
>  autoCommit=false 
> mergepolicy=org.apache.lucene.index.logbytesizemergepol...@1d49559 
> mergescheduler=org.apache.lucene.index.concurrentmergeschedu...@1990e2d 
> ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 
> maxFieldLength=10000 index=
> IW 1 [http-8080-1]: now flush at close
> IW 1 [http-8080-1]:   flush: segment=_18 docStoreSegment=_18 docStoreOffset=0 
> flushDocs=true flushDeletes=true flushDocStores=true numDocs=2765 
> numBufDelTerms=0
> IW 1 [http-8080-1]:   index before flush 
> IW 1 [http-8080-1]: DW: flush postings as segment _18 numDocs=2765
> IW 1 [http-8080-1]: DW: closeDocStore: 2 files to flush to segment _18 
> numDocs=2765
> IW 1 [http-8080-1]: DW:   oldRAMSize=9517056 newFlushedSize=4042307 
> docs/MB=717.242 new/old=42.474%
> IFD [http-8080-1]: now checkpoint "segments_1k" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: now checkpoint "segments_1k" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: delete "_18.nrm"
> IFD [http-8080-1]: delete "_18.frq"
> IFD [http-8080-1]: delete "_18.fdx"
> IFD [http-8080-1]: delete "_18.tii"
> IFD [http-8080-1]: delete "_18.fdt"
> IFD [http-8080-1]: delete "_18.prx"
> IFD [http-8080-1]: delete "_18.fnm"
> IFD [http-8080-1]: delete "_18.tis"
> IFD [http-8080-1]: now checkpoint "segments_1k" [1 segments ; isCommit = 
> false]
> IW 1 [http-8080-1]: LMP: findMerges: 1 segments
> IW 1 [http-8080-1]: LMP:   level 6.2247195 to 6.6112633: 1 segments
> IW 1 [http-8080-1]: CMS: now merge
> IW 1 [http-8080-1]: CMS:   index: _18:c2765
> IW 1 [http-8080-1]: CMS:   no more merges pending; now return
> IW 1 [http-8080-1]: CMS: now merge
> IW 1 [http-8080-1]: CMS:   index: _18:c2765
> IW 1 [http-8080-1]: CMS:   no more merges pending; now return
> IW 1 [http-8080-1]: now call final commit()
> IW 1 [http-8080-1]: startCommit(): start sizeInBytes=0
> IW 1 [http-8080-1]: startCommit index=_18:c2765 changeCount=5
> IW 1 [http-8080-1]: now sync _18.cfs
> IW 1 [http-8080-1]: done all syncs
> IW 1 [http-8080-1]: commit: pendingCommit != null
> IW 1 [http-8080-1]: commit: wrote segments file "segments_1l"
> IFD [http-8080-1]: now checkpoint "segments_1l" [1 segments ; isCommit = true]
> IFD [http-8080-1]: deleteCommits: now decRef commit "segments_1k"
> IFD [http-8080-1]: delete "segments_1k"
> IFD [http-8080-1]: delete "_17.cfs"
> IW 1 [http-8080-1]: commit: done
> IW 1 [http-8080-1]: at close: _18:c2765
> IFD [http-8080-1]: setInfoStream 
> deletionpolicy=org.apache.lucene.index.keeponlylastcommitdeletionpol...@7ceec1
> IW 2 [http-8080-1]: setInfoStream: 
> dir=org.apache.lucene.store.simplefsdirect...@c:\Documents and 
> Settings\rvarlikli\workspace\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\wtpwebapps\Clipbank3.5\lucene
>  autoCommit=false 
> mergepolicy=org.apache.lucene.index.logbytesizemergepol...@1edacc 
> mergescheduler=org.apache.lucene.index.concurrentmergeschedu...@1ae9ba8 
> ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 
> maxFieldLength=10000 index=
> IW 2 [http-8080-1]: now flush at close
> IW 2 [http-8080-1]:   flush: segment=_19 docStoreSegment=_19 docStoreOffset=0 
> flushDocs=true flushDeletes=true flushDocStores=true numDocs=2765 
> numBufDelTerms=0
> IW 2 [http-8080-1]:   index before flush 
> IW 2 [http-8080-1]: DW: flush postings as segment _19 numDocs=2765
> IW 2 [http-8080-1]: DW: closeDocStore: 2 files to flush to segment _19 
> numDocs=2765
> IW 2 [http-8080-1]: DW:   oldRAMSize=11188224 newFlushedSize=5229106 
> docs/MB=554.457 new/old=46.738%
> IFD [http-8080-1]: now checkpoint "segments_1l" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: now checkpoint "segments_1l" [1 segments ; isCommit = 
> false]
> IFD [http-8080-1]: delete "_19.tis"
> IFD [http-8080-1]: delete "_19.prx"
> IFD [http-8080-1]: delete "_19.nrm"
> IFD [http-8080-1]: delete "_19.fnm"
> IFD [http-8080-1]: delete "_19.fdx"
> IFD [http-8080-1]: delete "_19.fdt"
> IFD [http-8080-1]: delete "_19.tii"
> IFD [http-8080-1]: delete "_19.frq"
> IFD [http-8080-1]: now checkpoint "segments_1l" [1 segments ; isCommit = 
> false]
> IW 2 [http-8080-1]: LMP: findMerges: 1 segments
> IW 2 [http-8080-1]: LMP:   level 6.2247195 to 6.722014: 1 segments
> IW 2 [http-8080-1]: CMS: now merge
> IW 2 [http-8080-1]: CMS:   index: _19:c2765
> IW 2 [http-8080-1]: CMS:   no more merges pending; now return
> IW 2 [http-8080-1]: CMS: now merge
> IW 2 [http-8080-1]: CMS:   index: _19:c2765
> IW 2 [http-8080-1]: CMS:   no more merges pending; now return
> IW 2 [http-8080-1]: now call final commit()
> IW 2 [http-8080-1]: startCommit(): start sizeInBytes=0
> IW 2 [http-8080-1]: startCommit index=_19:c2765 changeCount=5
> IW 2 [http-8080-1]: now sync _19.cfs
> IW 2 [http-8080-1]: done all syncs
> IW 2 [http-8080-1]: commit: pendingCommit != null
> IW 2 [http-8080-1]: commit: wrote segments file "segments_1m"
> IFD [http-8080-1]: now checkpoint "segments_1m" [1 segments ; isCommit = true]
> IFD [http-8080-1]: deleteCommits: now decRef commit "segments_1l"
> IFD [http-8080-1]: delete "_18.cfs"
> IFD [http-8080-1]: delete "segments_1l"
> IW 2 [http-8080-1]: commit: done
> IW 2 [http-8080-1]: at close: _19:c2765


