[ https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846956#action_12846956 ]

Earwin Burrfoot commented on LUCENE-2328:
-----------------------------------------

> Keeping track of not-yet-sync'd files instead of sync'd files is better, but 
> it still requires upkeep (ie when file is deleted you have to remove it) 
> because files can be opened, written to, closed, deleted without ever being 
> sync'd.
You can just skip the upkeep and handle FileNotFoundException when syncing. You
have to handle it anyway: there is no guarantee that some file won't be
snatched from under your nose.
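
A minimal sketch of that idea, not the actual Lucene code: the class name
TolerantSync and the method syncAll are made up for illustration, and the
sketch uses java.nio, where a missing file surfaces as NoSuchFileException
(the NIO counterpart of FileNotFoundException). Files that vanished before the
sync are skipped instead of being tracked on deletion.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Lucene.
public class TolerantSync {

    /** Fsyncs every named file under dir and returns the names that no longer exist. */
    public static List<String> syncAll(Path dir, Collection<String> names) throws IOException {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            // WRITE without CREATE: opening a deleted file fails instead of recreating it.
            try (FileChannel ch = FileChannel.open(dir.resolve(name), StandardOpenOption.WRITE)) {
                ch.force(true); // flush file content and metadata to the storage device
            } catch (NoSuchFileException | FileNotFoundException e) {
                // The file was deleted before it was ever synced; nothing left to do.
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-demo");
        Files.write(dir.resolve("_1.cfs"), new byte[] {1, 2, 3});
        // "_0.del" was never created, which stands in for a file deleted before sync.
        System.out.println("skipped: " + syncAll(dir, Arrays.asList("_1.cfs", "_0.del")));
    }
}
```

Any other IOException still propagates, so a real sync failure is not silently
swallowed.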

> This will over-sync in some situations.
I don't feel this is a serious problem. If you over-sync (in fact, sync some
files a little earlier than strictly required), then a few seconds later you
will under-sync, so the total time stays the same.

But I feel you're somewhat missing the point. System-wide sync is not the
original aim; it's just a possible byproduct of the original aim, which is to
move the sync-tracking code from IW to Directory. And I don't see at all how
adding batch syncs achieves this.
If you're calling sync(Collection<String>), damn, you have to keep that
collection somewhere :) and it is supposed to live inside the Directory!
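
To make the intent concrete, here is a rough sketch of a directory that keeps
the not-yet-synced set to itself. Everything in it (the TrackingDirectory
class, writeFile, deleteFile, pendingCount, the no-argument sync) is
hypothetical and only loosely modelled on Lucene's Directory; it is not the
real API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration; not Lucene's Directory.
public class TrackingDirectory {
    private final Path dir;
    // Names written since the last successful sync. The caller never sees this set.
    private final Set<String> unsynced = new HashSet<>();

    public TrackingDirectory(Path dir) {
        this.dir = dir;
    }

    public synchronized void writeFile(String name, byte[] data) throws IOException {
        Files.write(dir.resolve(name), data);
        unsynced.add(name);
    }

    public synchronized void deleteFile(String name) throws IOException {
        // Deliberately no bookkeeping here: sync() copes with missing files.
        Files.deleteIfExists(dir.resolve(name));
    }

    /** Syncs everything written since the last sync, then forgets those names. */
    public synchronized void sync() throws IOException {
        for (String name : unsynced) {
            try (FileChannel ch = FileChannel.open(dir.resolve(name), StandardOpenOption.WRITE)) {
                ch.force(true);
            } catch (NoSuchFileException e) {
                // Deleted before it was synced; skip it.
            }
        }
        // Reached only if no other IOException was thrown, so a failed sync can be retried.
        unsynced.clear();
    }

    public synchronized int pendingCount() {
        return unsynced.size();
    }

    public static void main(String[] args) throws IOException {
        TrackingDirectory d = new TrackingDirectory(Files.createTempDirectory("tracking-demo"));
        d.writeFile("_1.cfs", new byte[] {1});
        d.writeFile("_1_1.del", new byte[] {2});
        d.deleteFile("_1_1.del");
        d.sync();
        System.out.println("pending after sync: " + d.pendingCount());
    }
}
```

Because the set is cleared on every successful sync, it is bounded by the
number of files written between two syncs, and the writer has nothing left to
accumulate.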

> IndexWriter.synced field accumulates data leading to a Memory Leak
> ------------------------------------------------------------------
>
>                 Key: LUCENE-2328
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2328
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9.1, 2.9.2, 3.0, 3.0.1
>         Environment: all
>            Reporter: Gregor Kaczor
>            Priority: Minor
>             Fix For: 3.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I am running into a strange OutOfMemoryError. My small test application
> indexes and deletes a few files. This is repeated 60k times. Optimization
> is run every 2k times a file is indexed. The index size is 50KB. I analyzed
> the heap dump file and realized that the IndexWriter.synced field occupied
> more than half of the heap. That field is a private HashSet without a getter.
> Its task is to hold files which have already been synced.
> There are two calls to addAll and one call to add on synced, but no remove or
> clear throughout the lifecycle of the IndexWriter instance.
> According to the Eclipse Memory Analyzer, synced contains 32618 entries which
> look like file names such as "_e065_1.del" or "_e067.cfs".
> The index directory contains only 10 files.
> I guess synced is holding obsolete data.
