[ https://issues.apache.org/jira/browse/LUCENE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101567#comment-13101567 ]
Michael McCandless commented on LUCENE-3425:
--------------------------------------------

Also, a quick win on trunk is to use IOCtx's FlushInfo.estimatedSegmentSize to decide up front whether to try caching or not. Ie if the to-be-flushed segment is too large we should not cache it.

> NRT Caching Dir to allow for exact memory usage, better buffer allocation and
> "global" cross indices control
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3425
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shay Banon
>
> A discussion on IRC raised several improvements that can be made to NRT
> caching dir. Some of the problems it currently has are:
> 1. Not explicitly controlling the memory usage, which can result in overusing
> memory (for example, large new segments being committed because refreshing is
> too far behind).
> 2. Heap fragmentation because of constant allocation of (probably promoted to
> old gen) byte buffers.
> 3. Not being able to control the memory usage across indices for multi index
> usage within a single JVM.
> A suggested solution (which still needs to be ironed out) is to have a
> BufferAllocator that controls allocation of byte[], and allow to return
> unused byte[] to it. It will have a cap on the size of memory it allows to be
> allocated.
> The NRT caching dir will use the allocator, which can either be provided (for
> usage across several indices) or created internally. The caching dir will
> also create a wrapped IndexOutput, that will flush to the main dir if the
> allocator can no longer provide byte[] (exhausted).
> When a file is "flushed" from the cache to the main directory, it will return
> all the currently allocated byte[] to the BufferAllocator to be reused by
> other "files".
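
For illustration only, here is a minimal sketch of the capped BufferAllocator idea from the issue description. Nothing here is an existing Lucene API: the class name CappedBufferAllocator, the fixed buffer size, the null-on-exhaustion convention, and the method names are all assumptions made for the example. It only shows the allocate/return cycle and the memory cap, not the wrapped IndexOutput.

```java
import java.util.ArrayDeque;

/**
 * Hypothetical capped pool of fixed-size byte[] buffers (not a Lucene class).
 * One instance could be shared by several caching directories to get a
 * "global" limit, or created privately per directory.
 */
final class CappedBufferAllocator {
    private final int bufferSize;   // size of every byte[] handed out
    private final long maxBytes;    // hard cap on bytes handed out at once
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private long outstandingBytes;  // bytes currently held by callers

    CappedBufferAllocator(int bufferSize, long maxBytes) {
        if (bufferSize <= 0 || maxBytes < 0) {
            throw new IllegalArgumentException("bufferSize must be > 0 and maxBytes >= 0");
        }
        this.bufferSize = bufferSize;
        this.maxBytes = maxBytes;
    }

    /**
     * Returns a buffer, or null when the cap would be exceeded. In the design
     * described above, null is the signal for the caller to flush the file to
     * the main directory. Recycled buffers are not zeroed.
     */
    synchronized byte[] allocate() {
        if (outstandingBytes + bufferSize > maxBytes) {
            return null; // exhausted
        }
        outstandingBytes += bufferSize;
        byte[] recycled = free.poll();
        return recycled != null ? recycled : new byte[bufferSize];
    }

    /** Returns a buffer previously obtained from allocate() so it can be reused. */
    synchronized void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            throw new IllegalArgumentException("buffer was not allocated by this allocator");
        }
        outstandingBytes -= bufferSize;
        free.push(buffer);
    }

    /** Bytes currently handed out and not yet released. */
    synchronized long outstandingBytes() {
        return outstandingBytes;
    }
}
```

Because a new byte[] is only created when the free list is empty and the cap still has room, the pooled and handed-out buffers together never exceed maxBytes in this sketch. Reusing the same arrays is what is meant to address the fragmentation concern in point 2.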