On 12/8/2014 2:42 AM, Manohar Sripada wrote:
> Can you please re-direct me to any wiki which describes (in detail) the
> differences between MMapDirectoryFactory and NRTCachingDirectoryFactory? I
> found this blog
> <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html> very
> helpful which describes about MMapDirectory. I want to know in detail about
> NRTCachingFactory as well.
> 
> Also, when I ran this rest request solr/admin/cores?action=STATUS, I got
> the below result (pasted partial result only). I have set the
> DirectoryFactory as NRTCachingDirectory in solrconfig.xml. But, it also
> shows MMapDirectory in the below element. Does this means
> NRTCachingDirectory is using MMapDirectory internally??
> 
> <str name="directory">
> org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/instance/solr/collection1_shard2_replica1/data/index
> lockFactory=NativeFSLockFactory@/instance/solr/collection1_shard2_replica1/data/index;
> maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
> 
> What does maxCacheMB and maxMergeSizeMB indicate? How to control it?

NRTCachingDirectoryFactory creates instances of NRTCachingDirectory.
This is a wrapper on top of another Directory implementation.
Normally it wraps MMapDirectory, so you get all the MMap advantages.
The javadoc for NRTCachingDirectory says that it "Wraps a RAMDirectory
around any provided delegate directory, to be used during NRT search."

http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/store/NRTCachingDirectory.html

Further down in that javadoc, the constructor documentation has this to
say: "We will cache a newly created output if 1) it's a flush or a merge
and the estimated size of the merged segment is <= maxMergeSizeMB, and
2) the total cached bytes is <= maxCachedMB"

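Basically, if a newly flushed or merged segment is small enough, it
won't be written to disk right away.  It is held in RAM instead, as
long as the cache has room.  As I understand it, cached files are
written to disk when they are synced at commit time, and anything that
doesn't fit in the cache goes straight to disk.  The maxMergeSizeMB=4.0
and maxCacheMB=48.0 values in your STATUS output are these two limits.
This makes Near Real Time search cheaper, because small new segments
can be searched without first being written to disk.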
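
To change the two limits, the factory can be given parameters in
solrconfig.xml.  The fragment below is only a sketch: it assumes the
factory reads parameters named maxMergeSizeMB and maxCachedMB (the names
used in the javadoc quoted above) and that they are given as doubles.
Check the NRTCachingDirectoryFactory source for your Solr version before
relying on it.  The values shown are the same 4 and 48 that appear in
your STATUS output.

```xml
<!-- Sketch only: parameter names and types are assumptions, see above. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.NRTCachingDirectoryFactory">
  <!-- Largest flushed or merged segment (in MB) that may be cached in RAM -->
  <double name="maxMergeSizeMB">4</double>
  <!-- Upper bound (in MB) on the total size of all cached files -->
  <double name="maxCachedMB">48</double>
</directoryFactory>
```

After reloading the core, the STATUS output should reflect the new
values if the parameters were picked up.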

This DirectoryFactory implementation is the default in 4.x, so as I
understand it, it's critically important for Solr to have a replayable
transaction log.  Without it, any data that is cached in RAM will be
lost if the program crashes or exits.  The main Solr example *does* have
the transaction log enabled.
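
For reference, the transaction log is turned on by an updateLog element
inside the updateHandler section of solrconfig.xml.  The fragment below
is a minimal sketch modelled on the example config.  The solr.ulog.dir
property is the name the example uses for the log location, and your
own config may differ.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Enables the transaction log; an empty dir means the default
       location under the core's data directory. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```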

Thanks,
Shawn
