On 12/8/2014 2:42 AM, Manohar Sripada wrote:
> Can you please re-direct me to any wiki which describes (in detail) the
> differences between MMapDirectoryFactory and NRTCachingDirectoryFactory? I
> found this blog
> <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html> very
> helpful which describes about MMapDirectory. I want to know in detail about
> NRTCachingFactory as well.
>
> Also, when I ran this rest request solr/admin/cores?action=STATUS, I got
> the below result (pasted partial result only). I have set the
> DirectoryFactory as NRTCachingDirectory in solrconfig.xml. But, it also
> shows MMapDirectory in the below element. Does this means
> NRTCachingDirectory is using MMapDirectory internally??
>
> <str name="directory">
> org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/instance/solr/collection1_shard2_replica1/data/index
> lockFactory=NativeFSLockFactory@/instance/solr/collection1_shard2_replica1/data/index;
> maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
>
> What does maxCacheMB and maxMergeSizeMB indicate? How to control it?

NRTCachingDirectoryFactory creates instances of NRTCachingDirectory. This
is a wrapper on top of another Directory implementation. Normally it
wraps MMapDirectory, so you get all the MMap advantages.

The javadoc for NRTCachingDirectory says that it "Wraps a RAMDirectory
around any provided delegate directory, to be used during NRT search."

http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/store/NRTCachingDirectory.html

Further down in that javadoc, the constructor documentation has this to
say:

"We will cache a newly created output if 1) it's a flush or a merge and
the estimated size of the merged segment is <= maxMergeSizeMB, and 2)
the total cached bytes is <= maxCachedMB"

Basically, if a newly created or merged segment is small enough, it
won't be written to disk right away. It is held in RAM until another
cacheable segment won't fit in the available cache, at which point the
oldest cached segment must be flushed to disk. This makes Near Real Time
search easier.

This DirectoryFactory implementation is the default in 4.x, so as I
understand it, it's critically important for Solr to have a replayable
transaction log ... without it, any data that is cached in RAM will be
lost if the program crashes or exits. The main Solr example *does* have
the transaction log enabled.

Thanks,
Shawn
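
P.S. On the "how to control it" part of the question: as far as I know, the two limits can be set on the directoryFactory element in solrconfig.xml. The fragment below is only a sketch. It assumes the factory accepts init parameters named maxCachedMB and maxMergeSizeMB (the names used in the NRTCachingDirectory constructor javadoc), and the values are simply the 48 and 4 reported in the STATUS output quoted above. Please check it against the NRTCachingDirectoryFactory source for your Solr version before relying on it.

```xml
<!-- Sketch only: parameter names are assumed to match the
     NRTCachingDirectory constructor arguments; verify for your version. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.NRTCachingDirectoryFactory">
  <!-- Upper bound, in MB, on the total size of segment files held in RAM. -->
  <double name="maxCachedMB">48.0</double>
  <!-- Largest flushed or merged segment, in MB, that is eligible for caching. -->
  <double name="maxMergeSizeMB">4.0</double>
</directoryFactory>
```

After reloading the core, the directory string returned by solr/admin/cores?action=STATUS should show whatever values actually took effect, which is a quick way to confirm whether the parameters were picked up.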