On 5/2/2017 6:46 PM, Damien Kamerman wrote:
> Shalin, yes I think it's a case of the Suggester build hitting the index
> all at once. I'm thinking it's hitting all docs, even the ones without
> fields relevant to the suggester.
>
> Shawn, I am using ZFS, though I think it's comparable to other setups.
> mmap() should still be faster, while the ZFS ARC cache may prefer more
> memory than other OS disk caches.
>
> So, it sounds like I need enough memory/swap to hold the entire index. When will
> the memory be released? On a commit?
> https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/store/MMapDirectory.html
> talks about a bug on the close().

What I'm going to describe below is how things *normally* work on most
operating systems (think Linux or Windows) with most filesystems. If
ZFS is different, and it sounds like it might be, then that's something
for you to discuss with Oracle.

Normally, MMap doesn't *allocate* any memory -- so there's nothing to
release later. It asks the operating system to map the file's contents
to a section of virtual memory, and then the program accesses that
memory block directly.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

A typical OS takes care of translating accesses to MMap virtual memory
into disk accesses, and uses available system memory to cache the data
that's read so a subsequent access of the same data is super fast.

On most operating systems, memory in the disk cache is always available
to programs that request it for an allocation.

ZFS uses a completely separate piece of memory for caching -- the ARC
cache. I do not know if the OS is able to release memory from that
cache when a program requests it. My experience with ZFS on Linux (not
with Solr) suggests that the ARC cache holds onto memory a lot tighter
than the standard OS disk cache. ZFS on Solaris might be a different
animal, though.

I'm finding conflicting information regarding MMap problems on ZFS.
Some sources say that memory usage is doubled (data in both the
standard page cache and the ARC cache), some say that this is not a
general problem. This is probably a question for Oracle to answer.

You don't want to count swap space when looking at how much memory you
have. Swap performance is REALLY bad.

Thanks,
Shawn
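
P.S. To make the "MMap doesn't allocate" point a bit more concrete,
here is a minimal sketch in Python. This is not Lucene's code, just an
illustration of the same OS facility. The function name
read_slice_via_mmap and the temporary demo file are made up for the
example, and the behavior described in the comments assumes a typical
OS page cache rather than ZFS.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Map a file read-only and return `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        # A length of 0 maps the whole file. This reserves virtual address
        # space only; the file's contents are not copied into the process.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing touches only the pages backing this byte range. The OS
            # reads them from disk (or from its cache) on demand.
            return mapped[offset:offset + length]


# Hypothetical stand-in for an index file: 4 KB of filler, a marker, more filler.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"A" * 4096 + b"hello" + b"B" * 4096)

result = read_slice_via_mmap(demo_path, 4096, 5)
os.remove(demo_path)
```

The only memory the program itself owns here is the five-byte slice it
copied out. Everything else is cached (or not) at the discretion of the
OS.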