Memory/cache aside, the fundamental Solr issue is that the Suggester build operation will read the entire index, even though very few docs have the relevant fields.
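For context, the suggester in question is built from a DocumentDictionary; the configuration is roughly like the sketch below ('title', 'popularity' and the suggester name are hypothetical stand-ins for the real schema). The build iterates every document in the index, whether or not it has the dictionary fields.

```xml
<!-- Hedged sketch of an AnalyzingInfix suggester definition; field names
     are hypothetical. DocumentDictionaryFactory reads stored fields for
     every doc at build time. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="weightField">popularity</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
  </lst>
</searchComponent>
```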
Is there a way to set a 'fq' on the Suggester build?

java.lang.Thread.State: RUNNABLE
    at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
    at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState.document(CompressingStoredFieldsReader.java:560)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.document(CompressingStoredFieldsReader.java:576)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:583)
    at org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
    at org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
    at org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
    at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:118)
    at org.apache.lucene.index.IndexReader.document(IndexReader.java:381)
    at org.apache.lucene.search.suggest.DocumentDictionary$DocumentInputIterator.next(DocumentDictionary.java:165)
    at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.build(AnalyzingInfixSuggester.java:300)
    - locked <0x00000004b8f29260> (a java.lang.Object)
    at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:190)
    at org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:178)
    at org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:179)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2299)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)

On 3 May 2017 at 12:47, Damien Kamerman <dami...@gmail.com> wrote:
> Thanks Shawn, I'll have to look closer into this.
>
> On 3 May 2017 at 12:10, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 5/2/2017 6:46 PM, Damien Kamerman wrote:
>> > Shalin, yes I think it's a case of the Suggester build hitting the
>> > index all at once. I'm thinking it's hitting all docs, even the ones
>> > without fields relevant to the suggester.
>> >
>> > Shawn, I am using ZFS, though I think it's comparable to other setups.
>> > mmap() should still be faster, while the ZFS ARC cache may prefer more
>> > memory than other OS disk caches.
>> >
>> > So, it sounds like I need enough memory/swap to hold the entire index.
>> > When will the memory be released? On a commit?
>> > https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/store/MMapDirectory.html
>> > talks about a bug on the close().
>>
>> What I'm going to describe below is how things *normally* work on most
>> operating systems (think Linux or Windows) with most filesystems. If
>> ZFS is different, and it sounds like it might be, then that's something
>> for you to discuss with Oracle.
>>
>> Normally, MMap doesn't *allocate* any memory -- so there's nothing to
>> release later. It asks the operating system to map the file's contents
>> to a section of virtual memory, and then the program accesses that
>> memory block directly.
>>
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>>
>> A typical OS takes care of translating accesses to MMap virtual memory
>> into disk accesses, and uses available system memory to cache the data
>> that's read so a subsequent access of the same data is super fast.
>>
>> On most operating systems, memory in the disk cache is always available
>> to programs that request it for an allocation.
>>
>> ZFS uses a completely separate piece of memory for caching -- the ARC
>> cache.
>> I do not know if the OS is able to release memory from that
>> cache when a program requests it. My experience with ZFS on Linux (not
>> with Solr) suggests that the ARC cache holds onto memory a lot tighter
>> than the standard OS disk cache. ZFS on Solaris might be a different
>> animal, though.
>>
>> I'm finding conflicting information regarding MMap problems on ZFS.
>> Some sources say that memory usage is doubled (data in both the standard
>> page cache and the ARC cache), some say that this is not a general
>> problem. This is probably a question for Oracle to answer.
>>
>> You don't want to count swap space when looking at how much memory you
>> have. Swap performance is REALLY bad.
>>
>> Thanks,
>> Shawn
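The mmap behaviour Shawn describes can be seen with nothing but the JDK. A minimal sketch (the file stands in for an index segment; class and method names are just placeholders): mapping reserves virtual address space without allocating heap for the file's contents, and the OS faults pages in, and caches them, only when the buffer is touched.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch of what an mmap-based directory asks the OS to do: map a file
 *  into virtual memory and read it through the mapping. */
public class MmapSketch {

    /** Maps the file read-only and returns its first and last bytes. */
    static int[] firstAndLast(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // map() only reserves virtual address space; no heap is allocated
            // for the file's contents, and pages are faulted in (and kept in
            // the OS cache) the first time the buffer is touched.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return new int[] { buf.get(0), buf.get((int) ch.size() - 1) };
        }
    }

    public static void main(String[] args) throws IOException {
        // A tiny stand-in for an index segment file.
        Path file = Files.createTempFile("segment", ".dat");
        Files.write(file, new byte[] {1, 2, 3, 4});
        int[] ends = firstAndLast(file);
        System.out.println(ends[0] + " " + ends[1]);  // prints "1 4"
        Files.deleteIfExists(file);
    }
}
```

A second read of the same buffer is served from the OS page cache, which is the "super fast subsequent access" in Shawn's description; how the ZFS ARC interacts with that cache is exactly the open question in the thread.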