Just an idea: how about taking a heap dump with jmap and using the Eclipse Memory Analyzer (MAT) to see what is going on?
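If you would rather trigger the dump from inside the JVM than shell out to jmap, something like the following should work; a minimal sketch using the JDK's HotSpotDiagnosticMXBean (the output path is just an example):

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class HeapDumper {
        public static void main(String[] args) throws Exception {
            // Proxy for the HotSpot diagnostic MBean of the current JVM;
            // equivalent to `jmap -dump:live,format=b,file=... <pid>`.
            HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            // live=true keeps only reachable objects, which makes the
            // resulting .hprof file smaller and easier to open in MAT.
            diag.dumpHeap("/tmp/solr-heap.hprof", true);
        }
    }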
Regards,
Bernd

On 24.08.2017 at 11:49, Markus Jelsma wrote:
> Hello Shalin,
>
> Yes, the main search index has DocValues on just a few fields; they are
> used for faceting and function queries. We started using DocValues when
> 6.0 was released. Most fields are content fields for many languages. I
> don't think it is going to be DocValues, because the maximum shared memory
> consumption is reduced by searching on fields of fewer languages, and by
> disabling highlighting, both of which do not use DocValues.
>
> But I tried the option regardless, also because I didn't know about it.
> It appears the option does exactly nothing. The first line is without any
> preload configuration, the second is with preload=true, the third with
> preload=false:
>
> 14220 markus 20 0 14,675g 1,508g 62800 S 1,0 9,6 0:36.98 java
> 14803 markus 20 0 14,674g 1,537g 63248 S 0,0 9,8 0:34.50 java
> 15324 markus 20 0 14,674g 1,409g 63152 S 0,0 9,0 0:35.50 java
>
> Please correct my config if I am wrong:
>
> <directoryFactory name="DirectoryFactory"
>   class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
>   <bool name="preload">false</bool>
> </directoryFactory>
>
> NRTCachingDirectoryFactory implies MMapDirectory, right?
>
> Thanks,
> Markus
>
> -----Original message-----
>> From: Shalin Shekhar Mangar <shalinman...@gmail.com>
>> Sent: Thursday 24th August 2017 5:51
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr uses lots of shared memory!
>>
>> Very interesting. Do you have many DocValue fields? Have you always
>> had them, i.e. did you see this problem before you turned on DocValues?
>> The DocValue fields are in a separate file and they will be memory
>> mapped on demand. One thing you can experiment with is the
>> preload=true option on the MMapDirectoryFactory, which will mmap all
>> index files on startup [1]. Once you do this, and if you still notice
>> shared memory leakage, then it may be a genuine memory leak that we
>> should investigate.
>>
>> [1] - http://lucene.apache.org/solr/guide/6_6/datadir-and-directoryfactory-in-solrconfig.html#DataDirandDirectoryFactoryinSolrConfig-SpecifyingtheDirectoryFactoryForYourIndex
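For reference, a minimal sketch of what that option maps to at the Lucene level, assuming Lucene 6.x (the index path and cache sizes below are illustrative). Note that Shalin's link describes preload on MMapDirectoryFactory specifically; whether NRTCachingDirectoryFactory forwards the flag to the MMapDirectory it wraps in 6.6 is worth verifying, and if it does not, that could explain why the option appeared to do nothing:

    import java.nio.file.Paths;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.MMapDirectory;
    import org.apache.lucene.store.NRTCachingDirectory;

    public class PreloadSketch {
        public static void main(String[] args) throws Exception {
            MMapDirectory mmap =
                    new MMapDirectory(Paths.get("/var/solr/data/core/index"));
            // With preload enabled, MMapDirectory touches every page of a
            // file when it is opened, so the whole index becomes resident
            // (and shows up as SHR in top) up front, instead of being
            // faulted in lazily as queries hit it.
            mmap.setPreload(true);
            // NRTCachingDirectory only holds small, freshly flushed
            // segments in RAM and delegates everything else to the
            // wrapped MMapDirectory.
            Directory dir = new NRTCachingDirectory(mmap, 4.0, 48.0);
            // ... open IndexWriter / IndexReader on `dir` as usual ...
            dir.close();
        }
    }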
>> On Wed, Aug 23, 2017 at 7:02 PM, Markus Jelsma
>> <markus.jel...@openindex.io> wrote:
>>> After watching top following a restart of some Solr instances, I do not
>>> think it is a reporting problem: shared memory dropped back to 'normal',
>>> around 350 MB, which I think is high too, but anyway.
>>>
>>> Two hours later, the restarted nodes have slowly increased their shared
>>> memory consumption to about 1500 MB. I don't understand why shared
>>> memory usage should/would increase slowly over time; it makes little
>>> sense to me, and I cannot remember Solr doing this in the past ten years.
>>>
>>> But it seems to correlate with index size on disk. These main text
>>> search nodes have an index of around 16 GB and up to 3 GB of shared
>>> memory after a few days. The logs nodes have up to 800 MB of index and
>>> 320 MB of shared memory. The low-latency nodes have four different cores
>>> totalling just over 100 MB of index, and their shared memory consumption
>>> is just 22 MB, which seems more reasonable for shared memory.
>>>
>>> I can also force Solr to 'leak' shared memory just by sending queries
>>> to it. My freshly restarted local node used 68 MB of shared memory at
>>> startup. Two minutes and 25,000 queries later it was already at 2748 MB!
>>> At first there is a very sharp increase to 2000 MB; then it takes almost
>>> two minutes more to increase to 2748 MB. I can decrease the maximum
>>> shared memory usage to 1200 MB if I query (via edismax) only on fields
>>> of one language instead of 25 or so. I can decrease it even further if I
>>> disable highlighting (huh?) but still query on all fields.
>>>
>>> * We have tried patching Java's ByteBuffer [1] because it seemed to fit
>>> the problem; it does not fix it.
>>> * We have also removed all our custom plugins, so it has become a
>>> vanilla Solr 6.6 with just our stripped-down schema and solrconfig; that
>>> does not fix it either.
>>>
>>> Why does it slowly increase over time?
>>> Why does it appear to correlate with index size?
>>> Is anyone else seeing this on their 6.6 cloud production or local
>>> machines?
>>>
>>> Thanks,
>>> Markus
>>>
>>> [1]: http://www.evanjones.ca/java-bytebuffer-leak.html
>>>
>>> -----Original message-----
>>>> From: Shawn Heisey <apa...@elyograg.org>
>>>> Sent: Tuesday 22nd August 2017 17:32
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Solr uses lots of shared memory!
>>>>
>>>> On 8/22/2017 7:24 AM, Markus Jelsma wrote:
>>>>> I have never seen this before: one of our collections, all nodes
>>>>> eating tons of shared memory!
>>>>>
>>>>> Here's one of the nodes:
>>>>> 10497 solr 20 0 19.439g 4.505g 3.139g S 1.0 57.8 2511:46 java
>>>>>
>>>>> RSS is roughly equal to heap size + usual off-heap space + shared
>>>>> memory. Virtual is equal to RSS plus index size on disk. For two other
>>>>> collections, the nodes use shared memory as expected, in the MB range.
>>>>>
>>>>> How can Solr, this collection, use so much shared memory? Why?
>>>>
>>>> I've seen this on my own servers at work; when I add up a subset of
>>>> the memory numbers I can see from the system, it ends up being more
>>>> memory than I even have in the server.
>>>>
>>>> I suspect there is something odd going on in how Java reports memory
>>>> usage to the OS, or maybe a glitch in how Linux interprets Java's
>>>> memory usage. At some point in the past, numbers were reported
>>>> correctly. I do not know whether the change came about because of a
>>>> Solr upgrade, a Java upgrade, or an OS kernel upgrade. All three were
>>>> upgraded between when I know the numbers looked right and when I
>>>> noticed they were wrong.
>>>>
>>>> https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0
>>>>
>>>> This screenshot shows that Solr is using 17GB of memory, 41.45GB of
>>>> memory is being used by the OS disk cache, and 10.23GB of memory is
>>>> free. Add those up and it comes to 68.68GB ... but the machine only
>>>> has 64GB of memory, and that total doesn't include the memory usage of
>>>> the other processes seen in the screenshot. This impossible situation
>>>> means that something is being misreported somewhere. If I deduct the
>>>> 11GB of SHR from the RES value, then all the numbers work.
>>>>
>>>> The screenshot was taken almost 3 years ago, so I do not know what
>>>> machine it came from, and therefore I can't be sure what the actual
>>>> heap size was. I think it was about 6GB -- the difference between RES
>>>> and SHR. I have used a 6GB heap on some of my production servers in
>>>> the past. The server where I got this screenshot was not having any
>>>> noticeable performance or memory problems, so I think that I can trust
>>>> that the main numbers above the process list (which only come from the
>>>> OS) are correct.
>>>>
>>>> Thanks,
>>>> Shawn
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
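For anyone wanting to check where the SHR number actually comes from: on reasonably recent Linux kernels (4.5 and later), /proc/<pid>/status splits RSS into anonymous, file-backed, and shmem parts, and top's SHR roughly corresponds to RssFile + RssShmem, i.e. file-backed pages (such as mmapped index segments) that are also counted inside RES, which is why naively adding top's columns can exceed physical RAM. A minimal sketch to print the breakdown for a Solr PID, assuming Linux and a kernel that exposes these fields:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class RssBreakdown {
        public static void main(String[] args) throws IOException {
            // Pass the Solr PID as the first argument; defaults to "self".
            String pid = args.length > 0 ? args[0] : "self";
            for (String line : Files.readAllLines(
                    Paths.get("/proc/" + pid + "/status"))) {
                // VmRSS = RssAnon + RssFile + RssShmem; the last two are
                // what top reports as SHR.
                if (line.startsWith("VmRSS") || line.startsWith("RssAnon")
                        || line.startsWith("RssFile")
                        || line.startsWith("RssShmem")) {
                    System.out.println(line);
                }
            }
        }
    }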