Has anyone else besides Shawn and me been able to reproduce this problem? Shawn
contacted Oracle off-list but that was useless at best (attach JConsole, watch
the heap, etc.).

Is this a real problem, or just a bad reporting issue between the JVM and Linux?
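
In case it helps anyone decide between the two, here is roughly what I have been
looking at to see whether the 'shared' memory is real data or just the mmapped
index files (the PID is an example taken from the top output below; the Rss*
breakdown needs a reasonably recent kernel):

  grep -E 'VmRSS|RssAnon|RssFile|RssShmem' /proc/18901/status

As far as I understand it, RssAnon covers the heap and other real allocations,
while RssFile is the resident pages of memory-mapped files, which top counts in
both RES and SHR.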

Thanks,
Markus

 
 
-----Original message-----
> From:Markus Jelsma <markus.jel...@openindex.io>
> Sent: Thursday 24th August 2017 17:20
> To: solr-user@lucene.apache.org
> Subject: RE: Solr uses lots of shared memory!
> 
> Hello Bernd,
> 
> According to the man page, I should get a list of everything in shared
> memory if I invoke it with just a PID. That shows a list of libraries that
> together account for about 25 MB of shared memory usage. According to ps
> and top, the JVM uses 2800 MB of shared memory (not virtual), which leaves
> 2775 MB unaccounted for. Any ideas? Can anyone else reproduce it on a
> freshly restarted node?
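> 
> To see which mappings actually hold those resident pages (plain pmap only
> lists mapping sizes, not what is resident), something like this should work;
> the PID is just the example from the top output below:
> 
>   pmap -x 18901 | sort -n -k3 | tail -n 25
> 
> With -x, pmap adds an RSS column per mapping, so the Lucene index files and
> the anonymous heap mappings should show up near the bottom if they account
> for the missing 2775 MB.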
> 
> Thanks,
> Markus
> 
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 18901 markus    20   0 14,778g 4,965g 2,987g S 891,1 31,7  20:21.63 java
> 
> 0x000055b9a17f1000      6K      /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
> 0x00007fdf1d314000      182K    /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libsunec.so
> 0x00007fdf1e548000      38K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libmanagement.so
> 0x00007fdf1e78e000      94K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnet.so
> 0x00007fdf1e9a6000      75K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnio.so
> 0x00007fdf5cd6e000      34K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libzip.so
> 0x00007fdf5cf77000      46K     /lib/x86_64-linux-gnu/libnss_files-2.24.so
> 0x00007fdf5d189000      46K     /lib/x86_64-linux-gnu/libnss_nis-2.24.so
> 0x00007fdf5d395000      90K     /lib/x86_64-linux-gnu/libnsl-2.24.so
> 0x00007fdf5d5ae000      34K     /lib/x86_64-linux-gnu/libnss_compat-2.24.so
> 0x00007fdf5d7b7000      187K    /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libjava.so
> 0x00007fdf5d9e6000      70K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libverify.so
> 0x00007fdf5dbf8000      30K     /lib/x86_64-linux-gnu/librt-2.24.so
> 0x00007fdf5de00000      90K     /lib/x86_64-linux-gnu/libgcc_s.so.1
> 0x00007fdf5e017000      1063K   /lib/x86_64-linux-gnu/libm-2.24.so
> 0x00007fdf5e320000      1553K   /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
> 0x00007fdf5e6a8000      15936K  /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
> 0x00007fdf5f5ed000      139K    /lib/x86_64-linux-gnu/libpthread-2.24.so
> 0x00007fdf5f80b000      14K     /lib/x86_64-linux-gnu/libdl-2.24.so
> 0x00007fdf5fa0f000      110K    /lib/x86_64-linux-gnu/libz.so.1.2.11
> 0x00007fdf5fc2b000      1813K   /lib/x86_64-linux-gnu/libc-2.24.so
> 0x00007fdf5fff2000      58K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/jli/libjli.so
> 0x00007fdf60201000      158K    /lib/x86_64-linux-gnu/ld-2.24.so
> 
> -----Original message-----
> > From:Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
> > Sent: Thursday 24th August 2017 15:39
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr uses lots of shared memory!
> > 
> > Just an idea, how about taking a dump with jmap and using
> > MemoryAnalyzerTool to see what is going on?
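> > 
> > Something along these lines, I suppose (the PID and dump path are just
> > examples):
> > 
> >   jmap -dump:live,format=b,file=/tmp/solr-heap.hprof <solr-pid>
> > 
> > and then open the .hprof file in MAT.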
> > 
> > Regards
> > Bernd
> > 
> > 
> > Am 24.08.2017 um 11:49 schrieb Markus Jelsma:
> > > Hello Shalin,
> > > 
> > > Yes, the main search index has DocValues on just a few fields; they are
> > > used for faceting and function queries, and we started using DocValues
> > > when 6.0 was released. Most fields are content fields for many languages.
> > > I don't think it is going to be DocValues, because the maximum shared
> > > memory consumption is reduced by searching on fields for fewer languages,
> > > and by disabling highlighting, neither of which uses DocValues.
> > > 
> > > But I tried the option regardless, also because I didn't know about it
> > > before. It appears the option does exactly nothing. The first line below
> > > is without any configuration for preload, the second is with
> > > preload=true, the third is preload=false:
> > > 
> > > 14220 markus    20   0 14,675g 1,508g  62800 S   1,0  9,6   0:36.98 java
> > > 14803 markus    20   0 14,674g 1,537g  63248 S   0,0  9,8   0:34.50 java
> > > 15324 markus    20   0 14,674g 1,409g  63152 S   0,0  9,0   0:35.50 java
> > > 
> > > Please correct my config if I am wrong:
> > > 
> > >   <directoryFactory name="DirectoryFactory" 
> > > class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
> > >      <bool name="preload">false</bool>  
> > >   </directoryFactory>
> > > 
> > > NRTCachingDirectoryFactory implies MMapDirectory, right?
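> > > 
> > > One way I can think of to check whether the index files are really
> > > mmapped is to look for them in the process's mappings (the PID and path
> > > pattern are just examples for this box):
> > > 
> > >   grep 'data/index' /proc/14220/maps | head
> > > 
> > > If the .fdt/.dvd/.tim files show up there, they are memory-mapped, and as
> > > far as I understand it the resident pages of those file-backed mappings
> > > are what top reports as SHR.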
> > > 
> > > Thanks,
> > > Markus
> > >  
> > > -----Original message-----
> > >> From:Shalin Shekhar Mangar <shalinman...@gmail.com>
> > >> Sent: Thursday 24th August 2017 5:51
> > >> To: solr-user@lucene.apache.org
> > >> Subject: Re: Solr uses lots of shared memory!
> > >>
> > >> Very interesting. Do you have many DocValue fields? Have you always
> > >> had them i.e. did you see this problem before you turned on DocValues?
> > >> The DocValue fields are in a separate file and they will be memory
> > >> mapped on demand. One thing you can experiment with is to use
> > >> preload=true option on the MMapDirectoryFactory which will mmap all
> > >> index files on startup [1]. Once you do this, and if you still notice
> > >> shared memory leakage then it may be a genuine memory leak that we
> > >> should investigate.
> > >>
> > >> [1] - 
> > >> http://lucene.apache.org/solr/guide/6_6/datadir-and-directoryfactory-in-solrconfig.html#DataDirandDirectoryFactoryinSolrConfig-SpecifyingtheDirectoryFactoryForYourIndex
> > >>
> > >> On Wed, Aug 23, 2017 at 7:02 PM, Markus Jelsma
> > >> <markus.jel...@openindex.io> wrote:
> > >>> I do not think it is a reporting problem. After watching top following
> > >>> a restart of some Solr instances, shared memory dropped back to
> > >>> 'normal', around 350 MB, which I still think is high, but anyway.
> > >>>
> > >>> Two hours later, the restarted nodes have slowly increased shared
> > >>> memory consumption to about 1500 MB. I don't understand why shared
> > >>> memory usage should or would increase slowly over time; it makes little
> > >>> sense to me, and I cannot remember Solr doing this in the past ten
> > >>> years.
> > >>>
> > >>> But it seems to correlate with index size on disk: these main text
> > >>> search nodes have an index of around 16 GB and use up to 3 GB of shared
> > >>> memory after a few days. The log nodes have up to 800 MB of index and
> > >>> 320 MB of shared memory, and the low-latency nodes have four different
> > >>> cores that together make up just over 100 MB of index, with shared
> > >>> memory consumption of just 22 MB, which seems more reasonable for
> > >>> shared memory.
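> > >>>
> > >>> A quick way to eyeball that correlation per node (the data directory
> > >>> path is just an example for our layout):
> > >>>
> > >>>   du -sm /var/solr/data/*/data/index
> > >>>
> > >>> and compare the totals against the SHR column in top.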
> > >>>
> > >>> I can also force Solr to 'leak' shared memory just by sending queries
> > >>> to it. My freshly restarted local node used 68 MB of shared memory at
> > >>> startup. Two minutes and 25,000 queries later it was already at 2748
> > >>> MB! At first there is a very sharp increase to 2000 MB, then it takes
> > >>> almost two minutes more to reach 2748 MB. I can decrease the maximum
> > >>> shared memory usage to 1200 MB if I query (via edismax) only on fields
> > >>> of one language instead of 25 or so. I can decrease it even further if
> > >>> I disable highlighting (HUH?) but still query on all fields.
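> > >>>
> > >>> For anyone reproducing this, a rough way to watch the shared-page
> > >>> counter directly instead of top while the queries are running (the PID
> > >>> is an example, and 4 kB pages are assumed):
> > >>>
> > >>>   while true; do awk '{printf "%d MB shared\n", $3*4096/1048576}' /proc/18901/statm; sleep 5; done
> > >>>
> > >>> The third field of /proc/<pid>/statm is the number of resident pages
> > >>> backed by files, which is what top shows as SHR.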
> > >>>
> > >>> * We have tried patching Java's ByteBuffer [1] because it seemed to
> > >>> fit the problem, but it did not fix it.
> > >>> * We have also removed all our custom plugins, so it has become a
> > >>> vanilla Solr 6.6 with just our stripped-down schema and solrconfig;
> > >>> that did not fix it either.
> > >>>
> > >>> Why does it slowly increase over time?
> > >>> Why does it appear to correlate to index size?
> > >>> Is anyone else seeing this on their 6.6 cloud production or local 
> > >>> machines?
> > >>>
> > >>> Thanks,
> > >>> Markus
> > >>>
> > >>> [1]: http://www.evanjones.ca/java-bytebuffer-leak.html
> > >>>
> > >>> -----Original message-----
> > >>>> From:Shawn Heisey <apa...@elyograg.org>
> > >>>> Sent: Tuesday 22nd August 2017 17:32
> > >>>> To: solr-user@lucene.apache.org
> > >>>> Subject: Re: Solr uses lots of shared memory!
> > >>>>
> > >>>> On 8/22/2017 7:24 AM, Markus Jelsma wrote:
> > >>>>> I have never seen this before, one of our collections, all nodes 
> > >>>>> eating tons of shared memory!
> > >>>>>
> > >>>>> Here's one of the nodes:
> > >>>>> 10497 solr      20   0 19.439g 4.505g 3.139g S   1.0 57.8   2511:46 
> > >>>>> java
> > >>>>>
> > >>>>> RSS is roughly equal to heap size + the usual off-heap space + shared
> > >>>>> memory. Virtual is roughly equal to RSS plus the index size on disk.
> > >>>>> For two other collections, the nodes use shared memory as expected,
> > >>>>> in the MB range.
> > >>>>>
> > >>>>> How can Solr, this collection, use so much shared memory? Why?
> > >>>>
> > >>>> I've seen this on my own servers at work, and when I add up a subset of
> > >>>> the memory numbers I can see from the system, it ends up being more
> > >>>> memory than I even have in the server.
> > >>>>
> > >>>> I suspect there is something odd going on in how Java reports memory
> > >>>> usage to the OS, or maybe a glitch in how Linux interprets Java's 
> > >>>> memory
> > >>>> usage.  At some point in the past, numbers were reported correctly.  I
> > >>>> do not know if the change came about because of a Solr upgrade, because
> > >>>> of a Java upgrade, or because of an OS kernel upgrade.  All three were
> > >>>> upgraded between when I know the numbers looked right and when I 
> > >>>> noticed
> > >>>> they were wrong.
> > >>>>
> > >>>> https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0
> > >>>>
> > >>>> This screenshot shows that Solr is using 17GB of memory, 41.45GB of
> > >>>> memory is being used by the OS disk cache, and 10.23GB of memory is
> > >>>> free.  Add those up, and it comes to 68.68GB ... but the machine only
> > >>>> has 64GB of memory, and that total doesn't include the memory usage of
> > >>>> the other processes seen in the screenshot.  This impossible situation
> > >>>> means that something is being misreported somewhere.  If I deduct that
> > >>>> 11GB of SHR from the RES value, then all the numbers work.
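> > >>>>
> > >>>> A rough way to do that subtraction straight from the kernel, if anyone
> > >>>> wants to check their own nodes (substitute the Solr PID; 4 kB pages
> > >>>> assumed):
> > >>>>
> > >>>>   awk '{printf "%d MB resident minus shared\n", ($2-$3)*4096/1048576}' /proc/<solr-pid>/statm
> > >>>>
> > >>>> The second and third fields of /proc/<pid>/statm are resident and
> > >>>> shared pages, which is where top's RES and SHR numbers come from.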
> > >>>>
> > >>>> The screenshot was almost 3 years ago, so I do not know what machine it
> > >>>> came from, and therefore I can't be sure what the actual heap size was.
> > >>>> I think it was about 6GB -- the difference between RES and SHR.  I have
> > >>>> used a 6GB heap on some of my production servers in the past.  The
> > >>>> server where I got this screenshot was not having any noticeable
> > >>>> performance or memory problems, so I think that I can trust that the
> > >>>> main numbers above the process list (which only come from the OS) are
> > >>>> correct.
> > >>>>
> > >>>> Thanks,
> > >>>> Shawn
> > >>>>
> > >>>>
> > >>
> > >>
> > >>
> > >> -- 
> > >> Regards,
> > >> Shalin Shekhar Mangar.
> > >>
> > 
> 
