Shawn,
On 7/5/22 23:52, Shawn Heisey wrote:
On 7/5/2022 3:11 PM, Christopher Schultz wrote:
Well, if you need more than 32GiB, I think the recommendation is to go
MUCH HIGHER than 32GiB. If you have a 48GiB machine, maybe restrict to
31GiB of heap, but if you have a TiB, go for it :)
I remember reading somewhere, likely for a different program than Solr,
that the observed break-even point for 64-bit pointers was 46GB. The
level of debugging and introspection required to calculate that number
would be VERY extensive. Most Solr installs can get by with a max heap
size of 31GB or less, even if they are quite large. For those that need
more, I would probably want to see a heap size of at least 64GB. It is
probably better to use SolrCloud and split the index across more servers
to keep the heap requirement low than to use a really massive heap.
This is why I said "uhh..." above: the JVM needs more memory than the
heap. Sometimes as much as twice that amount, depending upon the
workload of the application itself. Measure, measure, measure.
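For context on where that 31-32GiB ceiling comes from (this is standard HotSpot behavior, nothing Solr-specific): below 32GiB the JVM can use compressed ordinary object pointers, i.e. 32-bit references shifted by 3 bits to exploit 8-byte object alignment. The arithmetic is just:

```shell
# 32-bit reference space * 8-byte object alignment = maximum
# addressable heap before the JVM falls back to full 64-bit pointers.
awk 'BEGIN { printf "%d GiB\n", (2^32 * 8) / (1024^3) }'
# prints "32 GiB"
```

You can confirm whether compressed oops are actually in effect for a given heap size with something like `java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops`; past the ceiling HotSpot silently disables the flag and every object reference doubles to 8 bytes, which is why a 33GiB heap can hold *less* than a 31GiB one.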
It would be interesting to see how much overhead there really is for
Solr with various index sizes. We have seen people have OOM problems
when making *only* GC changes ... switching from CMS to G1. Solr has
used G1 out of the box for a while now.
Anecdotal data point:
Solr 7.7.3
Oracle Java 1.8.0_312
Xms = Xmx = 1024M
No messing with default GC or other memory settings
1 Core, no ZK
30s autocommit
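For reference, with the stock bin/solr scripts a pinned heap like that is usually expressed one of two ways (solr.in.sh placement assumes the standard install layout):

```shell
# In solr.in.sh -- sets both -Xms and -Xmx to the same value:
SOLR_HEAP="1024m"

# Or as a one-off when starting from the command line:
#   bin/solr start -m 1024m
```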
On-disk artifact size:
$ du -hs /path/to/core
723M /path/to/core
Live memory info:
Solr self-reported heap memory used: 205.12 MB [*]
I reloaded the admin page after writing the "*" note below and it's
reporting 55.78 MB heap used.
Using 'ps' to report real memory usage:
$ ps aux | grep '\(java\|PID\)'
USER PID %CPU %MEM VSZ RSS [...]
solr 20324 8.1 0.7 6928440 469496 [...]
So the process space is 6.6G (my 'ps' reports VSZ in kilobytes) and the
resident size (aka "actual memory use") is ~460M.
Solr doesn't report the high-water mark for its heap usage, but the most
I've seen so far without a GC kicking it back down is ~200M. So resident
memory looks to be roughly double the observed heap usage -- about 100%
overhead on top of the heap itself.
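As a quick sanity check on that, subtracting the self-reported heap from the resident set (numbers taken from the ps output above, which reports KiB):

```shell
awk 'BEGIN {
  rss_mib  = 469496 / 1024   # RSS from ps, converted KiB -> MiB
  heap_mib = 205.12          # Solr self-reported heap used, MiB
  printf "non-heap resident: ~%.0f MiB\n", rss_mib - heap_mib
}'
# prints "non-heap resident: ~253 MiB"
```

So a bit over half the resident memory here is something other than live heap objects.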
I see lots of memory mapped files (both JAR libraries and index-related
files) when I do:
$ sudo lsof -p 20324
So I suspect a lot of those are mapped into that resident process space.
mmap is one of those things that eats up tons of non-heap space and
doesn't count toward the Xms/Xmx limit. That's probably why people run
out of memory so frequently: they think they can allocate huge amounts
of heap space on their big machine, when what they really need is native
memory and not quite so much heap.
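If anyone wants to see where the non-heap memory actually goes, the JVM's native memory tracking can break it down. The flag and the jcmd subcommand are standard HotSpot; putting the flag in solr.in.sh assumes the stock install layout:

```shell
# In solr.in.sh: enable native-memory tracking (adds a little overhead):
SOLR_OPTS="$SOLR_OPTS -XX:NativeMemoryTracking=summary"

# Then, against the running pid, ask the JVM where non-heap memory went
# (thread stacks, metaspace, code cache, direct buffers, ...):
#   jcmd <pid> VM.native_memory summary
```

As far as I know, mmapped index files won't show up there -- NMT only tracks the JVM's own allocations, so the mapped index data is visible only from the OS side (lsof, pmap, smaps).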
[*] I recently restarted Solr because my personal TLS client key had
expired; I had to mint a new one and install it. I'd really love to know
if Solr/Jetty can re-load its TLS configuration without restarting. It's
a real drag to bounce Solr for something so mundane.
I'm interested to know what the relation is between on-disk index
size and in-memory index size. I would imagine that the on-disk
artifacts are fairly slim (only storing what is necessary) and the
in-memory representation has all kinds of "waste" (like pointers and
all that). Has anyone done a back-of-the-napkin calculation to guess
at the in-memory size of an index given the on-disk representation?
That is an interesting question. One of the reasons Lucene queries so
fast when there is plenty of memory is because it accesses files on disk
directly with MMAP, so there is no need to copy the really massive data
structures into the heap at all.
This is likely where lots of that RSS space is being used in my process
detailed above.
I believe the OP is having problems because they need a total memory
size far larger than 64GB to handle 500GB of index data, and they should
also have dedicated hardware for Solr so there is no competition with
other software for scarce system resources.
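To put the OP's numbers in perspective, this is just the arithmetic on the figures mentioned above:

```shell
awk 'BEGIN {
  idx_gb = 500   # on-disk index size from the OP
  ram_gb = 64    # total machine memory from the OP
  printf "index is %.1fx total RAM\n", idx_gb / ram_gb
}'
# prints "index is 7.8x total RAM"
```

Since Lucene leans on the OS page cache for index data, only a small slice of that index can ever be resident at once on such a machine, so every cold query is liable to go back to disk.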
Having never come close to busting my heap with my tiny 500M (on-disk)
index, I'm curious about Solr's expected performance with a huge index
and small memory. Will Solr just "get by with what it has" or will it
really crap itself if the index is too big? I was kinda hoping it would
just perform awfully because it has to keep going back to the disk.
-chris