3x heap is larger than usual, but significant RAM beyond the heap is a good
idea if you can't fit the whole index in 31 GB of memory, since the OS will
cache index files in RAM. Note also that heap settings from 32 GB up to about
45 GB give you LESS usable heap than 31 GB, because the JVM switches from
compressed 32-bit object pointers to full 64-bit pointers to address the larger
space. Typically 64 GB of RAM with a 31 GB heap is a good start for decent-sized
indexes; add more machines to get more RAM/heap/CPU relative to your data on
disk and query load. Of course, test and tune from there to find the ideal spec
for your installation... Also keep in mind that larger heaps mean longer GC pauses.
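
For what it's worth, here is a minimal sketch of that starting point using the
stock solr.in.sh shipped with Solr (variable names are from the standard start
script; the 64 GB / 31 GB figures are just the rule of thumb above, not a
prescription):

  # solr.in.sh -- keep the heap just under the compressed-oops cutoff
  SOLR_HEAP="31g"    # bin/solr applies this value to both -Xms and -Xmx
  # the rest of a 64 GB box stays free for the OS page cache to hold index files

  # sanity check: compressed object pointers should report true at 31g, false above ~32g
  java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
  java -Xmx36g -XX:+PrintFlagsFinal -version | grep UseCompressedOops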

That said, RAM beyond the heap is unlikely to have much effect on crashing
once the OS and the other processes on the box have what they need.

On Wed, Dec 5, 2018, 11:11 AM Walter Underwood <wun...@wunderwood.org> wrote:

> I’ve never heard a recommendation to have three times as much RAM as the
> heap. That doesn’t make sense to me.
>
> You might need 3X as much disk space as the index size.
>
> For RAM, it is best to have the sum of:
>
> * JVM heap
> * A couple of gigabytes for the OS and daemons
> * RAM for other processes needed on the host (keep to a minimum)
> * Enough RAM to hold the entire index
>
> Clearly, you are not going to have enough RAM for a 555 gigabyte index.
> Well, Amazon does have a dozen instance types that can do that, but they
> are expensive.
>
> A 24 GB heap on a 30 GB machine will be pretty tight.
>
> Always set Xms (starting heap) to the same as Xmx (maximum heap). If you
> set it smaller, the JVM will keep increasing the heap until it hits the max
> before doing a full GC. It will always end up with the max setting, but it
> will have to do more work to get there. The setting for initial heap size
> is about the most useless thing in Java.
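
With the stock Solr start script this amounts to a single setting; the variable
names below are from the shipped solr.in.sh, and 24g is only the figure discussed
further down in the thread, not a recommendation:

  SOLR_HEAP="24g"                   # bin/solr passes this as both -Xms and -Xmx
  # or spell the flags out yourself:
  SOLR_JAVA_MEM="-Xms24g -Xmx24g"
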
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Dec 4, 2018, at 6:06 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> >
> > Hi Danilo,
> >
> > Full GC points out that you need more heap which also implies that you need more RAM.
> > Raise your heap to 24GB and your physical RAM to about 75GB or better 96GB.
> > RAM should be about 3 to 4 times heap size.
> >
> > Regards, Bernd
> >
> >
> > On 04.12.18 at 13:37, Danilo Tomasoni wrote:
> >> Hello Bernd,
> >> Here I list the extra info you requested:
> >> - actually the virtual machine has 22GB of RAM and 16GB of heap
> >> - my 40 million raw data takes about 1364GB on filesystem (in xml format)
> >> - my index optimized (1 segment, 0 deleted docs) takes about 555GB
> >> - solr 7.3, openjdk 1.8.0_181
> >> - GC logs are like
> >> 2018-12-03T07:40:22.302+0100: 28752.505: [Full GC (Allocation Failure) 2018-12-03T07:40:22.302+0100: 28752.505: [CMS: 12287999K->12287999K(12288000K), 13.6470083 secs] 15701375K->15701373K(15701376K), [Metaspace: 37438K->37438K(1083392K)], 13.6470726 secs] [Times: user=13.66 sys=0.00, real=13.64 secs]
> >> Heap after GC invocations=2108 (full 1501):
> >>  par new generation   total 3413376K, used 3413373K [0x00000003d8000000, 0x00000004d2000000, 0x00000004d2000000)
> >>   eden space 2730752K,  99% used [0x00000003d8000000, 0x000000047eabfdc0, 0x000000047eac0000)
> >>   from space 682624K,  99% used [0x000000047eac0000, 0x00000004a855f8a0, 0x00000004a8560000)
> >>   to   space 682624K,   0% used [0x00000004a8560000, 0x00000004a8560000, 0x00000004d2000000)
> >>  concurrent mark-sweep generation total 12288000K, used 12287999K [0x00000004d2000000, 0x00000007c0000000, 0x00000007c0000000)
> >>  Metaspace       used 37438K, capacity 38438K, committed 38676K, reserved 1083392K
> >>   class space    used 4257K, capacity 4521K, committed 4628K, reserved 1048576K
> >> }
> >> Thank you for your help
> >> Danilo
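
The heap dump above really tells the whole story: the CMS old generation is full
(12287999K used of 12288000K) and the Full GC reclaims essentially nothing, so the
16 GB heap is simply exhausted. A rough way to see how often that happens (the log
path is an assumption based on where the stock Solr 7.x start script writes GC
logs; adjust it to wherever yours live):

  grep -c "Full GC (Allocation Failure)" server/logs/solr_gc.log*
  grep "CMS: " server/logs/solr_gc.log* | tail -5
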
> >> On 03/12/18 10:36, Bernd Fehling wrote:
> >>> Hi Danilo,
> >>>
> >>> you have to give more infos about your system and the config.
> >>>
> >>> - 30gb RAM (physical RAM?) how much heap do you have for JAVA?
> >>> - how large (in GByte) are your 40 million raw data being indexed?
> >>> - how large is your index (in GByte) with 40 million docs indexed?
> >>> - which version of Solr and JAVA?
> >>> - do you have JAVA garbage collection logs and if so what are they reporting?
> >>> - Any FullGC in GC logs?
> >>>
> >>> Regards, Bernd
> >>>
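
Danilo's answers are above in the thread, but for anyone collecting the same
information, a rough sketch of where to look on a default Solr 7.x install
(paths assume the stock layout and may differ on your box):

  bin/solr version                         # Solr version
  java -version                            # JVM version
  du -sh server/solr/*/data/index          # index size on disk, per core
  ls -lh server/logs/solr_gc.log*          # GC logs written by the standard start script
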
> >>>
> >>> On 03.12.18 at 10:09, Danilo Tomasoni wrote:
> >>>> Hello all,
> >>>>
> >>>> We have a configuration with a single node with 30gb of RAM.
> >>>>
> >>>> We use it to index ~40MLN of documents.
> >>>>
> >>>> We perform queries with the edismax parser that often contain edismax parser subqueries with the syntax
> >>>>
> >>>> '_query_:{!edismax mm=X v=$subqueryN}'
> >>>>
> >>>> Often X == 1.
> >>>>
> >>>> This solves the "too many boolean clauses" error we got when expanding the query terms (often phrase queries) directly in the main query.
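
For reference, a bare-bones sketch of that pattern as plain request parameters
(the collection, fields, and terms here are invented for illustration; v=$subquery1
is standard Solr parameter dereferencing):

  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode 'q=_query_:{!edismax mm=1 v=$subquery1}' \
    --data-urlencode 'subquery1="term one" "term two" "term three"' \
    --data-urlencode 'qf=title body'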
> >>>>
> >>>> Unfortunately in this scenario solr often crashes while performing a query, even with a single query and no other source of system load.
> >>>>
> >>>>
> >>>> Do you have any idea of what's going on here?
> >>>>
> >>>> Otherwise,
> >>>>
> >>>> What kind of solr configuration parameters do you think I need to investigate first?
> >>>>
> >>>> What kind of log lines should I search for to understand what's going on?
> >>>>
> >>>>
> >>>> Thank you
> >>>>
> >>>> Danilo
> >>>>
> >
> > --
> > *************************************************************
> > Bernd Fehling                    Bielefeld University Library
> > Dipl.-Inform. (FH)                LibTec - Library Technology
> > Universitätsstr. 25                  and Knowledge Management
> > 33615 Bielefeld
> > Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de
> >          https://www.ub.uni-bielefeld.de/~befehl/
> >
> > BASE - Bielefeld Academic Search Engine - www.base-search.net
> > *************************************************************
>
>
