Hi Shawn,
I can try to help you with the test.
I have a 6-node Solr cluster (machines with 4 cores, 28GB RAM, and a 250GB
hard disk, running OpenJDK 11.0.11) with 2 shards and 3 replicas each.

Currently, the cluster has 27GB of data per core; I can ingest more data to
bring it to around 100GB per core.
The nodes have a 20GB heap as of now; I will change it to 4GB for the test.

Here are the current GC settings from my cluster. Please let me know if we
need to change anything before the test, apart from the heap size.

-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:-OmitStackTraceInFastThrow
-XX:ConcGCThreads=4
-XX:G1ReservePercent=18
-XX:HeapDumpPath=/app/solrdata8/logs/heapdump
-XX:InitiatingHeapOccupancyPercent=50
-XX:MaxGCPauseMillis=250
-XX:MaxNewSize=4G
-XX:OnOutOfMemoryError=/app/solr8/bin/oom_solr.sh 8983 /app/solrdata8/logs
-XX:ParallelGCThreads=8
-Xlog:gc*:file=/app/solrdata8/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xms20g
-Xmx20g
-Xss256k
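
For the test itself, a rough sketch of what solr.in.sh might look like on
these nodes is below. It assumes the GC_TUNE flags from the quoted message
are reused as-is, with the heap lowered to 4GB as mentioned above. The
ParallelGCThreads value of 4 is only an assumption chosen to match the
4-core machines; it has not been tested.

```shell
# Sketch of solr.in.sh for the Shenandoah test (assumed values, not yet applied).

# Heap lowered from the current 20g to 4g for the test.
SOLR_HEAP="4g"

# Flags taken from the quoted message; ParallelGCThreads=4 is an assumption
# based on the 4 cores per node, not a tested or recommended value.
GC_TUNE=" \
  -XX:+AlwaysPreTouch \
  -XX:+UseNUMA \
  -XX:+UseShenandoahGC \
  -XX:+ParallelRefProcEnabled \
  -XX:+UseStringDeduplication \
  -XX:ParallelGCThreads=4 \
"
```

The G1-specific options in the current settings (G1ReservePercent,
InitiatingHeapOccupancyPercent, MaxGCPauseMillis) are left out of this
sketch on the assumption that they do not apply to Shenandoah.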


On Tue, Oct 12, 2021 at 2:24 AM Shawn Heisey <[email protected]> wrote:

> I would like to request help from the community on something.  I'm not
> in a position to do the kind of testing that I want, as I no longer have
> access to Solr servers with large amounts of data.
>
> What I want to test is the Shenandoah garbage collector.  I've done some
> testing on my own, but the index is very small (629MB) and so is the
> heap size (512MB).
>
> Here is a GC log from my most recent test:
>
> https://www.dropbox.com/s/8cbncuax7kv0x9c/solr_gc.log?dl=0
>
> For this test, I deleted all the GC logs, restarted Solr, deleted all
> docs and optimized the index so it had 0 segments, and then asked
> dovecot (POP/IMAP server) to do a full reindex.  At this moment there
> are 158905 docs in the index.  Then I grabbed the GC log linked above
> and had the gceasy.io website analyze it.  The GC performance looks very
> good ... but with the heap at only 512MB, even a bad GC config would
> probably look good.  Here are the GC settings that I put in
> /etc/default/solr.in.sh:
>
> GC_TUNE=" \
>    -XX:+AlwaysPreTouch \
>    -XX:+UseNUMA \
>    -XX:+UseShenandoahGC \
>    -XX:+ParallelRefProcEnabled \
>    -XX:+UseStringDeduplication \
>    -XX:ParallelGCThreads=2 \
> "
>
> I'm running this on a t3a.medium EC2 instance, which only has 2 CPUs, so
> I limited the GC threads to 2.  This instance is my personal mail
> server.  If anyone brave enough to help me test wants to try it, and you
> have a server with a LOT of cores, you could increase the number of
> threads.
>
> What I need to see is the GC logs that Solr creates, along with some
> details about the indexes on the server that generated the log.  Best
> results will come from very busy servers that have a large index ...
> hoping for 100GB or more of index per Solr core, and a max heap size at
> least 4GB.  If you want to get really adventurous, you could gather GC
> logs with the default GC settings (which in later Solr versions is G1GC)
> and with Shenandoah.
>
> A recent version of Java 11 is required to enable the Shenandoah
> collector.  I think it was made available in 11.0.3.  I am running
> OpenJDK 11.0.11, the latest available on Ubuntu 20.04 LTS.
>
> I'm not advocating that anyone try this on a mission-critical production
> system, but I would not expect it to cause problems on such a setup.
> Use your own judgement.
>
> Thanks,
> Shawn
>
>

-- 
Best Regards,
Dinesh Naik
