Have you changed any of the merge policy parameters? I doubt it but just asking.

My guess: I/O is your bottleneck. A limited number of threads
(tunable) are used for background merging. When they're all busy,
incoming updates are queued up. This squares with your statement that
queries are fine, CPU activity is moderate, and merging runs all the time.

A quick test there would be to try this on a non-AWS setup if you have
some hardware you can repurpose.

An 80G heap is a red flag. Most of the time that's far too large.
So one thing I'd do is hook up some GC monitoring; you may be spending
a horrible amount of time in GC cycles.
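
As a rough illustration of what "GC monitoring" can mean, here is a
minimal sketch that uses only the standard java.lang.management API to
estimate what share of wall-clock time a JVM spends in garbage
collection. The class name GcOverhead, the 10-second interval, and the
sample count are assumptions made up for this example. Note that it
measures the JVM it runs in; pointing it at a Solr node would require
running it inside that JVM or reading the same MXBeans over a remote
JMX connection, which is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {

    // Cumulative milliseconds of collection time reported by all collectors
    // since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Share of a wall-clock interval spent in GC, as a percentage.
    static double gcPercent(long gcMillisDelta, long wallMillisDelta) {
        if (wallMillisDelta <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillisDelta / wallMillisDelta;
    }

    public static void main(String[] args) throws InterruptedException {
        final long intervalMs = 10_000; // hypothetical sampling interval
        final int samples = 6;          // hypothetical number of samples
        long prevGc = totalGcMillis();
        long prevWall = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(intervalMs);
            long nowGc = totalGcMillis();
            long nowWall = System.nanoTime();
            long wallMs = (nowWall - prevWall) / 1_000_000;
            System.out.printf("GC overhead: %.1f%% over the last %d ms%n",
                    gcPercent(nowGc - prevGc, wallMs), wallMs);
            prevGc = nowGc;
            prevWall = nowWall;
        }
    }
}
```

If the reported percentage stays in the low single digits while updates
are stalled, GC is probably not the culprit and the merge/I/O theory
above becomes more likely.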

Best,
Erick

On Thu, Apr 19, 2018 at 8:23 AM, Denis Demichev <demic...@gmail.com> wrote:
>
> All,
>
> I would like to request some assistance with a situation described below. My
> SolrCloud cluster accepts the update requests at a very low pace making it
> impossible to index new documents.
>
> Cluster Setup:
> Clients - 4 JVMs, 4 threads each, using SolrJ to submit data
> Cluster - SolrCloud 7.2.1, 10 instances r4.4xlarge, 120GB physical memory,
> 80GB Java Heap space, AWS
> Java - openjdk version "1.8.0_161" OpenJDK Runtime Environment (build
> 1.8.0_161-b14) OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
> Zookeeper - 3 standalone nodes on t2.large running under Exhibitor
>
> Symptoms:
> 1. 4 instances running 4 threads each are using SolrJ client to submit
> documents to SolrCloud for indexing, do not perform any manual commits. Each
> document batch is 10 documents big, containing ~200 text fields per
> document.
> 2. After some time (~20-30 minutes, by that time I see only ~50-60K of
> documents in a collection, node restarts do not help) I notice that clients
> cannot submit new documents to the cluster for indexing anymore; each
> operation takes an enormous amount of time.
> 3. Cluster is not loaded at all, CPU consumption is moderate (I am seeing
> that merging is performed all the time though), memory consumption is
> adequate, but still updates are not accepted from external clients
> 4. Search requests are handled fine
> 5. I don't see any significant activity in SolrCloud logs anywhere, just
> regular replication attempts only. No errors.
>
>
> Additional information
> 1. Please see Thread Dump attached.
> 2. Please see SolrAdmin info with physical memory and file descriptor
> utilization
> 3. Please see VisualVM screenshots with CPU and memory utilization and CPU
> profiling data. Physical memory utilization is about 60-70 percent all the
> time.
> 4. Schema file contains ~10 permanent fields 5 of which are mapped and
> mandatory and persisted, the rest of the fields are optional and dynamic
> 5. Solr config configures autoCommit to be set to 2 minutes and openSearcher
> set to false
> 6. Caches are set up with autoWarmCount = 0
> 7. GC was fine tuned and I don't see any significant CPU utilization by GC
> or any lengthy pauses. Majority of the garbage is collected in young gen
> space.
>
> My primary question: I see that the cluster is alive and performs some
> merging and commits but does not accept new documents for indexing. What is
> causing this slowdown, and why does it not accept new submissions?
>
>
> Regards,
> Denis
