> An 80GB heap is ENORMOUS.  And you have two of those per server.  Do you
> *know* that you need a heap that large?  You only have 50 million
> documents total, two instances that each have 80GB seems completely
> unnecessary.  I would think that one instance with a much smaller heap
> would handle just about anything you could throw at 50 million documents.

> With 160GB taken by heaps, you're leaving less than 100GB of memory to
> cache over 700GB of index.  This is not going to work well, especially
> if your index doesn't have many fields that are stored.  It will cause a
> lot of disk I/O.

We have 27 collections, and each collection has many schema fields. In production 
we receive a very high volume of search and index create/update requests, and 
most of the search requests involve sorting, faceting, grouping, and long queries.
On average roughly 40GB of heap is in use, which is why we allocated 80GB.
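
Most of that heap seems to go to sorting, faceting, and grouping. Would enabling 
docValues on the fields used for those operations move some of that memory 
off-heap? A rough sketch of what I mean (the field names below are only examples, 
not from our real schema):

<field name="project_id" type="string" indexed="true" stored="false" docValues="true"/>
<field name="created_date" type="tdate" indexed="true" stored="false" docValues="true"/>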

> Unless you have changed the DirectoryFactory to something that's not
> default, your process listing does not reflect over 700GB of index data.
> If you have changed the DirectoryFactory, then I would strongly
> recommend removing that part of your config and letting Solr use its
> default.

Our directoryFactory configuration in solrconfig.xml:
<directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.MMapDirectoryFactory}">
</directoryFactory>
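
If we remove this override, I understand the stock solrconfig.xml falls back to 
something like the following (please correct me if this is not the default you 
meant):

<directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
</directoryFactory>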

Here are our schema file, solrconfig.xml, and GC log. Could you please review 
them? Is anything wrong, or do you have suggestions for improvement?
https://drive.google.com/drive/folders/1wV9bdQ5-pP4s4yc8jrYNz77YYVRmT7FG


GC log:
2019-06-06T11:55:37.729+0100: 1053781.828: [GC (Allocation Failure) 
1053781.828: [ParNew
Desired survivor size 3221205808 bytes, new threshold 8 (max 8)
- age   1:  268310312 bytes,  268310312 total
- age   2:  220271984 bytes,  488582296 total
- age   3:   75942632 bytes,  564524928 total
- age   4:   76397104 bytes,  640922032 total
- age   5:  126931768 bytes,  767853800 total
- age   6:   92672080 bytes,  860525880 total
- age   7:    2810048 bytes,  863335928 total
- age   8:   11755104 bytes,  875091032 total
: 15126407K->1103229K(17476288K), 15.7272287 secs] 
45423308K->31414239K(80390848K), 15.7274518 secs] [Times: user=212.05 
sys=16.08, real=15.73 secs]
Heap after GC invocations=68829 (full 187):
 par new generation   total 17476288K, used 1103229K [0x0000000080000000, 
0x0000000580000000, 0x0000000580000000)
  eden space 13981056K,   0% used [0x0000000080000000, 0x0000000080000000, 
0x00000003d5560000)
  from space 3495232K,  31% used [0x00000004aaab0000, 0x00000004ee00f508, 
0x0000000580000000)
  to   space 3495232K,   0% used [0x00000003d5560000, 0x00000003d5560000, 
0x00000004aaab0000)
 concurrent mark-sweep generation total 62914560K, used 30311010K 
[0x0000000580000000, 0x0000001480000000, 0x0000001480000000)
 Metaspace       used 50033K, capacity 50805K, committed 53700K, reserved 55296K
}
2019-06-06T11:55:53.456+0100: 1053797.556: Total time for which application 
threads were stopped: 42.4594545 seconds, Stopping threads took: 26.7301882 
seconds

What would cause GC to pause for 42 seconds?
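
Would a smaller heap and a different collector reduce these pauses? For example, 
something like this in solr.in.sh (the values below are only my guess, not a 
tested setting):

SOLR_HEAP="31g"
GC_TUNE="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250"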

We have heavy searching and heavy indexing (creates and updates) in our SolrCloud.
So, should we divide the cloud between the 27 collections? Should we add one more 
shard?
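
If adding a shard is the right direction, would the Collections API SPLITSHARD 
call be the way to do it? For example (collection and shard names are just 
placeholders):

http://host:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1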

________________________________
From: Shawn Heisey <apa...@elyograg.org>
Sent: Friday, June 7, 2019 9:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Query takes a long time Solr 6.1.0

On 6/6/2019 5:45 AM, vishal patel wrote:
> One server(256GB RAM) has two below Solr instance and other application also
> 1) shards1 (80GB heap ,790GB Storage, 449GB Indexed data)
> 2) replica of shard2 (80GB heap, 895GB Storage, 337GB Indexed data)
>
> The second server(256GB RAM and 1 TB storage) has two below Solr instance and 
> other application also
> 1) shards2 (80GB heap, 790GB Storage, 338GB Indexed data)
> 2) replica of shard1 (80GB heap, 895GB Storage, 448GB Indexed data)

An 80GB heap is ENORMOUS.  And you have two of those per server.  Do you
*know* that you need a heap that large?  You only have 50 million
documents total, two instances that each have 80GB seems completely
unnecessary.  I would think that one instance with a much smaller heap
would handle just about anything you could throw at 50 million documents.

With 160GB taken by heaps, you're leaving less than 100GB of memory to
cache over 700GB of index.  This is not going to work well, especially
if your index doesn't have many fields that are stored.  It will cause a
lot of disk I/O.

> Both server memory and disk usage:
> https://drive.google.com/drive/folders/11GoZy8C0i-qUGH-ranPD8PCoPWCxeS-5

Unless you have changed the DirectoryFactory to something that's not
default, your process listing does not reflect over 700GB of index data.
If you have changed the DirectoryFactory, then I would strongly
recommend removing that part of your config and letting Solr use its
default.

> Note: Average 40GB heap used normally in each Solr instance. when replica 
> gets down at that time disk IO are high and also GC pause time above 15 
> seconds. We can not identify the exact issue of replica recovery OR down from 
> logs. due to the GC pause? OR due to disk IO high? OR due to time-consuming 
> query? OR due to heavy indexing?

With an 80GB heap, I'm not really surprised you're seeing GC pauses
above 15 seconds.  I have seen pauses that long with a heap that's only 8GB.

GC pauses lasting that long will cause problems with SolrCloud.  Nodes
going into recovery is common.

Thanks,
Shawn
