Hello Bernd,
Thanks for the suggestion; the problem is that we don't have 75 GB of RAM.
Are you aware of any way to reduce Solr's memory usage?
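(For context, the main knobs I know of are the solrconfig.xml caches; the fragment below is the kind of trimming I had in mind. The sizes are placeholders for illustration, not our real settings: smaller caches and no autowarming should lower steady-state heap use, at the cost of some query latency.)

```xml
<!-- Hypothetical solrconfig.xml fragment (sizes illustrative only):
     smaller caches and autowarmCount="0" reduce steady-state heap usage. -->
<query>
  <filterCache      class="solr.FastLRUCache" size="64" initialSize="64" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
  <documentCache    class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
</query>
```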
Thanks
Danilo
On 04/12/18 15:06, Bernd Fehling wrote:
Hi Danilo,
The Full GC entries indicate that you need more heap, which in turn
means you need more RAM.
Raise your heap to 24GB and your physical RAM to about 75GB or better
96GB.
RAM should be about 3 to 4 times heap size.
Regards, Bernd
On 04/12/18 at 13:37, Danilo Tomasoni wrote:
Hello Bernd,
Here I list the extra info you requested:
- the virtual machine currently has 22GB of RAM and a 16GB heap
- the raw data for the 40 million documents takes about 1364GB on the
filesystem (in XML format)
- the optimized index (1 segment, 0 deleted docs) takes about 555GB
- Solr 7.3, OpenJDK 1.8.0_181
- the GC logs look like this:
2018-12-03T07:40:22.302+0100: 28752.505: [Full GC (Allocation Failure) 2018-12-03T07:40:22.302+0100: 28752.505: [CMS: 12287999K->12287999K(12288000K), 13.6470083 secs] 15701375K->15701373K(15701376K), [Metaspace: 37438K->37438K(1083392K)], 13.6470726 secs] [Times: user=13.66 sys=0.00, real=13.64 secs]
Heap after GC invocations=2108 (full 1501):
 par new generation total 3413376K, used 3413373K [0x00000003d8000000, 0x00000004d2000000, 0x00000004d2000000)
  eden space 2730752K, 99% used [0x00000003d8000000, 0x000000047eabfdc0, 0x000000047eac0000)
  from space 682624K, 99% used [0x000000047eac0000, 0x00000004a855f8a0, 0x00000004a8560000)
  to   space 682624K,  0% used [0x00000004a8560000, 0x00000004a8560000, 0x00000004d2000000)
 concurrent mark-sweep generation total 12288000K, used 12287999K [0x00000004d2000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace used 37438K, capacity 38438K, committed 38676K, reserved 1083392K
  class space used 4257K, capacity 4521K, committed 4628K, reserved 1048576K
}
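(For reference, the node runs CMS. The JVM options are along these lines; I'm reconstructing them from the shape of the log output rather than pasting our startup script, so treat the exact flags and the log path as an approximation:)

```
-Xms16g -Xmx16g
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
-Xloggc:/var/solr/logs/solr_gc.log
```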
Thank you for your help
Danilo
On 03/12/18 10:36, Bernd Fehling wrote:
Hi Danilo,
you have to give more info about your system and the config.
- 30GB RAM (is that physical RAM?); how much heap do you have for Java?
- how large (in GB) is the raw data for your 40 million documents?
- how large (in GB) is your index with 40 million docs indexed?
- which versions of Solr and Java?
- do you have Java garbage collection logs, and if so, what do they
report?
- any Full GC entries in the GC logs?
Regards, Bernd
On 03/12/18 at 10:09, Danilo Tomasoni wrote:
Hello all,
We have a configuration with a single node with 30GB of RAM.
We use it to index ~40 million documents.
We run queries with the edismax parser that often contain edismax
subqueries, using the syntax
'_query_:{!edismax mm=X v=$subqueryN}'
with X == 1 in most cases.
This works around the "too many boolean clauses" error we got when
expanding the query terms (often phrase queries) directly in the main
query.
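To make the query shape concrete, here is a minimal sketch of how we build such a request; the core name, host, and subquery terms below are made-up illustrations, not our real data:

```python
from urllib.parse import urlencode

# Hypothetical example of the pattern described above: the main query
# references named subqueries through local-param variables ($subquery1,
# $subquery2), so the long phrase lists stay out of the main query string.
params = {
    "q": "_query_:{!edismax mm=1 v=$subquery1} OR _query_:{!edismax mm=1 v=$subquery2}",
    "subquery1": '"heat shock protein" hsp70',
    "subquery2": '"molecular chaperone"',
    "rows": "10",
}
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```

Each $subqueryN can then carry many expanded terms without the main query itself growing.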
Unfortunately, in this scenario Solr often crashes while serving a
query, even with a single query and no other load on the system.
Do you have any idea of what's going on here?
Otherwise:
Which Solr configuration parameters do you think I should investigate
first?
Which log lines should I search for to understand what's going on?
Thank you
Danilo
--
Danilo Tomasoni
COSBI
Pursuant to the European General Data Protection Regulation 2016/679 on the
protection of natural persons with regard to the processing of personal data,
we inform you that all the data we hold are processed in compliance with the
provisions of the cited GDPR.
You have the right to be informed about which of your data are used and how;
you may request their correction or deletion, or object to their use, by
written request sent by recorded delivery to The Microsoft Research –
University of Trento Centre for Computational and Systems Biology Scarl, Piazza
Manifattura 1, 38068 Rovereto (TN), Italy.