More on this, I think I found something...
Slave admin console --> stats.jsp#cache, FieldCache
...
entries count: 22
entry#0 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mp.frq")'=>'title_sort',class
...
entry#9 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mr.frq")'=>'title_sort',class
...
entry#14 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/slave-data-dir/cbt/main/data/index/_mn.frq")'=>'title_sort',class
The data directory (on the slave) doesn't contain any _mn.* or _mp.*
files, but only _mr.*!
> ls -la
drwxr-xr-x 2 agazzarini agazzarini 4096 Aug 13 18:17 .
drwxr-xr-x 3 agazzarini agazzarini 4096 Aug 13 16:32 ..
-rw-r--r-- 1 agazzarini agazzarini 8184675552 Aug 13 16:17 _mr.fdt
-rw-r--r-- 1 agazzarini agazzarini 14019212 Aug 13 16:17 _mr.fdx
-rw-r--r-- 1 agazzarini agazzarini 4957 Aug 13 16:08 _mr.fnm
-rw-r--r-- 1 agazzarini agazzarini 904512239 Aug 13 16:19 _mr.frq
-rw-r--r-- 1 agazzarini agazzarini 340972819 Aug 13 16:19 _mr.prx
-rw-r--r-- 1 agazzarini agazzarini 14154155 Aug 13 16:19 _mr.tii
-rw-r--r-- 1 agazzarini agazzarini 820714274 Aug 13 16:19 _mr.tis
-rw-r--r-- 1 agazzarini agazzarini 3504631 Aug 13 16:20 _mr.tvd
-rw-r--r-- 1 agazzarini agazzarini 288506509 Aug 13 16:20 _mr.tvf
-rw-r--r-- 1 agazzarini agazzarini 28038420 Aug 13 16:20 _mr.tvx
-rw-r--r-- 1 agazzarini agazzarini 20 Aug 13 13:53 segments.gen
-rw-r--r-- 1 agazzarini agazzarini 287 Aug 13 16:20 segments_i
On the master node, I have both the _mp and _mr segments (why? Is it
something like a commit point? If so, why does the SLAVE admin console
show me something about the _mp segment?)
-rw-r--r-- 1 agazzarini agazzarini 8184675552 Aug 13 14:24 _mp.fdt
-rw-r--r-- 1 agazzarini agazzarini 14019212 Aug 13 14:24 _mp.fdx
-rw-r--r-- 1 agazzarini agazzarini 4957 Aug 13 14:18 _mp.fnm
-rw-r--r-- 1 agazzarini agazzarini 904512316 Aug 13 14:26 _mp.frq
-rw-r--r-- 1 agazzarini agazzarini 340972819 Aug 13 14:26 _mp.prx
-rw-r--r-- 1 agazzarini agazzarini 14154155 Aug 13 14:26 _mp.tii
-rw-r--r-- 1 agazzarini agazzarini 820714274 Aug 13 14:26 _mp.tis
-rw-r--r-- 1 agazzarini agazzarini 3504631 Aug 13 14:26 _mp.tvd
-rw-r--r-- 1 agazzarini agazzarini 288506509 Aug 13 14:26 _mp.tvf
-rw-r--r-- 1 agazzarini agazzarini 28038420 Aug 13 14:26 _mp.tvx
-rw-r--r-- 1 agazzarini agazzarini 8184675552 Aug 13 16:17 _mr.fdt
-rw-r--r-- 1 agazzarini agazzarini 14019212 Aug 13 16:17 _mr.fdx
-rw-r--r-- 1 agazzarini agazzarini 4957 Aug 13 16:08 _mr.fnm
-rw-r--r-- 1 agazzarini agazzarini 904512239 Aug 13 16:19 _mr.frq
-rw-r--r-- 1 agazzarini agazzarini 340972819 Aug 13 16:19 _mr.prx
-rw-r--r-- 1 agazzarini agazzarini 14154155 Aug 13 16:19 _mr.tii
-rw-r--r-- 1 agazzarini agazzarini 820714274 Aug 13 16:19 _mr.tis
-rw-r--r-- 1 agazzarini agazzarini 3504631 Aug 13 16:20 _mr.tvd
-rw-r--r-- 1 agazzarini agazzarini 288506509 Aug 13 16:20 _mr.tvf
-rw-r--r-- 1 agazzarini agazzarini 28038420 Aug 13 16:20 _mr.tvx
-rw-r--r-- 1 agazzarini agazzarini 287 Aug 13 14:26 segments_g
-rw-r--r-- 1 agazzarini agazzarini 20 Aug 13 16:20 segments.gen
-rw-r--r-- 1 agazzarini agazzarini 287 Aug 13 16:20 segments_i
If I execute a query sorting by title_sort on the master, on the admin
page (#cache) I see the field cache populated:
entry count 1
entry#0 :
'MMapIndexInput(path="/home/agazzarini/solr-indexes/master-data-dir/cbt/main/data/index/_mr.frq")'=>'title_sort',class
So,
1. _mr.* is the only segment I have on the slave, and I would expect to
find only that one in the slave's field cache.
2. _mp.* is in the data dir of the master, and I see something that
refers to it in the slave admin console. This could be the reason why
title_sort is doubled in memory.
3. _mn.*: no idea what this is. FieldCacheImpl holds a third reference
to the title_sort values for this _mn segment.
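To make the comparison above repeatable, here is a minimal sketch that
extracts the segment names from the FieldCache entries and reports those
with no files in the index directory. The class name, the regular
expression and the shortened /data/index paths are my own assumptions,
based only on the entry format shown above; it does not call any Solr or
Lucene API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaleSegmentCheck {
    // Assumed entry format: MMapIndexInput(path=".../index/_mp.frq")
    // The captured group is the segment prefix, e.g. "_mp".
    private static final Pattern SEGMENT =
        Pattern.compile("/(_[0-9a-z]+)\\.[a-z]+\"\\)");

    // Segment names referenced by the FieldCache entries.
    static Set<String> segmentsInCacheEntries(List<String> entries) {
        Set<String> segments = new TreeSet<String>();
        for (String entry : entries) {
            Matcher m = SEGMENT.matcher(entry);
            if (m.find()) {
                segments.add(m.group(1));
            }
        }
        return segments;
    }

    // Segment names that have files on disk (segments.gen and segments_N are skipped).
    static Set<String> segmentsOnDisk(List<String> fileNames) {
        Set<String> segments = new TreeSet<String>();
        for (String name : fileNames) {
            if (name.startsWith("_") && name.contains(".")) {
                segments.add(name.substring(0, name.indexOf('.')));
            }
        }
        return segments;
    }

    // Segments still referenced by the cache but absent from the directory.
    static Set<String> staleSegments(List<String> entries, List<String> fileNames) {
        Set<String> stale = segmentsInCacheEntries(entries);
        stale.removeAll(segmentsOnDisk(fileNames));
        return stale;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
            "MMapIndexInput(path=\"/data/index/_mp.frq\")",
            "MMapIndexInput(path=\"/data/index/_mr.frq\")",
            "MMapIndexInput(path=\"/data/index/_mn.frq\")");
        List<String> files = Arrays.asList(
            "_mr.fdt", "_mr.frq", "segments.gen", "segments_i");
        System.out.println("Cached but not on disk: " + staleSegments(entries, files));
    }
}
```

With the slave listing above, only _mr would count as being on disk, so
_mp and _mn are the segments worth investigating.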
On 08/13/2013 05:51 PM, Andrea Gazzarini wrote:
Hi,
I'm getting some out-of-memory (heap space) errors from my Solr
instance and, after investigating a little, I found several threads
about sorting behaviour in SOLR.
First, some information about the environment:
- I'm using SOLR 3.6.1 and master / slave architecture with 1 master
and 2 slaves.
- All of them have Xms and Xmx set to 4GB, index is about 10GB for
about 1.800.000 documents.
- Indexes are updated (and therefore replicated) once a day
After the first OOM I opened the corresponding dump in Memory Analyzer
and I found a BIG org.apache.lucene.search.FieldCacheImpl instance
(more than 2GB). I expanded its internal structure and realized that
I had a lot of very long sort field values (book titles which were
composed of title + subtitle + author concatenated). So what did I do?
Basically I reduced the length of that field (it is now composed only
of the first title), so now I have a more limited number of unique values.
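A rough back-of-the-envelope sketch of why the length of the sort values
matters. It assumes the cache keeps one int per document plus one String
per unique value; the 40-byte per-String overhead, the 2 bytes per
character and the two average lengths are my own guesses, not measured
figures. The document and unique-value counts are the ones from this
index.

```java
class SortFieldMemoryEstimate {
    // Very rough estimate of the heap needed to cache one string sort field.
    // perStringOverhead and avgChars are assumptions, not measured values.
    static long estimateBytes(long docs, long uniqueValues,
                              int avgChars, int perStringOverhead) {
        long ordArray = docs * 4L;                                      // one int per document
        long strings = uniqueValues * (perStringOverhead + 2L * avgChars); // 2 bytes per char
        return ordArray + strings;
    }

    public static void main(String[] args) {
        long docs = 1800000L;          // documents in the index
        long uniqueValues = 1432000L;  // unique title_sort values
        int overhead = 40;             // assumed bytes per String object

        // Hypothetical average lengths: long concatenated titles vs. first title only.
        long before = estimateBytes(docs, uniqueValues, 150, overhead);
        long after = estimateBytes(docs, uniqueValues, 30, overhead);

        System.out.println("title + subtitle + author: ~" + before / (1024 * 1024) + " MB");
        System.out.println("first title only:          ~" + after / (1024 * 1024) + " MB");
    }
}
```

The numbers are only indicative, but they show that the String data
dominates and scales with the average value length.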
Then, 5 hours ago:
- I took the production SOLR log and extracted about 20.000 (real)
queries
- I started the master and the slaves and reindexed all documents;
shortly afterwards the index was replicated to the slaves.
- I started solrmeter, which is randomly querying the slaves (using
the extracted queries)
- After two hours the memory consumption peak (in jvisualvm) was about 2GB,
and every 5 minutes or so GC freed about 500MB, consistently.
- I indexed 4000 documents; 10 minutes after replication the whole
memory consumption curve had shifted upwards: 2GB min, 2.6GB max.
- After two hours I indexed another 4000 documents and now I have a
min of 2.6GB and a max of 3.4GB, and it is still slowly growing.
Note that the number of newly indexed documents is small
(4000 out of a total of 1.800.000).
Now, using JConsole I see:
- a PS Eden space which is periodically cleaned (it's responsible for
the wave between the min and the max usage)
- a PS Survivor space which is very low (16MB)
- a PS Old Gen which is set to 2.6GB and it's growing, very slowly but
it's still growing...
Now, the question...
I generated another dump and, as expected, most of the usage is still
in org.apache.lucene.search.FieldCacheImpl. The size is now about
980MB (initially it was more than 2GB), which seems good (at least
better than the initial situation). Most of those 980MB are still
occupied by sort fields.
What I don't understand is how sort fields are loaded in memory.
I mean, I read that in order to optimize sorting, SOLR needs to load
all values of the sort fields; ok, that's fine. But why do I see several
WeakHashMaps that contain different Entry references with the same
sort field (and its values)?
For example, for title_sort (1.432.000 unique values) I have two
(different, it is not the same reference) Entry objects with
- a key "title_sort"
- and a value (org.apache.lucene.search.FieldCache$StringIndex) which
has an int array [1.432.000] and a String array of more or less the same size
So the memory usage (in this case) is doubled. Are sort field values
loaded in memory more than once? How many times?
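To illustrate what I think I am seeing, here is a minimal sketch of a
cache that is weakly keyed by reader, with one entry per sort field
underneath. It does not use Lucene: PerReaderCacheSketch and
FakeSegmentReader are hypothetical names, and the per-reader keying is my
assumption based on the WeakHashMaps in the dump. If the real cache
behaves like this, every reader that is still referenced keeps its own
copy of the title_sort values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class PerReaderCacheSketch {
    // Stand-in for a segment reader; hypothetical, not a Lucene class.
    static final class FakeSegmentReader {
        final String segmentName;
        FakeSegmentReader(String segmentName) { this.segmentName = segmentName; }
    }

    // Outer map: weakly keyed by reader. Inner map: field name -> loaded values.
    private final Map<Object, Map<String, String[]>> readerCache =
        new WeakHashMap<Object, Map<String, String[]>>();

    // Counts how many times field values were actually loaded.
    int loads = 0;

    String[] getSortValues(FakeSegmentReader reader, String field) {
        Map<String, String[]> perReader =
            readerCache.computeIfAbsent(reader, k -> new HashMap<String, String[]>());
        return perReader.computeIfAbsent(field, f -> {
            loads++;                            // a real cache would read the index here
            return new String[] {"placeholder"};
        });
    }

    // Number of readers that currently own cached data.
    int readerEntries() {
        return readerCache.size();
    }

    public static void main(String[] args) {
        PerReaderCacheSketch cache = new PerReaderCacheSketch();
        FakeSegmentReader oldReader = new FakeSegmentReader("_mp");
        FakeSegmentReader newReader = new FakeSegmentReader("_mr");

        cache.getSortValues(oldReader, "title_sort");
        cache.getSortValues(newReader, "title_sort");
        cache.getSortValues(newReader, "title_sort"); // served from the cache, no new load

        System.out.println("loads = " + cache.loads
            + ", reader entries = " + cache.readerEntries());
        // Both readers are still referenced here, so neither entry can be dropped.
        System.out.println(oldReader.segmentName + " / " + newReader.segmentName);
    }
}
```

In this model the entry for an old reader can only go away once nothing
references that reader any more, which would match a stale _mp or _mn
entry if something on the slave still holds the old readers.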
Best, and as usual, sorry for the long email
Andrea