16, 2013 9:36 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr using a ridiculous amount of memory
It was interesting to read this post. I had a similar issue on Solr v4.2.1.
The nature of our document is that it has huge multiValued fields and we were
able to knock out our server in about 30mu
It was interesting to read this post. I had a similar issue on Solr v4.2.1. The
nature of our document is that it has huge multiValued fields and we were
able to knock out our server in about 30 mins.
We then found a bug, "LUCENE-4995", which was causing all the problems.
Applying the patch has helped a
John:
If you'd like to add your experience to the Wiki, create
an ID and let us know what it is and we'll add you to the
contributors list. Unfortunately we had problems with
spam pages so we added this step.
Make sure you include your logon in the request.
Thanks,
Erick
On Fri, Jun 14, 2013 at
On Fri, 2013-06-14 at 14:55 +0200, John Nielsen wrote:
> Sorry for not getting back to the list sooner.
Time not important, only feedback important (apologies to Fifth
Element).
> After some major refactoring, our 15 cores have now turned into ~500 cores
> and our memory consumption has dropped d
Sorry for not getting back to the list sooner. It seems like I finally
solved the memory problems by following Toke's instruction of splitting the
cores up into smaller chunks.
After some major refactoring, our 15 cores have now turned into ~500 cores
and our memory consumption has dropped dramati
Hmmm. There has been quite a bit of work lately to support a couple of
things that might be of interest (4.3, which Simon cut today, probably
available to all mid next week at the latest). Basically, you can
choose to pre-define all the cores in solr.xml (so-called "old style")
_or_ use the new-sty
> You are missing an essential part: Both the facet and the sort
> structures need to hold one reference for each document
> _in_the_full_index_, even when the document does not have any values in
> the fields.
>
Wow, thank you for this awesome explanation! This is where the penny
dropped for me.
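
A rough sketch of the point quoted above, in Python. The document count
(1.4M) and the client count (180) are taken from this thread; the idea of
exactly one client-specific facet field per client, the even spread of
documents across clients, and the function name are assumptions made only
for illustration. The sketch counts per-document references, not bits or
bytes.

```python
def reference_slots(total_docs, fields):
    # One reference per document in the whole index for every faceted
    # field, whether or not the document has a value in that field.
    return total_docs * fields


docs = 1_400_000
clients = 180

# Single shared index with one client-specific facet field per client
# (assumption): every field still needs a slot for every document.
shared = reference_slots(docs, clients)

# One core per client: each core holds only that client's documents
# (assumed evenly distributed) and only that client's field.
per_core = clients * reference_slots(docs // clients, 1)

print(shared, per_core, shared // per_core)
```

Under these assumptions the shared index needs roughly as many times more
reference slots as there are clients, which is consistent with the drop in
memory reported after splitting into one core per client.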
On Thu, 2013-04-18 at 11:59 +0200, John Nielsen wrote:
> Yes, that's right. No search from any given client ever returns
> anything from another client.
Great. That makes the 1 core/client solution feasible.
[No sort & facet warmup is performed]
[Suggestion 1: Reduce the number of sort fields by
>
> > http://172.22.51.111:8000/solr/default1_Danish/search
>
> [...]
>
> > &fq=site_guid%3a(10217)
>
> This constrains the hits to a specific customer, right? Any search will
> only be in a single customer's data?
>
Yes, that's right. No search from any given client ever returns anything
from anot
On Thu, 2013-04-18 at 08:34 +0200, John Nielsen wrote:
>
[Toke: Can you find the facet fields in any of the other caches?]
> Yes, here it is, in the field cache:
> http://screencast.com/t/mAwEnA21yL
>
Ah yes, mystery solved, my mistake.
> http://172.22.51.111:8000/solr/default1_Danish/search
> That was strange. As you are using a multi-valued field with the new
> setup, they should appear there.
Yes, the new field we use for faceting is a multi-valued field.
> Can you find the facet fields in any of the other caches?
Yes, here it is, in the field cache:
http://screencast.com/t/mAwEnA
Whoops. I made some mistakes in the previous post.
Toke Eskildsen [t...@statsbiblioteket.dk]:
> Extrapolating from 1.4M documents and 180 clients, let's say that
> there are 1.4M/180/5 unique terms for each sort-field and that their
> average length is 10. We thus have
> 1.4M*log2(1500*10*8) + 1
John Nielsen [j...@mcb.dk]:
> I never seriously looked at my fieldValueCache. It never seemed to get used:
> http://screencast.com/t/YtKw7UQfU
That was strange. As you are using a multi-valued field with the new setup,
they should appear there. Can you find the facet fields in any of the other
> I am surprised about the lack of "UnInverted" from your logs as it is
> logged on INFO level.
Nope, no trace of it. No mention either in Logging -> Level from the admin
interface.
> It should also be available from the admin interface under
> collection/Plugin / Stats/CACHE/fieldValueCache.
I neve
John Nielsen [j...@mcb.dk] wrote:
> I managed to get this done. The facet queries now facet on a multivalued
> field as opposed to the dynamic field names.
> Unfortunately it doesn't seem to have made much difference, if any at all.
I am sorry to hear that.
> documents = ~1.400.000
> references
I managed to get this done. The facet queries now facet on a multivalued
field as opposed to the dynamic field names.
Unfortunately it doesn't seem to have made much difference, if any at all.
Some more information that might help:
The JVM memory seems to be eaten up slowly. I don't think that the
Might be obvious, but just in case - remember that you'll need to
re-index your content once you've added docValues to your schema, in
order to get the on-disk files to be created.
Upayavira
On Mon, Mar 25, 2013, at 03:16 PM, John Nielsen wrote:
> I apologize for the slow reply. Today has been ki
I did a search. I have no occurrence of "UnInverted" in the solr logs.
> Another explanation for the large amount of memory presents itself if
> you use a single index: If each of your clients facet on at least one
> field specific to the client ("client123_persons" or something like
> that), the
On Mon, 2013-04-15 at 10:25 +0200, John Nielsen wrote:
> The FieldCache is the big culprit. We do a huge amount of faceting so
> it seems right.
Yes, you wrote that earlier. The mystery is that the math does not check
out with the description you have given us.
> Unfortunately I am super swamped
Yes and no,
The FieldCache is the big culprit. We do a huge amount of faceting so it
seems right. Unfortunately I am super swamped at work so I have precious
little time to work on this, which is what explains my silence.
Out of desperation, I added another 32G of memory to each server and
increa
On Sun, 2013-03-24 at 09:19 +0100, John Nielsen wrote:
> Our memory requirements are running amok. We have less than a quarter of
> our customers running now and even though we have allocated 25GB to the JVM
> already, we are still seeing daily OOM crashes.
Out of curiosity: Did you manage to pinp
I apologize for the slow reply. Today has been killer. I will reply to
everyone as soon as I get the time.
I am having difficulties understanding how docValues work.
Should I only add docValues to the fields that I actually use for sorting
and faceting or on all fields?
Will the docValues magic
nt: Sunday, March 24, 2013 2:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr using a ridiculous amount of memory
Just to get started, do you hit OOM quickly with a few expensive queries, or
is it after a number of hours and lots of queries?
Does Java heap usage seem to be growing linearly
Toke Eskildsen [t...@statsbiblioteket.dk]:
> If your whole index has 10M documents, which each has 100 values
> for each field, with each field having 50M unique values, then the
> memory requirement would be more than
> 10M*log2(100*10M) + 100*10M*log2(50M) bit ~= 340MB/field ~=
> 1.6GB for face
From: John Nielsen [j...@mcb.dk]:
> The index is about 35GB on disk with each register between 15k and 30k.
> (This is simply the size of a full xml reply of one register. I'm not sure
> how to measure it otherwise.)
> Our memory requirements are running amok. We have less than a quarter of
> our
On Sun, Mar 24, 2013 at 4:19 AM, John Nielsen wrote:
> Schema with DocValues attempt at solving problem:
> http://pastebin.com/Ne23NnW4
> Config: http://pastebin.com/x1qykyXW
>
This schema isn't using docValues, due to a typo in your config.
It should not be DocValues="true" but docValues="true".
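
A minimal sketch, in Python, of how such a typo could be spotted before
re-indexing. It only looks for attributes that spell docValues with the
wrong case; the helper name and the sample schema (including its field
names) are made up for illustration and are not taken from the pastebin
above.

```python
import xml.etree.ElementTree as ET


def miscased_docvalues(schema_xml):
    """Return (tag, field name, attribute) for each attribute that is
    spelled like docValues but with the wrong case."""
    root = ET.fromstring(schema_xml)
    problems = []
    for elem in root.iter():
        for attr in elem.attrib:
            if attr.lower() == "docvalues" and attr != "docValues":
                problems.append((elem.tag, elem.get("name"), attr))
    return problems


# Hypothetical schema fragment: the first field has the mis-cased attribute.
sample = """
<schema name="example" version="1.5">
  <fields>
    <field name="site_guid" type="string" indexed="true" DocValues="true"/>
    <field name="title" type="string" indexed="true" docValues="true"/>
  </fields>
</schema>
"""

print(miscased_docvalues(sample))
```

XML attribute names are case-sensitive, so the check compares the exact
spelling rather than relying on the parser to normalize it.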
Just to get started, do you hit OOM quickly with a few expensive queries, or
is it after a number of hours and lots of queries?
Does Java heap usage seem to be growing linearly as queries come in, or are
there big spikes?
How complex/rich are your queries (e.g., how many terms, wildcards, fac