Re: Solr using a ridiculous amount of memory

2013-06-16 Thread Jack Krupansky
16, 2013 9:36 AM To: solr-user@lucene.apache.org Subject: Re: Solr using a ridiculous amount of memory It was interesting to read this post. I had a similar issue on Solr v4.2.1. The nature of our document is that it has huge multiValued fields and we were able to knock out our server in about 30mu

Re: Solr using a ridiculous amount of memory

2013-06-16 Thread adityab
It was interesting to read this post. I had a similar issue on Solr v4.2.1. The nature of our document is that it has huge multiValued fields and we were able to knock out our server in about 30 mins. We then found a bug, "Lucene-4995", which was causing all the problems. Applying the patch has helped a

Re: Solr using a ridiculous amount of memory

2013-06-16 Thread Erick Erickson
John: If you'd like to add your experience to the Wiki, create an ID and let us know what it is and we'll add you to the contributors list. Unfortunately we had problems with spam pages, so we added this step. Make sure you include your logon in the request. Thanks, Erick On Fri, Jun 14, 2013 at

Re: Solr using a ridiculous amount of memory

2013-06-14 Thread Toke Eskildsen
On Fri, 2013-06-14 at 14:55 +0200, John Nielsen wrote: > Sorry for not getting back to the list sooner. Time not important, only feedback important (apologies to Fifth Element). > After some major refactoring, our 15 cores have now turned into ~500 cores > and our memory consumption has dropped d

Re: Solr using a ridiculous amount of memory

2013-06-14 Thread John Nielsen
Sorry for not getting back to the list sooner. It seems like I finally solved the memory problems by following Toke's instruction of splitting the cores up into smaller chunks. After some major refactoring, our 15 cores have now turned into ~500 cores and our memory consumption has dropped dramati

Re: Solr using a ridiculous amount of memory

2013-04-19 Thread Erick Erickson
Hmmm. There has been quite a bit of work lately to support a couple of things that might be of interest (4.3, which Simon cut today, probably available to all mid next week at the latest). Basically, you can choose to pre-define all the cores in solr.xml (so-called "old style") _or_ use the new-sty

Re: Solr using a ridiculous amount of memory

2013-04-18 Thread John Nielsen
> You are missing an essential part: Both the facet and the sort > structures need to hold one reference for each document > _in_the_full_index_, even when the document does not have any values in > the fields. > Wow, thank you for this awesome explanation! This is where the penny dropped for me.
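To make the point above concrete, here is a rough Python sketch, not anything posted in the thread. It assumes one client-specific facet field per client and an even spread of documents across clients; the 1.4M documents and 180 clients are the figures mentioned elsewhere in this thread, and all function names are hypothetical. It counts references only, not bytes.

```python
def refs_single_index(total_docs, client_fields):
    # In one shared index, every client-specific field holds one
    # reference per document in the full index, even for documents
    # that have no value in that field.
    return total_docs * client_fields


def even_split(total_docs, cores):
    # Assumption: documents are spread as evenly as possible over the cores.
    base, extra = divmod(total_docs, cores)
    return [base + 1] * extra + [base] * (cores - extra)


def refs_per_client_cores(docs_per_core, fields_per_core=1):
    # With one core per client, each field only spans that core's documents.
    return sum(docs * fields_per_core for docs in docs_per_core)


total_docs = 1_400_000
clients = 180

shared = refs_single_index(total_docs, clients)
split = refs_per_client_cores(even_split(total_docs, clients))

print(f"shared index: {shared:,} references")
print(f"per-client cores: {split:,} references")
print(f"reduction factor: {shared // split}")
```

Under these assumptions the reference count shrinks by a factor equal to the number of clients, which is in line with the suggestion to split into one core per client.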

Re: Solr using a ridiculous amount of memory

2013-04-18 Thread Toke Eskildsen
On Thu, 2013-04-18 at 11:59 +0200, John Nielsen wrote: > Yes, that's right. No search from any given client ever returns > anything from another client. Great. That makes the 1 core/client solution feasible. [No sort & facet warmup is performed] [Suggestion 1: Reduce the number of sort fields by

Re: Solr using a ridiculous amount of memory

2013-04-18 Thread John Nielsen
> > > http://172.22.51.111:8000/solr/default1_Danish/search > > [...] > > > &fq=site_guid%3a(10217) > > This constrains the hits to a specific customer, right? Any search will > only be in a single customer's data? > Yes, that's right. No search from any given client ever returns anything from anot

Re: Solr using a ridiculous amount of memory

2013-04-18 Thread Toke Eskildsen
On Thu, 2013-04-18 at 08:34 +0200, John Nielsen wrote: > [Toke: Can you find the facet fields in any of the other caches?] > Yes, here it is, in the field cache: > http://screencast.com/t/mAwEnA21yL > Ah yes, mystery solved, my mistake. > http://172.22.51.111:8000/solr/default1_Danish/search

Re: Solr using a ridiculous amount of memory

2013-04-17 Thread John Nielsen
> That was strange. As you are using a multi-valued field with the new setup, they should appear there. Yes, the new field we use for faceting is a multi valued field. > Can you find the facet fields in any of the other caches? Yes, here it is, in the field cache: http://screencast.com/t/mAwEnA

RE: Solr using a ridiculous amount of memory

2013-04-17 Thread Toke Eskildsen
Whoops. I made some mistakes in the previous post. Toke Eskildsen [t...@statsbiblioteket.dk]: > Extrapolating from 1.4M documents and 180 clients, let's say that > there are 1.4M/180/5 unique terms for each sort-field and that their > average length is 10. We thus have > 1.4M*log2(1500*10*8) + 1

RE: Solr using a ridiculous amount of memory

2013-04-17 Thread Toke Eskildsen
John Nielsen [j...@mcb.dk]: > I never seriously looked at my fieldValueCache. It never seemed to get used: > http://screencast.com/t/YtKw7UQfU That was strange. As you are using a multi-valued field with the new setup, they should appear there. Can you find the facet fields in any of the other

Re: Solr using a ridiculous amount of memory

2013-04-17 Thread John Nielsen
> I am surprised about the lack of "UnInverted" from your logs as it is logged on INFO level. Nope, no trace of it. No mention either in Logging -> Level from the admin interface. > It should also be available from the admin interface under collection/Plugin / Stats/CACHE/fieldValueCache. I neve

RE: Solr using a ridiculous amount of memory

2013-04-17 Thread Toke Eskildsen
John Nielsen [j...@mcb.dk] wrote: > I managed to get this done. The facet queries now facet on a multivalue > field as opposed to the dynamic field names. > Unfortunately it doesn't seem to have made much difference, if any at all. I am sorry to hear that. > documents = ~1.400.000 > references

Re: Solr using a ridiculous amount of memory

2013-04-17 Thread John Nielsen
I managed to get this done. The facet queries now facet on a multivalue field as opposed to the dynamic field names. Unfortunately it doesn't seem to have made much difference, if any at all. Some more information that might help: The JVM memory seems to be eaten up slowly. I don't think that the

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Upayavira
Might be obvious, but just in case - remember that you'll need to re-index your content once you've added docValues to your schema, in order to get the on-disk files to be created. Upayavira On Mon, Mar 25, 2013, at 03:16 PM, John Nielsen wrote: > I apologize for the slow reply. Today has been ki

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread John Nielsen
I did a search. I have no occurrence of "UnInverted" in the solr logs. > Another explanation for the large amount of memory presents itself if > you use a single index: If each of your clients facets on at least one > field specific to the client ("client123_persons" or something like > that), the

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Toke Eskildsen
On Mon, 2013-04-15 at 10:25 +0200, John Nielsen wrote: > The FieldCache is the big culprit. We do a huge amount of faceting so > it seems right. Yes, you wrote that earlier. The mystery is that the math does not check out with the description you have given us. > Unfortunately I am super swamped

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread John Nielsen
Yes and no. The FieldCache is the big culprit. We do a huge amount of faceting so it seems right. Unfortunately I am super swamped at work so I have precious little time to work on this, which is what explains my silence. Out of desperation, I added another 32G of memory to each server and increa

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Toke Eskildsen
On Sun, 2013-03-24 at 09:19 +0100, John Nielsen wrote: > Our memory requirements are running amok. We have less than a quarter of > our customers running now and even though we have allocated 25GB to the JVM > already, we are still seeing daily OOM crashes. Out of curiosity: Did you manage to pinp

Re: Solr using a ridiculous amount of memory

2013-03-25 Thread John Nielsen
I apologize for the slow reply. Today has been killer. I will reply to everyone as soon as I get the time. I am having difficulties understanding how docValues work. Should I only add docValues to the fields that I actually use for sorting and faceting or on all fields? Will the docValues magic

Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Jack Krupansky
nt: Sunday, March 24, 2013 2:00 PM To: solr-user@lucene.apache.org Subject: Re: Solr using a ridiculous amount of memory Just to get started, do you hit OOM quickly with a few expensive queries, or is it after a number of hours and lots of queries? Does Java heap usage seem to be growing linearly

RE: Solr using a ridiculous amount of memory

2013-03-24 Thread Toke Eskildsen
Toke Eskildsen [t...@statsbiblioteket.dk]: > If your whole index has 10M documents, which each has 100 values > for each field, with each field having 50M unique values, then the > memory requirement would be more than > 10M*log2(100*10M) + 100*10M*log2(50M) bit ~= 340MB/field ~= > 1.6GB for face
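For readers who want to plug in their own numbers, the quoted expression can be evaluated with a few lines of Python. This is only a sketch: the function and parameter names are hypothetical, the comments on what each term represents are one reading of the formula, and because the message is cut off the sketch evaluates the expression exactly as written without trying to reconcile it with the rounded figures in the quote.

```python
import math


def facet_memory_bits(docs, values_per_doc, unique_values):
    """Evaluate docs*log2(values_per_doc*docs)
    + values_per_doc*docs*log2(unique_values), in bits."""
    # First term (as read here): one pointer per document into the
    # list of all value references.
    doc_pointers = docs * math.log2(values_per_doc * docs)
    # Second term (as read here): one ordinal per value reference,
    # wide enough to address every unique value.
    value_refs = values_per_doc * docs * math.log2(unique_values)
    return doc_pointers + value_refs


# Figures from the quoted example: 10M documents, 100 values per
# document per field, 50M unique values per field.
bits = facet_memory_bits(10_000_000, 100, 50_000_000)
gigabytes = bits / 8 / 1e9  # decimal gigabytes
print(f"{gigabytes:.2f} GB per field")
```

The formula is described in the quote as a lower bound ("more than"), so actual heap usage would be expected to be higher.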

RE: Solr using a ridiculous amount of memory

2013-03-24 Thread Toke Eskildsen
From: John Nielsen [j...@mcb.dk]: > The index is about 35GB on disk with each register between 15k and 30k. > (This is simply the size of a full xml reply of one register. I'm not sure > how to measure it otherwise.) > Our memory requirements are running amok. We have less than a quarter of > our

Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Robert Muir
On Sun, Mar 24, 2013 at 4:19 AM, John Nielsen wrote: > Schema with DocValues attempt at solving problem: > http://pastebin.com/Ne23NnW4 > Config: http://pastebin.com/x1qykyXW > This schema isn't using docvalues, due to a typo in your config. It should not be DocValues="true" but docValues="true"

Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Jack Krupansky
Just to get started, do you hit OOM quickly with a few expensive queries, or is it after a number of hours and lots of queries? Does Java heap usage seem to be growing linearly as queries come in, or are there big spikes? How complex/rich are your queries (e.g., how many terms, wildcards, fac