Bernd, in our case, optimizing the index seems to flush the FieldCache for some reason. On the other hand, doing a few commits without optimizing seems to make the problem worse.
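The flush-on-optimize behavior makes sense if, as I understand it, Lucene keys FieldCache entries by IndexReader: an optimize swaps in a new reader, so the old reader's entries become unreachable while the new reader repopulates from scratch, whereas commits without a purge of old readers let entries pile up. A toy model of that keying (plain Java, illustrative only — this is not Lucene's actual implementation, and all names here are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a cache keyed by reader identity, the way the Lucene
// FieldCache reportedly is. Each "reader" object gets its own entry
// map; a commit/reopen produces a new reader key, so entries
// accumulate until the old reader is closed and its entries released.
public class ReaderKeyedCacheModel {
    private final Map<Object, Map<String, int[]>> cache = new HashMap<>();

    public int[] getInts(Object readerKey, String field, int maxDoc) {
        return cache
            .computeIfAbsent(readerKey, k -> new HashMap<>())
            .computeIfAbsent(field, f -> new int[maxDoc]); // populated lazily on first sort/facet
    }

    public void readerClosed(Object readerKey) {
        cache.remove(readerKey); // in Lucene this happens via weak keys / close hooks
    }

    public int readerCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        ReaderKeyedCacheModel c = new ReaderKeyedCacheModel();
        Object oldReader = new Object();
        c.getInts(oldReader, "f_dctitle", 1000);
        Object newReader = new Object();     // commit/reopen -> new reader key
        c.getInts(newReader, "f_dctitle", 1000);
        System.out.println(c.readerCount()); // both readers cached at once
        c.readerClosed(oldReader);           // optimize closes the old reader
        System.out.println(c.readerCount());
    }
}
```

In this model, several commits without releasing old readers multiply the footprint, which matches the "commits make it worse" observation.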
Hope that helps. We would like to give it a try and debug this in Lucene, but we are pressed for time right now; perhaps later next week we will.

Best,
Santiago

On Fri, Jul 22, 2011 at 4:01 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:

> The current status of my installation is that with some tweaking of
> Java I get a runtime of about 2 weeks until OldGen (14GB) is filled
> to 100 percent and won't free anything, even with a full GC.
> The fieldCache's share of a heap dump taken at that point is over 80
> percent of the whole heap (20GB), and that is what eats up all of
> OldGen until OOM.
> Next week I will start with Tomcat 6.x to see how that one behaves, but
> there isn't much hope: it is just a different container, which wouldn't
> change anything about how Lucene eats up memory with the fieldCache.
>
> After digging through all the code, logging, and debugging, I can say
> that it seems not to be a memory leak.
>
> Solr uses Lucene's fieldCache under the hood of the servlet container.
> The fieldCache grows until everything cacheable is in memory or OOM
> is reached, whichever comes first.
>
> The description says: "Provides introspection of the Lucene FieldCache,
> this is **NOT** a cache that is managed by Solr."
> So it seems to be a Lucene problem.
>
> As a matter of fact, and due to this limitation, Solr can't be used
> with a single huge index. I don't know how other applications that use
> Lucene and its fieldCache (and there are a lot of them) handle this
> and how they keep the size of the fieldCache under control.
> And I currently don't know how to calculate the limit.
> Say, for example: the combined size of the *.tii and *.tis files in
> the index should be the -Xmx size of your Java heap to be safe from
> fieldCache OOM.
>
> Maybe an expert can give more detailed info about the fieldCache and
> its possible maximum size.
>
> Some data about our index:
> -rw-r--r-- 1 solr users 84448291214 19. Jul 10:43 _12jl.fdt
> -rw-r--r-- 1 solr users   236458468 19. Jul 10:43 _12jl.fdx
> -rw-r--r-- 1 solr users        1208 19. Jul 10:30 _12jl.fnm
> -rw-r--r-- 1 solr users 19950615826 19. Jul 11:20 _12jl.frq
> -rw-r--r-- 1 solr users   532031548 19. Jul 11:20 _12jl.nrm
> -rw-r--r-- 1 solr users 20616887682 19. Jul 11:20 _12jl.prx
> -rw-r--r-- 1 solr users   291149087 19. Jul 11:20 _12jl.tii
> -rw-r--r-- 1 solr users 30850743727 19. Jul 11:20 _12jl.tis
> -rw-r--r-- 1 solr users          20  9. Jun 11:11 segments.gen
> -rw-r--r-- 1 solr users         274 19. Jul 11:20 segments_pl
> Size: 146,15 GB
> Docs: 29.557.308
>
> Regards,
> Bernd
>
> On 22.07.2011 00:10, Santiago Bazerque wrote:
>
>> Hello Erick,
>>
>> I have a 1.7MM-document, 3.6GB index. I also have an unusual number of
>> dynamic fields that I use for sorting. My FieldCache currently has
>> about 13.000 entries, even though my index only serves 1-3 queries per
>> second. Each query sorts by two dynamic fields and facets on 3-4
>> fields that are fixed. These latter fields are always in the field
>> cache; what I find suspicious is the other ~13.000 entries that are
>> sitting there.
>>
>> I am using a 32GB heap, and I am seeing periodic OOM errors (I didn't
>> spot a regular pattern as Bernd did, but I haven't increased RAM as
>> methodically as he has).
>>
>> If you need any more info, I'll be glad to post it to the list.
>>
>> Best,
>> Santiago
>>
>> On Fri, Jun 17, 2011 at 9:13 AM, Erick Erickson
>> <erickerickson@gmail.com> wrote:
>>
>>> Sorry, it was late last night when I typed that...
>>>
>>> Basically, if you sort and facet on #all# the fields you mentioned,
>>> it should populate the cache in one go. If the problem is that you
>>> just have too many unique terms for all those operations, then it
>>> should go bOOM.
>>>
>>> But, frankly, that's unlikely; I'm just suggesting that to be sure
>>> the easy case isn't the problem. Take a memory snapshot at that point
>>> just to see; it should be a high-water mark.
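Bernd's rule of thumb above (heap at least as large as *.tii plus *.tis) can be checked directly against his index listing. The helper class below is illustrative; the heuristic is his own suggestion, not an official Lucene sizing formula, and the file sizes are copied verbatim from the listing:

```java
// Back-of-envelope check of the ".tii + .tis should fit in -Xmx" rule
// of thumb from the mail above, using the sizes from Bernd's listing.
public class TermDictHeuristic {
    static long requiredHeapBytes(long tiiBytes, long tisBytes) {
        return tiiBytes + tisBytes;
    }

    public static void main(String[] args) {
        long tii = 291_149_087L;             // _12jl.tii
        long tis = 30_850_743_727L;          // _12jl.tis
        long need = requiredHeapBytes(tii, tis);
        long xmx = 20L * 1024 * 1024 * 1024; // the 20g heap mentioned above
        double gib = 1024.0 * 1024 * 1024;
        System.out.printf("need %.1f GiB, have %.1f GiB -> %s%n",
                need / gib, xmx / gib,
                need > xmx ? "expect OOM" : "ok");
    }
}
```

By this (admittedly crude) measure, the index needs roughly 29 GiB against a 20 GiB heap, which is at least consistent with the slow climb to OOM described above.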
>>> The fact that you increase the heap and can then run for longer is
>>> extremely suspicious, and really smells like a memory issue, so we'd
>>> like to pursue it.
>>>
>>> I'd be really interested if anyone else is seeing anything similar;
>>> these are the scary ones...
>>>
>>> Best
>>> Erick
>>>
>>> On Fri, Jun 17, 2011 at 3:09 AM, Bernd Fehling
>>> <bernd.fehling@uni-bielefeld.de> wrote:
>>>
>>>> Hi Erick,
>>>> I will take some memory snapshots during the next week,
>>>> but how can it be that I get OOMs with one query?
>>>>
>>>> - I started with 6g for the JVM --> 1 day until OOM.
>>>> - increased to 8g --> 2 days until OOM
>>>> - increased to 10g --> 3.5 days until OOM
>>>> - increased to 16g --> 5 days until OOM
>>>> - currently 20g --> about 7 days until OOM
>>>>
>>>> Starting the system takes about 3.5g, and it goes up to about 4g
>>>> after a while.
>>>>
>>>> The only dirty workaround so far is to restart the whole system
>>>> after 5 days. Not really nice.
>>>>
>>>> The problem seems to be the fieldCache, which is under the hood of
>>>> jetty. Do you know of any sizing features for the fieldCache to
>>>> limit its memory consumption?
>>>>
>>>> Regards
>>>> Bernd
>>>>
>>>> On 17.06.2011 03:37, Erick Erickson wrote:
>>>>
>>>>> Well, if my theory is right, you should be able to generate OOMs
>>>>> at will by sorting and faceting on all your fields in one query.
>>>>>
>>>>> But Lucene's cache should be garbage collected; can you take some
>>>>> memory snapshots during the week? It should hit a point and stay
>>>>> steady there.
>>>>>
>>>>> How much memory are you giving your JVM? It looks like a lot,
>>>>> given your memory snapshot.
>>>>>
>>>>> Best
>>>>> Erick
>>>>>
>>>>> On Thu, Jun 16, 2011 at 3:01 AM, Bernd Fehling
>>>>> <bernd.fehling@uni-bielefeld.de> wrote:
>>>>>
>>>>>> Hi Erick,
>>>>>>
>>>>>> yes, I'm sorting and faceting.
>>>>>> 1) Fields for sorting:
>>>>>> sort=f_dccreator_sort, sort=f_dctitle, sort=f_dcyear
>>>>>> The parameter "facet.sort=" is empty; we only use the parameter
>>>>>> "sort=".
>>>>>>
>>>>>> 2) Fields for faceting:
>>>>>> f_dcperson, f_dcsubject, f_dcyear, f_dccollection, f_dclang,
>>>>>> f_dctypenorm, f_dccontenttype
>>>>>> Other faceting parameters:
>>>>>> ...&facet=true&facet.mincount=1&facet.limit=100&facet.sort=&facet.prefix=&...
>>>>>>
>>>>>> 3) The LukeRequestHandler takes too long for my huge index, so
>>>>>> this is from the standalone luke (compiled for solr3.2):
>>>>>> f_dccreator_sort = 10.029.196
>>>>>> f_dctitle        = 21.514.939
>>>>>> f_dcyear         =      1.471
>>>>>> f_dcperson       = 14.138.165
>>>>>> f_dcsubject      =  8.012.319
>>>>>> f_dccollection   =      1.863
>>>>>> f_dclang         =        299
>>>>>> f_dctypenorm     =         14
>>>>>> f_dccontenttype  =        497
>>>>>>
>>>>>> numDocs:  28.940.964
>>>>>> numTerms: 686.813.235
>>>>>> optimized: true
>>>>>> hasDeletions: false
>>>>>>
>>>>>> What can you read/calculate from these values?
>>>>>>
>>>>>> Is my index too big for Lucene/Solr?
>>>>>>
>>>>>> What I don't understand is why the fieldCache is not garbage
>>>>>> collected and therefore reduced in size from time to time.
>>>>>>
>>>>>> Regards
>>>>>> Bernd
>>>>>>
>>>>>> On 15.06.2011 17:50, Erick Erickson wrote:
>>>>>>
>>>>>>> The first question I have is whether you're sorting and/or
>>>>>>> faceting on many unique string values? I'm guessing
>>>>>>> that sometimes you are. So, some questions to help
>>>>>>> pin it down:
>>>>>>> 1> what fields are you sorting on?
>>>>>>> 2> what fields are you faceting on?
>>>>>>> 3> how many unique terms are in each (see the solr admin page)?
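Those unique-term counts can be turned into a rough per-field estimate. In Lucene 3.x a FieldCache StringIndex holds an int order array of maxDoc entries plus one String per unique term; the sketch below uses that layout, but the ~40-byte per-String overhead and the average term length are assumptions for illustration, not measured values:

```java
// Rough memory estimate for one FieldCache StringIndex entry:
//   int[maxDoc] order array  +  String[] of unique terms.
// The 40-byte String overhead and 20-char average term length are
// guesses, not measurements.
public class StringIndexEstimate {
    static long estimateBytes(long maxDoc, long uniqueTerms, long avgTermChars) {
        long ords = maxDoc * 4;                                // one int per document
        long strings = uniqueTerms * (40 + 2 * avgTermChars);  // object overhead + UTF-16 chars
        return ords + strings;
    }

    public static void main(String[] args) {
        long maxDoc = 28_940_964L;  // numDocs from the luke output above
        String[] names = { "f_dccreator_sort", "f_dctitle", "f_dcyear" };
        long[] terms   = { 10_029_196L, 21_514_939L, 1_471L };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-17s ~%.2f GB%n", names[i],
                    estimateBytes(maxDoc, terms[i], 20) / 1e9);
        }
    }
}
```

Under these assumptions, f_dctitle alone comes to roughly 1.8 GB and f_dccreator_sort to roughly 0.9 GB, so a handful of high-cardinality sort and facet fields plausibly accounts for a double-digit-GB fieldCache on this index.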
>>>>>>> Best
>>>>>>> Erick
>>>>>>>
>>>>>>> On Wed, Jun 15, 2011 at 8:22 AM, Bernd Fehling
>>>>>>> <bernd.fehling@uni-bielefeld.de> wrote:
>>>>>>>
>>>>>>>> Dear list,
>>>>>>>>
>>>>>>>> after getting an OOM exception after one week of operation with
>>>>>>>> solr 3.2, I used MemoryAnalyzer on the heap dump file.
>>>>>>>> It looks like the fieldCache eats up all memory.
>>>>>>>>
>>>>>>>> Objects / Shallow Heap / Retained Heap:
>>>>>>>> org.apache.lucene.search.FieldCache                          0               0  >= 14,636,950,632
>>>>>>>> org.apache.lucene.search.FieldCacheImpl                      1              32  >= 14,636,950,384
>>>>>>>> org.apache.lucene.search.FieldCacheImpl$StringIndexCache     1              32  >= 14,636,947,080
>>>>>>>> org.apache.lucene.search.FieldCache$StringIndex             10             320  >= 14,636,944,352
>>>>>>>> java.lang.String[]                                         519     567,811,040  >= 13,503,733,312
>>>>>>>> char[]                                              81,766,595  11,604,293,712  >= 11,604,293,712
>>>>>>>>
>>>>>>>> The fieldCache retains over 14g of heap.
>>>>>>>>
>>>>>>>> On the stats page, the description under fieldCache says:
>>>>>>>> "Provides introspection of the Lucene FieldCache, this is
>>>>>>>> **NOT** a cache that is managed by Solr."
>>>>>>>>
>>>>>>>> So is this a jetty problem and not solr?
>>>>>>>>
>>>>>>>> Why is the fieldCache growing and growing until OOM?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Bernd
>
> --
> *****************************************************************
> Bernd Fehling                    Universitätsbibliothek Bielefeld
> Dipl.-Inform. (FH)               Universitätsstr. 25
> Tel. +49 521 106-4060 Fax. +49 521 106-4052
> bernd.fehl...@uni-bielefeld.de   33615 Bielefeld
>
> BASE - Bielefeld Academic Search Engine - www.base-search.net
> *****************************************************************
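Putting Santiago's numbers from the top of the thread together: with about 1.7MM documents, even the order arrays alone put a floor under what roughly 13.000 StringIndex entries could retain. The sketch below assumes every entry is a full StringIndex, which almost certainly overstates the real footprint (many entries may be small or of other types), but it shows why thousands of dynamic sort fields are dangerous:

```java
// Worst-case floor for ~13,000 FieldCache entries over a 1.7MM-document
// index: the int order array alone costs maxDoc * 4 bytes per entry,
// ignoring the term strings entirely. Assuming every entry is a full
// StringIndex is an overstatement, used here only as an upper bound.
public class DynamicFieldFloor {
    static long floorBytes(long maxDoc, long entries) {
        return maxDoc * 4 * entries; // 4-byte ord per doc, per cached field
    }

    public static void main(String[] args) {
        long bytes = floorBytes(1_700_000L, 13_000L);
        System.out.printf("~%.1f GB just for ord arrays%n", bytes / 1e9);
    }
}
```

Even a fraction of that worst-case 88 GB would exhaust a 32GB heap, so the ~13.000 entries for dynamic sort fields look like a sufficient explanation for the periodic OOMs, independent of any leak.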