Have you done any profiling to see where the hotspots are? I realize that may be difficult on an index of that size, but maybe you can approximate on a smaller version. Also, do you have warming queries?

You might also look into setting the termIndexInterval at the Lucene level. This is not currently exposed in Solr (AFAIK), but likely could be added fairly easily as part of the index parameters. http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/index/IndexWriter.html#setTermIndexInterval(int)
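
If someone wants to patch Solr for this, the Lucene-side call itself is straightforward. A rough sketch against the Lucene 2.4 API linked above (the path and interval value are just examples):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class TermIndexIntervalDemo {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(
                    FSDirectory.getDirectory("/path/to/index"), // example path
                    new StandardAnalyzer(), true,
                    IndexWriter.MaxFieldLength.UNLIMITED);
            // Default is 128; a larger interval shrinks the in-memory
            // term index (built from the .tii file) at the cost of
            // slightly slower term lookups.
            writer.setTermIndexInterval(256);
            writer.close();
        }
    }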

-Grant

On May 13, 2009, at 5:12 PM, vivek sar wrote:

Otis,

In that case, I'm not sure why Solr is taking up so much memory as
soon as we start it up. I checked for .tii files and there is only one:

-rw-r--r-- 1 search staff 20306 May 11 21:47 ./20090510_1/data/index/_3au.tii

I have all the caches disabled, so that shouldn't be a problem either.
My ramBuffer size is only 64MB.

I read the note on sorting,
http://wiki.apache.org/solr/SchemaDesign?highlight=(sort), and saw
something related to the FieldCache. I don't see it as a parameter
defined in either solrconfig.xml or schema.xml. Could this be
something that loads things into memory at startup? How can we
disable it?
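
From the Lucene javadocs it looks like the FieldCache isn't something
you configure; it gets populated lazily the first time a field is
sorted on, roughly like this (my rough sketch against the Lucene 2.4
API; the field name is made up):

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.FieldCache;

    public class FieldCacheDemo {
        public static void main(String[] args) throws Exception {
            // What a sort on a string field effectively does on first
            // use: build an array with one entry per document, held
            // for the reader's lifetime. With hundreds of millions of
            // docs this is where the memory goes.
            IndexReader reader = IndexReader.open("/path/to/index");
            String[] vals = FieldCache.DEFAULT.getStrings(reader, "someField");
            System.out.println(vals.length + " cached values");
            reader.close();
        }
    }

If that's right, then simply never sorting or faceting should keep it
empty.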

I'm trying to find out if there is a way to tell how much memory Solr
would consume, and a way to cap it.
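
(The only cap we have today is the JVM heap itself, e.g. starting the
Solr example like:

    java -Xms2g -Xmx6g -jar start.jar

start.jar being the Jetty launcher from the Solr example; your
container may differ. We give Solr 6G this way, but it fills all of
it.)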

Thanks,
-vivek




On Wed, May 13, 2009 at 1:50 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:

Hi,

Sorting is triggered by the sort parameter in the URL, not a characteristic of a field. :)
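
For example (field name made up):

    http://localhost:8983/solr/select?q=*:*&sort=timestamp+desc

If your request URLs never include a sort parameter (and you don't
facet), the FieldCache should stay empty.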

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
From: vivek sar <vivex...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Wednesday, May 13, 2009 4:42:16 PM
Subject: Re: Solr memory requirements?

Thanks Otis.

Our use case doesn't require any sorting or faceting, so I'm wondering
if I've configured something wrong.

I've got a total of 25 fields (15 are indexed and stored, the other 10
are just stored). All my fields are basic data types, which I thought
are not sorted. My id field is the unique key.

Is there any field here that might be getting sorted?


[Schema field definitions mangled in the mail archive: the field names
and types were stripped. What survives shows roughly 25 <field .../>
entries whose attributes are combinations of required="true",
omitNorms="true", compressed="false", default="NOW/HOUR", and one
multiValued="true" field.]
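
(Since the definitions above got mangled in transit: a typical Solr
1.x field definition has this shape; the names and types are
illustrative, not my actual schema:

    <field name="id" type="string" indexed="true" stored="true"
           required="true" omitNorms="true" compressed="false"/>
    <field name="inserted" type="date" indexed="true" stored="true"
           default="NOW/HOUR" omitNorms="true"/>

None of the surviving attributes (required, omitNorms, compressed,
default) causes sorting by itself; sorting only happens at query
time.)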

Thanks,
-vivek

On Wed, May 13, 2009 at 1:10 PM, Otis Gospodnetic
wrote:

Hi,
Some answers:
1) The .tii files in the Lucene index. Also, when you sort, all
distinct values for the field(s) used for sorting get loaded into
memory (the Lucene FieldCache). Similarly for facet fields. Plus
Solr's caches.
2) ramBufferSizeMB dictates, more or less, how much memory Lucene/Solr
will consume during indexing (see the snippet below). There is no need
to commit every 50K docs unless you want to trigger snapshot creation.
3) See 1) above.
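
That setting lives in solrconfig.xml, e.g.:

    <!-- in <indexDefaults>/<mainIndex>: rough upper bound on the RAM
         Lucene buffers for newly added docs before flushing a segment -->
    <ramBufferSizeMB>64</ramBufferSizeMB>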

1.5 billion docs per instance, where each doc is circa 1KB? I doubt
that's going to fly. :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
From: vivek sar
To: solr-user@lucene.apache.org
Sent: Wednesday, May 13, 2009 3:04:46 PM
Subject: Solr memory requirements?

Hi,

I'm pretty sure this has been asked before, but I couldn't find a
complete answer in the forum archives. Here are my questions:

1) When Solr starts up, what does it load into memory? Let's say I
have 4 cores, each 50G in size. When Solr comes up, how much of that
would be loaded into memory?

2) How much memory is required at index time? If I'm committing 50K
records at a time (1 record = 1KB) using SolrJ (see the sketch below),
how much memory do I need to give Solr?

3) Is there a minimum memory requirement for Solr to maintain a
certain size index? Is there any benchmark on this?
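
For 2), this is roughly what our indexing loop looks like (a rough
SolrJ sketch; the URL and field names are made up):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
        public static void main(String[] args) throws Exception {
            SolrServer server =
                    new CommonsHttpSolrServer("http://localhost:8983/solr");
            List<SolrInputDocument> batch =
                    new ArrayList<SolrInputDocument>();
            for (int i = 0; i < 50000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + i);   // unique key
                doc.addField("body", "~1KB of text per record...");
                batch.add(doc);
            }
            server.add(batch);   // server-side RAM use is bounded by
                                 // ramBufferSizeMB, not the batch size
            server.commit();
        }
    }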

Here are some of my configuration settings from solrconfig.xml:

1) ramBufferSizeMB = 64
2) All the caches (under the query tag) are commented out
3) A few others [the XML element names were stripped by the mail
archive; only the values survive]:
      a) true    ==> would this require memory?
      b) 50
      c) 200
      d) [value stripped]
      e) false
      f) 2

The problem we are having is the following:

I've given Solr 6G of RAM. As the total index size (all cores
combined) grows, Solr's memory consumption goes up. With 800 million
documents, I see Solr already taking up all of that memory at startup.
After that, commits, searches, everything becomes slow. We will have a
distributed setup with multiple Solr instances (around 8) on four
boxes, but our requirement is for each Solr instance to maintain at
least around 1.5 billion documents.

We are trying to see if we can somehow reduce the Solr memory
footprint. If someone can provide pointers on which parameters affect
memory and what effects they have, we can then decide whether we want
a given parameter or not. I'm not sure if there is a minimum amount of
memory Solr requires to be able to maintain a large index. I've used
Lucene before and it didn't require anything by default; it used
memory only during indexing and searching, not otherwise.

Any help is very much appreciated.

Thanks,
-vivek





--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
