Hi Henrik,

When the IndexWriter is opened, the term dictionary is loaded into RAM,
so memory usage certainly depends on the number of unique terms in the
index.

The entire term dictionary isn't actually loaded, though - just an even
spread of terms that acts as an in-memory index into the on-disk
dictionary.  The :index_skip_interval parameter lets you twiddle this
spread: the higher the skip interval, the less memory is used, but the
slower your searches.

Play with this parameter and see if it improves things for you - if not,
at least you know it's not down to having lots of unique terms.
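
Something like this, if you're opening the index yourself in your drb
server (a rough sketch - the path and the interval value here are just
illustrative, so experiment):

  require 'rubygems'
  require 'ferret'

  # A higher interval means a smaller in-memory term index but slower
  # term lookups. 128 is illustrative - measure a few values.
  index = Ferret::Index::Index.new(
    :path                => '/path/to/index',
    :index_skip_interval => 128
  )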

To be honest, it's probably a long shot, but worth a look.

John.


On Mon, 2007-06-04 at 12:29 +0200, Henrik Zagerholm wrote:
> Hi list,
> 
> We just built our own Ferret DRb server (mostly because we don't do
> any indexing from within Rails).
> 
> The Ferret DRb server only handles index inserts and some deletes.
> Usually we make batch inserts where we retrieve a few hundred or a
> few thousand documents from a database and then insert them into
> Ferret one by one.
> We call flush on every 50th file. We are very impressed with the
> insert speed: 56,000 documents of varying size in 32 minutes.
> 
> When started, the Ferret DRb server takes about 9 MB of RAM, but
> after it's been running for a while doing some indexing it reaches
> about 150 MB, and when indexing is finished it stays around 130 MB.
> We call GC.start manually at the end of every indexing batch.
> 
> The index is now about 2.7 GB.
> 
> Any suggestions on what could be wrong?
> Maybe it's natural for a Ferret DRb server with a 2.7 GB index to
> use that much memory when idle?
> 
> Please let me know if you need any more info.
> 
> Regards,
> Henrik
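
P.S. The batch pattern you describe looks roughly like this (just a
sketch - the field names and the docs variable are assumptions on my
part):

  require 'rubygems'
  require 'ferret'

  index = Ferret::Index::Index.new(:path => '/path/to/index')

  # docs would come from your database; :title and :body are made up.
  docs.each_with_index do |doc, i|
    index << { :title => doc[:title], :body => doc[:body] }
    index.flush if (i + 1) % 50 == 0   # flush every 50th document
  end
  index.flush   # flush whatever remains
  GC.start      # manual GC at the end of the batch, as you described
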
-- 
http://johnleach.co.uk
