I've been building a large index (hundreds of millions) of mainly structured data, consisting of several fields with mostly unique values. I've been hitting out-of-memory errors during periodic commits/closes, which I suspect is down to the sheer number of terms.
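For what it's worth, my rough mental model (which may well be wrong, so please correct me): the term dictionary index (the .tii file) is loaded wholesale into RAM for every open segment, one entry per termIndexInterval terms, so something like:

    resident index entries per segment ~= unique terms / termIndexInterval

e.g. 500M mostly-unique terms (an illustrative figure) at the default interval of 128 would mean roughly 3.9M resident entries per segment, each carrying the term text plus its TermInfo.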
I set IndexWriter.setTermIndexInterval to 8 times the default of 128, i.e. an interval of 1024 (snippet at the end of this mail), which delayed the onset of the problem but it still failed eventually. I'd like to get a little more scientific about what to set here rather than simply experimenting with values and hoping it doesn't fail again. Does anyone have a decent model worked out for how much memory is consumed at peak? I'm guessing the contributing factors are:

* Number of fields
* Number of unique terms per field
* Number of segments?
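For reference, this is roughly how I'm setting the interval. Treat it as a sketch: it assumes the 2.9-era IndexWriter API (in 3.1+ the setting lives on IndexWriterConfig instead), and the index path and class name are placeholders.

    import java.io.File;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class BigIndexWriter {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(
                    FSDirectory.open(new File("/path/to/big-index")),  // placeholder path
                    new StandardAnalyzer(),
                    IndexWriter.MaxFieldLength.UNLIMITED);

            // Default interval is 128: readers hold every 128th term of each
            // segment's dictionary in RAM. 1024 keeps 1/8th as many terms
            // resident, trading term-lookup speed for heap.
            writer.setTermIndexInterval(1024);

            // ... addDocument() loop with periodic commits, then close ...
            writer.close();
        }
    }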
Cheers,

Mark