I've been building a large index (hundreds of millions) of mainly structured data, consisting of several fields with mostly unique values. I've been hitting out-of-memory errors during periodic commits/closes, which I suspect is down to the sheer number of terms.
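For what it's worth, my rough mental model (which may well be wrong, so please correct me): the term dictionary index (the .tii file) is loaded wholesale into RAM for every open segment, one entry per termIndexInterval terms, so something like:

    resident index entries per segment ~= unique terms / termIndexInterval

e.g. 500M mostly-unique terms (an illustrative figure) at the default interval of 128 would mean roughly 3.9M resident entries per segment, each carrying the term text plus its TermInfo.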
I set IndexWriter.setTermIndexInterval to 8 times the default of 128, i.e. an interval of 1024 (snippet at the end of this mail), which delayed the onset of the problem but it still failed eventually. I'd like to get a little more scientific about what to set here rather than simply experimenting with values and hoping it doesn't fail again. Does anyone have a decent model worked out for how much memory is consumed at peak? I'm guessing the contributing factors are:

* Number of fields
* Number of unique terms per field
* Number of segments?
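For reference, this is roughly how I'm setting the interval. Treat it as a sketch: it assumes the 2.9-era IndexWriter API (in 3.1+ the setting lives on IndexWriterConfig instead), and the index path and class name are placeholders.

    import java.io.File;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class BigIndexWriter {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(
                    FSDirectory.open(new File("/path/to/big-index")),  // placeholder path
                    new StandardAnalyzer(),
                    IndexWriter.MaxFieldLength.UNLIMITED);

            // Default interval is 128: readers hold every 128th term of each
            // segment's dictionary in RAM. 1024 keeps 1/8th as many terms
            // resident, trading term-lookup speed for heap.
            writer.setTermIndexInterval(1024);

            // ... addDocument() loop with periodic commits, then close ...
            writer.close();
        }
    }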
Cheers,

Mark