Re: Good merge settings for interactively maintained index

2014-12-04 Thread Michael McCandless
OK I ran a quick test using Wikipedia docs; net/net I think TieredMergePolicy's (the default) behavior is fine. Once a too-large segment has > 50% deletes it is eligible for merging and will be aggressively merged. To visualize this, I first built a 33.3M doc Wikipedia index (append only), then r
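
A rough illustration of the rule described in the message above: a "too large" segment only becomes a merge candidate again once more than 50% of its docs are deleted. This is a toy model, not Lucene's actual TieredMergePolicy code. The `Segment` class, the `too_large` flag, and the helper names are hypothetical, and the 0.5 threshold is taken from the message rather than from any API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    name: str
    max_doc: int        # total docs in the segment, including deleted ones
    deleted_docs: int   # docs marked deleted but not yet reclaimed
    too_large: bool     # hypothetical flag: segment is over the max merged size

    @property
    def deleted_fraction(self) -> float:
        # Fraction of the segment's docs that are deleted (0.0 for an empty segment).
        return self.deleted_docs / self.max_doc if self.max_doc else 0.0


def eligible_for_merge(seg: Segment, delete_threshold: float = 0.5) -> bool:
    # Assumption from the message: normal-sized segments are always candidates,
    # too-large ones only once they have MORE than 50% deletes.
    if not seg.too_large:
        return True
    return seg.deleted_fraction > delete_threshold


def index_deleted_fraction(segments: Iterable[Segment]) -> float:
    # Overall share of deleted docs across the index, the figure the
    # thread quotes as 25%-40%.
    segs = list(segments)
    total = sum(s.max_doc for s in segs)
    return sum(s.deleted_docs for s in segs) / total if total else 0.0
```

Under this model, a large segment sitting at 40% deletes is skipped, which is one way an index with frequent replacements can hover in the 25%-40% range until the big segments cross the threshold.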

Re: Good merge settings for interactively maintained index

2014-12-04 Thread Michael McCandless
25-40% is definitely "normal" for an index where many docs are being replaced; I've seen this go up to ~65% before large merges bring it back down. On 2) there may be some improvements we can make to Lucene's default TieredMergePolicy here, to reclaim deletes for the "too large" segments ... I'll ha

Re: Good merge settings for interactively maintained index

2014-12-04 Thread Michal Taborsky
Hello Nikolas, we are facing similar behavior. Did you find out anything? Thank you, Michal On Monday, 8 September 2014 22:55:12 UTC+2, Nikolas Everett wrote: > > My indexes change somewhat frequently. If I leave the merge settings > as the default I end up with 25%-40% deleted documents

Good merge settings for interactively maintained index

2014-09-08 Thread Nikolas Everett
My indexes change somewhat frequently. If I leave the merge settings as the default I end up with 25%-40% deleted documents (some indexes higher, some lower). I'm looking for some generic advice on: 1. Is that 25%-40% OK? 2. What kind of settings should I set to keep that in an acceptable r