> EG if you set maxBufferedDocs to say 10000 but then it turns out based
> on RAM usage you actually flush every 300 docs then the merge policy
> will incorrectly merge a level 1 segment (with 3000 docs) in with the
> level 0 segments (with 300 docs). This is because the merge policy
> looks at the current value of maxBufferedDocs to compute the levels
> so a 3000 doc segment and a 300 doc segment all look like "level 0".
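If I'm reading that right, the level computation you're describing works
out to something like this (a rough sketch of my understanding, not the
actual merge policy code; the class and method names are made up):

    // Sketch only: bucket a segment into a level from its doc count,
    // maxBufferedDocs and mergeFactor.
    public class LevelSketch {
        static int level(int docCount, int maxBufferedDocs, int mergeFactor) {
            int level = 0;
            long upperBound = maxBufferedDocs;   // level 0 holds segments below maxBufferedDocs
            while (docCount >= upperBound) {
                level++;
                upperBound *= mergeFactor;       // each higher level spans another factor of mergeFactor
            }
            return level;
        }

        public static void main(String[] args) {
            // maxBufferedDocs=10000, mergeFactor=10, but flushes actually happen at ~300 docs:
            System.out.println(level(300, 10000, 10));    // 0 -- a freshly flushed segment
            System.out.println(level(3000, 10000, 10));   // 0 -- the merged 3000 doc segment looks the same
        }
    }

So with maxBufferedDocs=10000 both the 300 doc and the 3000 doc segments
fall into the same bucket, as you say.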
Are you calling the 3K segment a level 1 segment because it was created
from level 0 segments? Based on size, it is a level 0 segment, right?
With the current merge policy, you can merge level n segments and get
another level n segment. Deletes will do this, as will other things like
changing merge policy parameters and combining indexes.

That leads to the question of what counts as "over merging". The current
merge policy doesn't consider the size of the result; it simply counts
the number of segments at a level. Do you think this qualifies as over
merging? It should still only merge when there are mergeFactor segments
at a level, so you shouldn't be doing too much merging. And you have to
be careful not to do less, right? By bounding the number of segments at
each level, you ensure that your file descriptor usage grows only
logarithmically.
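Back-of-the-envelope, that bound looks something like this (my own
arithmetic, just to illustrate the logarithmic growth; not taken from the
merge policy source):

    // Sketch only: if each level keeps at most mergeFactor-1 segments
    // before a merge fires, the segment count -- and the file descriptors
    // behind it -- grows with the log of the index size, not linearly.
    public class SegmentBoundSketch {
        public static void main(String[] args) {
            int mergeFactor = 10;
            int maxBufferedDocs = 10000;
            long totalDocs = 100000000L;   // 100M docs

            // count how many levels are needed to cover totalDocs
            int levels = 1;
            long levelSpan = maxBufferedDocs;
            while (levelSpan < totalDocs) {
                levelSpan *= mergeFactor;
                levels++;
            }

            int worstCaseSegments = (mergeFactor - 1) * levels;
            System.out.println(levels + " levels, at most " + worstCaseSegments + " segments");
            // prints "5 levels, at most 45 segments" for 100M docs, versus
            // ~10,000 segments if the 10K doc flushes were never merged
        }
    }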
