What about an invariant that says the number of main index segments
with the same level f(n) must be less than M?

That is exactly what the second property says:
"Less than M number of segments whose doc count n satisfies B*(M^c) <=
n < B*(M^(c+1)) for any c >= 0."

In other words, there are fewer than M segments with the same level f(n).
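For concreteness, here is a minimal sketch (not from the patch; the
names are mine) of how a segment's level could be computed from its
doc count n, given B and M:

    // Sketch only: compute the level c such that B*M^c <= n < B*M^(c+1).
    // Segments with fewer than B docs fall below level 0; reporting
    // them as -1 is an assumption on my part.
    static int level(int n, int B, int M) {
        if (n < B) return -1;
        long upper = (long) B * M;   // B * M^(c+1), starting at c = 0
        int c = 0;
        while (n >= upper) {
            upper *= M;
            c++;
        }
        return c;
    }

With B=1000 and M=10, doc counts 1000..9999 map to level 0,
10000..99999 to level 1, and so on; the invariant then says each level
holds fewer than M segments.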


I am concerned about corner cases producing a large number of
segments, slowing search or causing errors due to file descriptor
exhaustion.

When deciding whether to merge, maybe we should count the number of
segments at a particular index level f(n) rather than add up their
document counts.  In the presence of deletions, I think this would
lead to faster indexing, since merges would be triggered less often.
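A sketch of what that check might look like (hypothetical names, and
it reuses the level() helper sketched earlier; deletions are ignored
for brevity):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch: return true when some level already holds M segments,
    // counting segments per level instead of summing doc counts.
    static boolean shouldMerge(int[] segmentDocCounts, int B, int M) {
        Map<Integer, Integer> perLevel = new HashMap<Integer, Integer>();
        for (int n : segmentDocCounts) {
            int lvl = level(n, B, M);
            Integer count = perLevel.get(lvl);
            count = (count == null) ? 1 : count + 1;
            perLevel.put(lvl, count);
            if (count >= M) {
                return true;  // M segments share a level: merge them
            }
        }
        return false;
    }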

Given M, B, and an index which has L (0 < L < M) segments each with
fewer than B docs, how many RAM docs should be accumulated before a
merge is triggered? B is not good. B - sum(L), where sum(L) is the
total doc count of those L segments, is the old strategy, which has
problems. So something between B - sum(L) and B? Once there are M
segments with fewer than B docs, they'll be merged. But what if
L = 0? Should B RAM docs be accumulated before flushing in that case?

In any case, if flushing RAM docs in close() causes the number of
segments with fewer than B docs to reach M, a merge of those segments
should be triggered.
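In sketch form (flushRamDocs() and mergeSegments() are placeholders,
not the actual patch API; docCount is the public field on the old
SegmentInfo class):

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the close-time rule above: after the RAM docs are
    // flushed as a new segment, merge the sub-B segments if there
    // are now M of them.
    void flushAndMaybeMergeOnClose(List<SegmentInfo> segments, int B, int M) {
        flushRamDocs(segments);                   // hypothetical flush
        List<SegmentInfo> small = new ArrayList<SegmentInfo>();
        for (SegmentInfo si : segments) {
            if (si.docCount < B) {
                small.add(si);
            }
        }
        if (small.size() >= M) {
            mergeSegments(small);                 // hypothetical merge
        }
    }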


What is the behavior of your patch under the following scenario?

M=10, B=1000
open writer, add 3 docs, close writer
open writer, add 1000 docs, close writer
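In code, the scenario is roughly this (a sketch against the Lucene
2.x IndexWriter API, assuming mergeFactor and maxBufferedDocs
correspond to M and B; the field contents are illustrative):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    public class MergeScenario {
        public static void main(String[] args) throws Exception {
            Directory dir = new RAMDirectory();

            // open writer, add 3 docs, close writer
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
            writer.setMergeFactor(10);        // M
            writer.setMaxBufferedDocs(1000);  // B
            addDocs(writer, 3);
            writer.close();

            // open writer, add 1000 docs, close writer
            writer = new IndexWriter(dir, new StandardAnalyzer(), false);
            writer.setMergeFactor(10);
            writer.setMaxBufferedDocs(1000);
            addDocs(writer, 1000);
            writer.close();
            // Does the index now hold segments of 3 and 1000 docs?
        }

        private static void addDocs(IndexWriter writer, int count) throws Exception {
            for (int i = 0; i < count; i++) {
                Document doc = new Document();
                doc.add(new Field("content", "doc " + i,
                                  Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }
        }
    }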

Do you avoid the situation of having segments with docs=3 and
docs=1000, in that order (so f(n) increases as segment position
increases, a no-no)?

Currently, it does result in segments with docs=3 and 1000. I'll
modify the patch so that it completely complies with all the index
invariants once an agreement is reached.

Ning
