Currently IndexWriter.flushRamSegments() always merges all RAM segments to
disk. Later it may merge more segments, depending on the maybe-merge
algorithm. This happens when the index is closed and when the number of
(one-doc) RAM segments exceeds maxBufferedDocs.
Can there be a performance penalty for always merging to disk first?
Assume the following merges take place:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs) into
  _a (N docs)
  merging segments _a (N docs) _6 (M docs) _7 (K docs) _8 (L docs) into
  _b (N+M+K+L docs)
Alternatively, we could detect (compute) that this is going to happen and
perform a single merge instead:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs)
  _6 (M docs) _7 (K docs) _8 (L docs) into _b (N+M+K+L docs)
This would save writing the intermediate segment of size N to disk and then
reading it back for the second merge. For large enough N, is there a real
potential saving here?
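To make the idea concrete, here is a minimal sketch of the "compute that this
is going to happen" check. It is not Lucene code; the class, method names, and
the simplified levelling rule (segments below minMergeDocs * mergeFactor count
as one level, and reaching mergeFactor segments at a level triggers a merge)
are assumptions for illustration only:

```java
import java.util.Arrays;

public class MergeSketch {
    static final int MERGE_FACTOR = 10; // assumed mergeFactor

    // Would flushing the RAM segments produce a disk segment that the
    // maybe-merge pass would immediately merge again? If so, the caller
    // could fold the RAM segments and the disk segments into one merge,
    // skipping the intermediate write/read of the size-N segment.
    static boolean shouldCombineMerge(int ramDocs, int[] diskSegDocs,
                                      int minMergeDocs) {
        // count existing disk segments at the lowest merge level
        long lowLevel = Arrays.stream(diskSegDocs)
                .filter(d -> d < (long) minMergeDocs * MERGE_FACTOR)
                .count();
        // flushing adds one more segment at that level; if that reaches
        // mergeFactor, a cascading merge would follow right away
        return lowLevel + 1 >= MERGE_FACTOR;
    }

    public static void main(String[] args) {
        int[] nine = {100, 100, 100, 100, 100, 100, 100, 100, 100};
        // 9 small disk segments + 1 flushed segment = mergeFactor,
        // so a combined merge would pay off here
        System.out.println(shouldCombineMerge(50, nine, 100));
        // only 2 small disk segments: flush to disk as usual
        System.out.println(shouldCombineMerge(50, new int[]{100, 100}, 100));
    }
}
```

With a check like this, the single combined merge would be chosen only when
the two-step merge is known to happen anyway, so the common case is unchanged.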