Lokesh Bajaj wrote:
For a very large index where we might want to delete/replace some documents, 
this would require a lot of memory (for 100 million documents, this would need 
381 MB of memory). Is there any reason why this was implemented this way?

In practice this has not been an issue. A single index with 100M documents is usually quite slow to search. When collections get this big folks tend to instead search multiple indexes in parallel in order to keep response times acceptable. Also, 381 MB of RAM is often not a problem for folks with 100M documents. But this is not to say that it could never be a problem. For folks with limited RAM and/or lots of small documents it could indeed be an issue.

It seems like this could be implemented as a much smaller array that only keeps 
track of the deleted document numbers and it would still be very efficient to 
calculate the new document number by using this much smaller array. Has this 
been done by anyone else or been considered for change in the Lucene code?
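A minimal sketch of the scheme proposed above, assuming the deleted document numbers are kept in a small sorted array: an old document number is remapped by subtracting the count of deletions that precede it, which a binary search gives in O(log d) time. The class and method names here are hypothetical, not Lucene APIs.

```java
import java.util.Arrays;

// Hypothetical sketch: track only the deleted doc numbers (sorted),
// and compute a document's new number on demand instead of storing a
// full old-to-new mapping array for every document in the index.
public class DeletedDocsRemap {
    private final int[] deletedDocs; // sorted ascending, assumed small

    public DeletedDocsRemap(int[] sortedDeletedDocs) {
        this.deletedDocs = sortedDeletedDocs;
    }

    /** Returns the new doc number, or -1 if the doc itself was deleted. */
    public int remap(int oldDocNum) {
        int idx = Arrays.binarySearch(deletedDocs, oldDocNum);
        if (idx >= 0) {
            return -1; // this document was deleted
        }
        // For a missing key, binarySearch returns -(insertionPoint) - 1;
        // the insertion point equals the number of deleted docs < oldDocNum.
        int deletedBefore = -idx - 1;
        return oldDocNum - deletedBefore;
    }

    public static void main(String[] args) {
        DeletedDocsRemap remap = new DeletedDocsRemap(new int[] {3, 7});
        System.out.println(remap.remap(5));  // one deletion (3) before it -> 4
        System.out.println(remap.remap(7));  // deleted -> -1
        System.out.println(remap.remap(10)); // two deletions before it -> 8
    }
}
```

Memory then scales with the number of deletions rather than the number of documents, at the cost of a log-time lookup per remapped document number.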

Please submit a patch to the java-dev list.

Doug

