Given history, perhaps the default merge policy should conserve this,
but with pluggable merge policies, I don't see that all merge policies
need to.

-----Original Message-----
From: Michael McCandless [mailto:[EMAIL PROTECTED] 
Sent: Friday, March 23, 2007 1:53 AM
To: java-dev@lucene.apache.org
Subject: Re: [jira] Updated: (LUCENE-843) improve how IndexWriter uses
RAM to buffer added documents


"Chris Hostetter" <[EMAIL PROTECTED]> wrote:
> : > Actually is #2 a hard requirement?
> :
> : A lot of Lucene users depend on having document number correspond to
> : age, I think.  ISTR Hatcher at least recommending techniques that
> : require it.
> 
> "Corrispond to age" may be missleading as it implies that the actual
> docid has meaning ... it's more that the relative order of addition is
> preserved regardless of deletions/merging
> 
> A trivial example of using this is getting the N newest documents
> matching
> a search using a HitCollector, it's just a bounded queue that only
> remembers the last N things you put in it.
> 
> An more complicated example is duplicate unique field detection:
> iterating
> over a TermDoc and knowing that the doc with the higheest docId is the
> last one added, so the earlier ones can be ignored/deleted.  (as i
> recall,
> Solr takes advantage of this.)

Got it, so we need to preserve this invariant.  So this puts the
general restriction on the Lucene merge policy that only adjacent
segments (ie, when ordered by segment number) can be merged.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to