Re: Actual min and max-value of NumericField during codec flush

Michael McCandless Fri, 14 Feb 2014 12:25:29 -0800

On Fri, Feb 14, 2014 at 12:14 AM, Ravikumar Govindarajan
<ravikumar.govindara...@gmail.com> wrote:


> Early query termination quits by throwing an Exception, right? Is it OK to
> search each SegmentReader individually and then break off, instead of
> using a MultiReader, especially when the order is known before the search
> begins?

Well, this will change your scores.  Searching through a MultiReader
sums up all term statistics across all SegmentReaders "up front", and
then scoring per segment uses those top-level weights.  Searching each
SegmentReader on its own uses only that segment's statistics.
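To make the scoring difference concrete, here is a minimal plain-Java
sketch (no Lucene classes).  It assumes a classic TF-IDF style idf
formula purely for illustration; the class name, method names and the
sample numbers are all made up.  It compares the idf computed from
summed, top-level statistics with the idf each segment would compute
from its own statistics.

```java
public class IdfSketch {
    // Classic TF-IDF style inverse document frequency.
    // Assumption: used here only to illustrate the effect, not as the
    // exact formula of any particular Similarity.
    public static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    // Sums the per-segment statistics first, then computes one
    // top-level idf, which is what searching across all segments does.
    public static double globalIdf(long[] docFreqs, long[] numDocs) {
        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            totalDocFreq += docFreqs[i];
            totalDocs += numDocs[i];
        }
        return idf(totalDocFreq, totalDocs);
    }

    public static void main(String[] args) {
        // Hypothetical term: rare in segment 0, common in segment 1.
        long[] docFreqs = {5, 500};
        long[] numDocs = {1000, 1000};

        System.out.println("top-level idf: " + globalIdf(docFreqs, numDocs));
        for (int i = 0; i < docFreqs.length; i++) {
            System.out.println("segment " + i + " idf: "
                + idf(docFreqs[i], numDocs[i]));
        }
    }
}
```

With per-segment searching, the same term gets a different weight in
each segment, so scores from different segments are no longer directly
comparable.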

> The reason I insisted on timestamp-based merging is that there is
> a possibility of an out-of-order segment added via an addIndex(...) call.
> That segment can have any older timestamp [a month ago, a year ago, etc.],
> although this is extremely rare. Should I worry about it during merges, or
> just handle overlaps during search?

Which addIndexes method are you using?  The one taking Directory[]
does file-level copies, assigning sequential segment names (but this
is not guaranteed), and the one taking IndexReader[] merges all the
incoming indices into a single segment.

You may need to just implement a custom MergePolicy that sorts all
segments in the index by timestamp and picks the merge order accordingly...
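
As a rough sketch of the ordering logic only (plain Java, not wired
into Lucene's MergePolicy API): sort the segments by timestamp and
group neighbours into merges.  SegmentStamp, pickMerges and
maxPerMerge are hypothetical names, and how each segment's timestamp
range gets recorded is left as an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TimestampMergeOrder {
    // Hypothetical per-segment record; in a real index the timestamp
    // range would have to be stored somewhere when the segment is written.
    public static final class SegmentStamp {
        public final String name;
        public final long minTimestamp;
        public final long maxTimestamp;

        public SegmentStamp(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Sorts segments by their oldest timestamp and groups neighbours
    // into merges of at most maxPerMerge segments. A trailing group
    // with a single segment is skipped, since there is nothing to merge.
    public static List<List<SegmentStamp>> pickMerges(
            List<SegmentStamp> segments, int maxPerMerge) {
        if (maxPerMerge < 2) {
            throw new IllegalArgumentException("maxPerMerge must be >= 2");
        }
        List<SegmentStamp> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((SegmentStamp s) -> s.minTimestamp));

        List<List<SegmentStamp>> merges = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i += maxPerMerge) {
            int end = Math.min(i + maxPerMerge, sorted.size());
            if (end - i >= 2) {
                merges.add(new ArrayList<>(sorted.subList(i, end)));
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        List<SegmentStamp> segments = new ArrayList<>();
        segments.add(new SegmentStamp("_0", 300, 400));
        segments.add(new SegmentStamp("_1", 500, 600));
        // Older data that arrived late, e.g. via addIndexes.
        segments.add(new SegmentStamp("_2", 100, 200));

        for (List<SegmentStamp> merge : pickMerges(segments, 2)) {
            StringBuilder sb = new StringBuilder("merge:");
            for (SegmentStamp s : merge) {
                sb.append(' ').append(s.name);
            }
            System.out.println(sb);
        }
    }
}
```

Because the sort ignores segment names, a late-arriving old segment is
grouped with its neighbours in time rather than with the segments
created just before it.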

Mike McCandless

http://blog.mikemccandless.com
