Hi Michael,

Yes, the collector counts hits across all segments. Thanks for the
suggestion; I'm also asking the question on solr-dev.

Wei

On Thu, May 11, 2023 at 11:57 AM Michael Sokolov <msoko...@gmail.com> wrote:

> Maybe raise this issue on solr-dev then? I'm not familiar with how that
> collector works. Does it count hits across all segments, or only within a
> single segment?
>
> On Tue, May 9, 2023 at 1:36 PM Wei <weiwan...@gmail.com> wrote:
> >
> > Hi Michael,
> >
> > I am applying early termination with Solr's EarlyTerminatingCollector
> > (https://github.com/apache/solr/blob/d9ddba3ac51ece953d762c796f62730e27629966/solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java),
> > which triggers EarlyTerminatingCollectorException in SolrIndexSearcher
> > (https://github.com/apache/solr/blob/d9ddba3ac51ece953d762c796f62730e27629966/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L281).
> >
> > Thanks,
> > Wei
> >
> >
> > On Thu, May 4, 2023 at 11:47 AM Michael Sokolov <msoko...@gmail.com> wrote:
> >
> > > Yes, sorry I didn't mean to imply you couldn't control this if you
> > > want to. I guess in the typical setup it is not predictable. How are
> > > you applying early termination? Are you using a standard Lucene
> > > Collector or do you have your own?
> > >
> > > On Thu, May 4, 2023 at 2:03 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > > >
> > > > Hi Mike,
> > > > Just want to mention that if the user chooses to index with a single
> > > > thread and use LogXXMergePolicy, then the document order will be
> > > > preserved as index order.
> > > >
> > > >
> > > >
> > > > On Thu, May 4, 2023 at 10:04 AM Wei <weiwan...@gmail.com> wrote:
> > > >
> > > > > Hi Michael,
> > > > >
> > > > > We are interested in the segment sequence for early termination. In our
> > > > > case there is always a large dominant segment after an index rebuild, then
> > > > > many small segments are generated by continuous updates as time goes by.
> > > > > When early termination is applied, the limit can be reached just by
> > > > > traversing the dominant segment alone, so the newer, smaller segments
> > > > > don't get a chance. If we can control the segment sequence so that the
> > > > > newer segments are visited first, the documents with recent updates can be
> > > > > retrieved with early termination. Do you think this makes sense? Any
> > > > > suggestion is appreciated.
> > > > >
> > > > > Thanks,
> > > > > Wei
> > > > >
> > > > > On Thu, May 4, 2023 at 3:33 AM Michael Sokolov <msoko...@gmail.com> wrote:
> > > > >
> > > > > > There is no meaning to the sequence. The segments are created concurrently
> > > > > > by many threads and the merge process will merge them without regard to
> > > > > > any ordering.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, May 3, 2023, 1:09 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > > > > >
> > > > > > > For that part I'm not entirely sure; if other folks know, please
> > > > > > > chime in :)
> > > > > > >
> > > > > > > On Wed, May 3, 2023 at 8:48 AM Wei <weiwan...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Thanks Patrick! In the default case when no LeafSorter is provided,
> > > > > > > > are the segments traversed in the order of creation time, i.e. the
> > > > > > > > oldest segment is always visited first?
> > > > > > > >
> > > > > > > > Wei
> > > > > > > >
> > > > > > > > On Tue, May 2, 2023 at 7:22 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Wei,
> > > > > > > > > Lucene in general iterates through the index in the order
> > > > > > > > > recorded in the SegmentInfos
> > > > > > > > > <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L140>.
> > > > > > > > > At search time, you can specify the order using LeafSorter
> > > > > > > > > <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java#L75>
> > > > > > > > > when you're opening the IndexReader.
> > > > > > > > >
> > > > > > > > > Patrick
> > > > > > > > >
> > > > > > > > > On Tue, May 2, 2023 at 5:28 PM Wei <weiwan...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > We have an index that has multiple segments generated by
> > > > > > > > > > continuous updates. Does Lucene have a specific order when
> > > > > > > > > > iterating through the segments (assuming a single query thread)?
> > > > > > > > > > Can the order be customized so that the latest generated
> > > > > > > > > > segments are searched first?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Wei
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >
> > >
>
>
>
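
To make the scenario in this thread concrete, here is a minimal, Lucene-free
sketch of why visit order matters under early termination. Everything in it is
hypothetical and only for illustration: the Segment class, its createdSeq
field, the visit() helper, and the numbers are not Lucene or Solr APIs. It
models one dominant segment plus two small, newer ones, and a hit limit that is
counted across all segments (as Wei confirms for the Solr collector). In Lucene
itself, a comparator over LeafReader playing the role of newestFirst is what
the leafSorter argument of DirectoryReader.open accepts, per Patrick's link;
how to derive a creation order from a LeafReader is not covered in this thread
and is left open here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SegmentOrderSketch {

    /** Hypothetical stand-in for an index segment (not a Lucene class). */
    public static final class Segment {
        final String name;
        final long createdSeq; // assumed: larger value means created later
        final int hits;        // number of documents matching the query

        public Segment(String name, long createdSeq, int hits) {
            this.name = name;
            this.createdSeq = createdSeq;
            this.hits = hits;
        }
    }

    /**
     * Visits segments in the given order (or list order if order is null)
     * and stops once maxHits hits have been collected across all segments.
     * Returns the names of the segments that were actually visited.
     */
    public static List<String> visit(List<Segment> segments,
                                     Comparator<Segment> order,
                                     int maxHits) {
        List<Segment> sorted = new ArrayList<>(segments);
        if (order != null) {
            sorted.sort(order);
        }
        List<String> visited = new ArrayList<>();
        int collected = 0;
        for (Segment s : sorted) {
            if (collected >= maxHits) {
                break; // early termination: remaining segments are skipped
            }
            visited.add(s.name);
            collected += Math.min(s.hits, maxHits - collected);
        }
        return visited;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
                new Segment("dominant", 0, 1_000_000),
                new Segment("small-1", 1, 50),
                new Segment("small-2", 2, 30));
        int maxHits = 1000;

        // List order: the dominant segment alone exhausts the limit.
        System.out.println(visit(segments, null, maxHits));

        // Newest first: the small segments are visited before the limit is hit.
        Comparator<Segment> newestFirst =
                (a, b) -> Long.compare(b.createdSeq, a.createdSeq);
        System.out.println(visit(segments, newestFirst, maxHits));
    }
}
```

With the list order, only the dominant segment is visited before the limit is
reached; with the newest-first comparator, both small segments are visited
first and the dominant segment supplies the remainder.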
