[ANNOUNCE] Apache Lucene 9.8.0 released

2023-09-28 Thread Patrick Zhai
The Lucene PMC is pleased to announce the release of Apache Lucene 9.8.0. Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting,

Re: Question about index segment search order

2023-05-04 Thread Patrick Zhai
nks, > Wei > > On Thu, May 4, 2023 at 3:33 AM Michael Sokolov wrote: > > > There is no meaning to the sequence. The segments are created > concurrently > > by many threads and the merge process will merge them without regards to > > any ordering. > > > >

Re: Question about index segment search order

2023-05-03 Thread Patrick Zhai
s visited first? > > Wei > > On Tue, May 2, 2023 at 7:22 PM Patrick Zhai wrote: > > > Hi Wei, > > Lucene in general iterate through the index in the order of what is > > recorded in the SegmentInfos > > < > > > https://github.com/apache/l

Re: Question about index segment search order

2023-05-02 Thread Patrick Zhai
Hi Wei, Lucene in general iterates through the index in the order recorded in the SegmentInfos. At search time, you can specify the order using LeafSorter
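
To illustrate the idea in the reply above, here is a minimal plain-Java sketch of what a leaf sorter amounts to: a comparator that decides the order in which segments are visited, independent of the order in which they were recorded. It does not call Lucene. The Segment class, its maxTimestamp field, and the visitOrder helper are hypothetical names invented for this illustration, and sorting by newest timestamp is only an assumed example policy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LeafOrderSketch {
    /** Hypothetical stand-in for a segment: a name plus the newest timestamp it holds. */
    static final class Segment {
        final String name;
        final long maxTimestamp;

        Segment(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Returns the segment names in the order the given comparator would visit them. */
    static List<String> visitOrder(List<Segment> recorded, Comparator<Segment> leafSorter) {
        List<Segment> sorted = new ArrayList<>(recorded); // leave the recorded order untouched
        sorted.sort(leafSorter);
        List<String> names = new ArrayList<>();
        for (Segment s : sorted) {
            names.add(s.name);
        }
        return names;
    }

    /** Example policy (an assumption): visit segments with the newest data first. */
    static List<String> demo() {
        List<Segment> recorded = new ArrayList<>();
        recorded.add(new Segment("_0", 100L));
        recorded.add(new Segment("_1", 300L));
        recorded.add(new Segment("_2", 200L));
        Comparator<Segment> newestFirst =
            Comparator.comparingLong((Segment s) -> s.maxTimestamp).reversed();
        return visitOrder(recorded, newestFirst);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [_1, _2, _0]
    }
}
```

In Lucene itself the comparator would be over LeafReader instances rather than a custom class; check the javadoc of the version in use for where a leaf sorter can be supplied.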

Re: Question about searcherManager applyAllDeletes parameter and maybeRefresh method

2023-03-03 Thread Patrick Zhai
> result right? > I even ran it more than 10k times, but I never hit a case where the search > result contains the deleted doc (meaning the delete has not been applied). > > > Sincerely, > Ningshan > > On Thu, Mar 2, 2023 at 3:30 PM Patrick Zhai wrote: > >> Hi Nin

Re: Question about searcherManager applyAllDeletes parameter and maybeRefresh method

2023-03-02 Thread Patrick Zhai
Hi Ningshan, If you want to make sure the deletes are applied after you call maybeRefresh(), then you need to set applyAllDeletes to true. A few more details: The constructor of SearcherManager actually internally passes the applyAllDeletes to the IndexWriter, which then will pass it to the

Re: Is there a way to customize segment names?

2022-12-30 Thread Patrick Zhai
> No, you can't control them. And we must not open up anything to try to > support this. > > On Fri, Dec 16, 2022 at 7:28 PM Patrick Zhai wrote: > > > > Hi Mike, Robert > > > > Thanks for replying, the system is almost like what Mike has described: > one writer i

Re: Is there a way to customize segment names?

2022-12-16 Thread Patrick Zhai
down this > > path (playing tricks with filenames) isn't going to work out well. > > > > On Fri, Dec 16, 2022 at 2:48 AM Patrick Zhai wrote: > > > > > > Hi Robert, > > > > > > Maybe I didn't explain it clearly but we're not going to consta

Re: Is there a way to customize segment names?

2022-12-15 Thread Patrick Zhai
e also contains a unique identifier tied to > its commit so that we know everything is intact. > > I would look at the segment replication in lucene/replicator and not > try to play games with files and mixing multiple writers. > > On Thu, Dec 15, 2022 at 5:45 PM Patrick Zhai

Is there a way to customize segment names?

2022-12-15 Thread Patrick Zhai
Hi Folks, We're trying to build a search architecture using segment replication (the indexer and searchers are separated, and the indexer ships new segments to the searchers) right now, and one of the problems we're facing is: for availability reasons we need to have multiple indexers running, and when the

Re: ContainingIntervalsSource alternative

2021-06-02 Thread Patrick Zhai
Hi Elbek, Maybe go with ContainedByIntervalsSource? ContainingIntervalsSource is actually the big source filtered by the small source, and ContainedByIntervalsSource is the opposite, so it should give the expected behavior? Best, Patrick elbek kamoliddinov wrote on Wed, Jun 2, 2021 at 2:55 PM: > Hello everyone, > >
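
The distinction drawn in the reply above can be sketched in plain Java without Lucene: "containing" keeps the big intervals that enclose some small interval, while "contained by" keeps the small intervals that sit inside some big interval. This is only a model of the semantics as described in the message, not Lucene's implementation. The class name, the int[]{start, end} representation, and the helper methods are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class IntervalFilterSketch {
    /** True if outer = {start, end} fully encloses inner = {start, end}. */
    static boolean encloses(int[] outer, int[] inner) {
        return outer[0] <= inner[0] && inner[1] <= outer[1];
    }

    /** Big source filtered by small source: keep big intervals that enclose a small one. */
    static List<int[]> containing(List<int[]> big, List<int[]> small) {
        List<int[]> out = new ArrayList<>();
        for (int[] b : big) {
            for (int[] s : small) {
                if (encloses(b, s)) {
                    out.add(b);
                    break; // one match is enough to keep this big interval
                }
            }
        }
        return out;
    }

    /** The opposite: keep small intervals that are enclosed by some big one. */
    static List<int[]> containedBy(List<int[]> small, List<int[]> big) {
        List<int[]> out = new ArrayList<>();
        for (int[] s : small) {
            for (int[] b : big) {
                if (encloses(b, s)) {
                    out.add(s);
                    break; // one enclosing big interval is enough
                }
            }
        }
        return out;
    }

    /** Renders intervals as "[start,end]" separated by single spaces. */
    static String format(List<int[]> intervals) {
        StringBuilder sb = new StringBuilder();
        for (int[] i : intervals) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('[').append(i[0]).append(',').append(i[1]).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<int[]> big = List.of(new int[] {0, 10}, new int[] {20, 25});
        List<int[]> small = List.of(new int[] {3, 4}, new int[] {30, 31});
        System.out.println(format(containing(big, small)));  // prints [0,10]
        System.out.println(format(containedBy(small, big))); // prints [3,4]
    }
}
```

The two calls match on the same pair of intervals but return different sides of it, which is why swapping one source for the other changes what gets highlighted or scored.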

Re: Multiple merge-runs from same set of segments

2021-05-27 Thread Patrick Zhai
e do use the default MMap-dir but I was actually thinking about > unpacking/walking Term-Dict data (FST) repeatedly from various > threads, even if via MMap. Are there optimizations here (caching unpacked > blocks etc..) that we could tap into? > > -- > Ravi > > On Mon, May 24

Re: Multiple merge-runs from same set of segments

2021-05-24 Thread Patrick Zhai
Hi Ravi, 1. May I know which Lucene version you're using? As far as I know, SortingMergePolicy has been deprecated and replaced by IndexWriterConfig.setIndexSort in newer Lucene versions. So if "setIndexSort" is available, I would suggest using that to achieve the sorted index (as you might