Re: Is there a way to customize segment names?

Robert Muir Fri, 16 Dec 2022 02:41:29 -0800

You are still talking about "multiple writers". Like I said, going down this
path (playing tricks with filenames) isn't going to work out well.
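
To illustrate why matching file names are not enough: as noted in the quoted reply below, each file also carries a unique identifier. The following is a minimal sketch in plain Java, not Lucene code. The class `SegmentIdCheck`, the method `conflictingNames`, and the one-byte ids are hypothetical and only model the idea that two segments with the same name but different ids must be treated as different segments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: models "segment name -> segment id".
class SegmentIdCheck {

    // Returns the names present in both views whose ids differ.
    static List<String> conflictingNames(Map<String, byte[]> local,
                                         Map<String, byte[]> remote) {
        List<String> conflicts = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for the result.
        for (Map.Entry<String, byte[]> e : new TreeMap<>(remote).entrySet()) {
            byte[] localId = local.get(e.getKey());
            if (localId != null && !Arrays.equals(localId, e.getValue())) {
                conflicts.add(e.getKey());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // The searcher's view, taken from indexer 1 (ids are made up).
        Map<String, byte[]> searcher = Map.of(
            "_3", new byte[] {3}, "_4", new byte[] {4});
        // Indexer 2 shares _3 but wrote its own, different _4.
        Map<String, byte[]> indexer2 = Map.of(
            "_3", new byte[] {3}, "_4", new byte[] {42});
        System.out.println(conflictingNames(searcher, indexer2)); // prints [_4]
    }
}
```

In this model, renaming a file would not help: the id stored with the segment still differs, so the mismatch is detected either way.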


On Fri, Dec 16, 2022 at 2:48 AM Patrick Zhai <[email protected]> wrote:
>
> Hi Robert,
>
> Maybe I didn't explain it clearly, but we're not going to constantly switch
> between writers or share effort between writers. It's purely for
> availability: the second writer only kicks in when the first writer is not
> available for some reason.
> Also, as far as I know, the replicator/nrt module does not provide a solution
> for when the primary node (main indexer) is down: how would we recover with
> a backup indexer?
>
> Thanks
> Patrick
>
>
> On Thu, Dec 15, 2022 at 7:16 PM Robert Muir <[email protected]> wrote:
>
> > This multiple-writer approach isn't going to work, and customizing names
> > won't allow it anyway. Each file also contains a unique identifier tied to
> > its commit so that we know everything is intact.
> >
> > I would look at the segment replication in lucene/replicator and not
> > try to play games with files and mixing multiple writers.
> >
> > On Thu, Dec 15, 2022 at 5:45 PM Patrick Zhai <[email protected]> wrote:
> > >
> > > Hi Folks,
> > >
> > > We're trying to build a search architecture using segment replication
> > > (the indexer and searcher are separated, and the indexer ships new
> > > segments to the searchers). One of the problems we're facing is that, for
> > > availability reasons, we need to have multiple indexers running, and when
> > > a searcher switches from consuming one indexer to another, there is a
> > > chance that segment names collide (because segment names are count
> > > based), and the searcher has to reload the whole index.
> > > To avoid that, we're looking for a way to name the segments so that
> > > Lucene is able to tell the difference and load only the difference (by
> > > calling `openIfChanged`). I've checked the IndexWriter and the
> > > DocumentsWriter, and it seems naming is controlled by a private final
> > > method, `newSegmentName()`, so it is likely not possible there. So I
> > > wonder whether there are any other ways people are aware of that can help
> > > control the segment names?
> > >
> > > An example of the situation described above:
> > > The searcher was previously consuming from indexer 1 and has the
> > > following segments: _1, _2, _3, _4
> > > Indexer 2 previously synced from indexer 1, sharing the first 3
> > > segments, and produced its own 4th segment (denoted _4', but it shares
> > > the same "_4" name): _1, _2, _3, _4'
> > > Suddenly indexer 1 dies and the searcher switches from indexer 1 to
> > > indexer 2. When it finishes downloading the segments and tries to refresh
> > > the reader, it will likely hit the exception here, and it seems all we can
> > > do right now is reload the whole index, which could be costly.
> > >
> > > Sorry for the long email and thank you in advance for any replies!
> > >
> > > Best
> > > Patrick
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
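
The name collision in the example above (two indexers both producing "_4") follows directly from count-based naming. Below is a minimal sketch in plain Java, assuming names are an underscore followed by a counter rendered in base 36. `CountingNamer` is a hypothetical stand-in for illustration, not Lucene's `IndexWriter`.

```java
// Hypothetical model of a count-based segment namer; not Lucene's IndexWriter.
class CountingNamer {
    private long counter;

    // 'start' is the next counter value, e.g. 4 after segments _1.._3 exist.
    CountingNamer(long start) {
        this.counter = start;
    }

    // Assumption: names are "_" plus the counter in base 36 (Character.MAX_RADIX).
    String next() {
        return "_" + Long.toString(counter++, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        // Both indexers hold _1, _2, _3, so both continue from counter 4.
        CountingNamer indexer1 = new CountingNamer(4);
        CountingNamer indexer2 = new CountingNamer(4);
        String a = indexer1.next();
        String b = indexer2.next();
        // Same name, but the two segments were written independently.
        System.out.println(a + " " + b + " collide=" + a.equals(b)); // prints _4 _4 collide=true
    }
}
```

Because each namer only knows its own counter, two indexers that start from the same synced state will always hand out the same next name, whatever their new segments contain.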

