You are still talking about "multiple writers". Like I said, going down this path (playing tricks with filenames) isn't going to work out well.

On Fri, Dec 16, 2022 at 2:48 AM Patrick Zhai <zhai7...@gmail.com> wrote:
>
> Hi Robert,
>
> Maybe I didn't explain it clearly, but we're not going to constantly switch
> between writers or share effort between writers. It's purely for
> availability: the second writer only kicks in when the first writer is not
> available for some reason.
> And as far as I know, the replicator/nrt module does not provide a solution
> for when the primary node (main indexer) is down. How would we recover with
> a backup indexer?
>
> Thanks
> Patrick
>
>
> On Thu, Dec 15, 2022 at 7:16 PM Robert Muir <rcm...@gmail.com> wrote:
>
> > This multiple-writer approach isn't going to work and customizing names
> > won't allow it anyway. Each file also contains a unique identifier tied to
> > its commit so that we know everything is intact.
> >
> > I would look at the segment replication in lucene/replicator and not
> > try to play games with files and mixing multiple writers.
> >
> > On Thu, Dec 15, 2022 at 5:45 PM Patrick Zhai <zhai7...@gmail.com> wrote:
> > >
> > > Hi Folks,
> > >
> > > We're trying to build a search architecture using segment replication
> > > (indexer and searcher are separated, and the indexer ships new segments
> > > to the searchers). One of the problems we're facing is this: for
> > > availability reasons we need to have multiple indexers running, and when
> > > the searcher switches from consuming one indexer to another, there is a
> > > chance that segment names collide with each other (because segment names
> > > are count based) and the searcher has to reload the whole index.
> > > To avoid that, we're looking for a way to name the segments so that
> > > Lucene is able to tell the difference and load only the difference (by
> > > calling `openIfChanged`). I've checked the IndexWriter and the
> > > DocumentsWriter, and it seems naming is controlled by a private final
> > > method `newSegmentName()`, so it is likely not possible there. So I
> > > wonder whether there's any other way people are aware of that can help
> > > control the segment names?
> > >
> > > An example of the situation described above:
> > > The searcher was previously consuming from indexer 1 and has the
> > > following segments: _1, _2, _3, _4
> > > Indexer 2 previously sync'd from indexer 1, sharing the first 3
> > > segments, and produced its own 4th segment (denoted as _4', but it
> > > shares the same "_4" name): _1, _2, _3, _4'
> > > Suddenly indexer 1 dies and the searcher switches from indexer 1 to
> > > indexer 2. When it finishes downloading the segments and tries to
> > > refresh the reader, it will likely hit the exception here, and it seems
> > > all we can do right now is reload the whole index, which could
> > > potentially be a high cost.
> > >
> > > Sorry for the long email and thank you in advance for any replies!
> > >
> > > Best
> > > Patrick
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
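
The collision described in this thread (two indexers that share _1 to _3 and each independently produce a segment named _4) can be illustrated with a minimal, Lucene-free sketch. Everything here is hypothetical: `Indexer`, `Segment`, `flush()` and `collides()` are stand-ins invented for illustration, not Lucene classes or APIs. The base-36 counter formatting and the 16-byte random id are assumptions of the sketch, chosen to mirror the points made above that segment names are count based and that each file carries a unique identifier.

```java
import java.security.SecureRandom;
import java.util.Arrays;

public class SegmentNameCollision {

    /** Hypothetical stand-in for a segment: a counter-based name plus a random id. */
    static final class Segment {
        final String name;
        final byte[] id;

        Segment(String name, byte[] id) {
            this.name = name;
            this.id = id;
        }
    }

    /** Hypothetical stand-in for an indexer that names segments from a running counter. */
    static final class Indexer {
        private final SecureRandom random = new SecureRandom();
        private long counter;

        Indexer(long startCounter) {
            this.counter = startCounter;
        }

        /** Produces the next segment: name from the counter, id from random bytes. */
        Segment flush() {
            byte[] id = new byte[16]; // assumed id size for this sketch
            random.nextBytes(id);
            // Assumed naming scheme: "_" followed by the counter in base 36.
            String name = "_" + Long.toString(counter++, Character.MAX_RADIX);
            return new Segment(name, id);
        }
    }

    /** True when two segments share a name but carry different ids. */
    static boolean collides(Segment a, Segment b) {
        return a.name.equals(b.name) && !Arrays.equals(a.id, b.id);
    }

    public static void main(String[] args) {
        // Both indexers already share _1.._3, so each starts its counter at 4.
        Indexer indexer1 = new Indexer(4);
        Indexer indexer2 = new Indexer(4);

        Segment fromIndexer1 = indexer1.flush();
        Segment fromIndexer2 = indexer2.flush();

        System.out.println(fromIndexer1.name + " vs " + fromIndexer2.name
                + ", collide: " + collides(fromIndexer1, fromIndexer2));
    }
}
```

In this sketch both indexers emit the same name while the ids differ, which is the situation a searcher would have to detect when switching sources. It also suggests why renaming alone would not help: the identity check is on the id, not the filename.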