Re: Deterministic index construction

2020-12-19 Thread Haoyu Zhai
Hi Adrien, I think Mike's comment is correct: we already have the index sorted, but we want to reconstruct an index with exactly the same number of segments, with each segment containing exactly the same documents. Mike, addIndexes can take a CodecReader as input [1], which allows us to pass in a customized

Re: 8.8 Release

2020-12-19 Thread Bruno Roustant
+1 Thanks for volunteering. On Fri, Dec 18, 2020 at 01:41, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote: > Sure, Houston. I'll wait another week. Have a good new year and merry > Christmas! > > On Fri, 18 Dec, 2020, 5:58 am Timothy Potter, > wrote: > >> Great point Houston! +1 on

Re: Deterministic index construction

2020-12-19 Thread Michael Sokolov
I don't know about addIndexes. Does that let you say which document goes where somehow? Wouldn't you have to select a subset of documents from each originally indexed segment? On Sat, Dec 19, 2020, 12:11 PM Michael Sokolov wrote: > I think the idea is to exert control over the distribution of

Re: Deterministic index construction

2020-12-19 Thread Michael Sokolov
I think the idea is to exert control over the distribution of documents among the segments, in a deterministic, reproducible way. On Sat, Dec 19, 2020, 11:39 AM Adrien Grand wrote: > Have you considered leveraging Lucene's built-in index sorting? It > supports concurrent indexing and is quite

Re: Deterministic index construction

2020-12-19 Thread Adrien Grand
Have you considered leveraging Lucene's built-in index sorting? It supports concurrent indexing and is quite fast. On Fri, Dec 18, 2020 at 7:26 PM Haoyu Zhai wrote: > Hi > Our team is seeking a way to construct (or rebuild) a deterministic sorted > index concurrently (I know Lucene could