Hi Tommaso,

It's definitely something I've pondered on occasion, but I'm left wondering (a) whether it's worth it (experimentation will tell), and (b) whether Lucene needs anything new here at all: see MultiReader. Arguably this can be handled at the search server layer by constructing multiple IndexWriters and then a MultiReader over their collective indexes. Perhaps a special IndexSearcher QueryCache could be developed that partitions itself across the separate underlying readers. Of course, it would probably take a lot of work to retrofit, say, Solr to do this, but I'm dubious that Lucene should be saddled with unneeded complexity for it.
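
To make the shape of that idea concrete, here is a minimal plain-Java sketch. It deliberately does not use Lucene's API: the class PartitionedIndex and its methods are hypothetical stand-ins. Each per-cluster list plays the role of an index owned by its own IndexWriter, and document(int) models the way a composite reader such as MultiReader presents its sub-readers as one concatenated doc ID space by offsetting each one with a doc base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model, not Lucene's API: one partition per cluster, one composite view over all. */
final class PartitionedIndex {
    // One list per cluster key; each list stands in for an index with its own writer.
    private final Map<String, List<String>> partitions = new LinkedHashMap<>();

    /** Routes a document to the partition for its cluster, creating the partition on first use. */
    void add(String cluster, String doc) {
        partitions.computeIfAbsent(cluster, k -> new ArrayList<>()).add(doc);
    }

    /** Returns the documents of a single cluster without touching the other partitions. */
    List<String> cluster(String cluster) {
        return partitions.getOrDefault(cluster, Collections.emptyList());
    }

    /** Total number of documents across all partitions. */
    int maxDoc() {
        int total = 0;
        for (List<String> partition : partitions.values()) {
            total += partition.size();
        }
        return total;
    }

    /** Resolves a global doc ID by subtracting the doc base of the partition that holds it. */
    String document(int globalDocId) {
        int docBase = 0;
        for (List<String> partition : partitions.values()) {
            if (globalDocId < docBase + partition.size()) {
                return partition.get(globalDocId - docBase);
            }
            docBase += partition.size();
        }
        throw new IndexOutOfBoundsException("doc ID " + globalDocId);
    }

    public static void main(String[] args) {
        PartitionedIndex index = new PartitionedIndex();
        index.add("red", "r1");
        index.add("blue", "b1");
        index.add("red", "r2");
        // Partitions are concatenated in creation order: red = [r1, r2], then blue = [b1].
        for (int docId = 0; docId < index.maxDoc(); docId++) {
            System.out.println(docId + " -> " + index.document(docId));
        }
    }
}
```

In a real setup the routing in add would pick an IndexWriter rather than a list, and the composite view would be a MultiReader over readers opened on each index; whether that buys enough locality to justify the extra moving parts is exactly the open question in (a).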

On Thu, Oct 12, 2017 at 9:55 AM Tommaso Teofili <[email protected]> wrote:

> Hi all,
>
> having been involved in such kind of challenge and having seen a few more
> similar enquiries on the dev list, I was wondering if it may be time to
> think about making it possible to have an explicit (customizable and
> therefore pluggable) policy which allows people to chime into where
> documents and / or segments get written (on write or on merge).
> Recently there was someone asking about possibly having segments sorted by
> a field using SortingMergePolicy, but as Uwe noted it's currently an
> implementation detail. Personally I have tried (and failed because it was
> too costly) to make sure docs belonging to certain clusters (identified by
> a field) being written within same segments (for data locality / memory
> footprint concerns when "loading" docs from a certain cluster).
>
> As of today that'd be *really* hard, but I just wanted to share my feeling
> that such topic might be something to keep an eye on.
>
> My 2 cents,
> Tommaso
>

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
