You really only have four options:

1> use exactstats. This won't guarantee precise matches, but they'll be
   closer.
2> optimize (not particularly recommended, but if you're willing to do it
   periodically, the stats will match until the next updates).
3> use TLOG/PULL replicas and confine the requests to the PULL replicas.
   There'll _still_ be some window for mismatches; specifically, the
   default is commit_interval/2.
4> define the problem away.
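
For what it's worth, options 2> and 3> both come down to request parameters. Below is a minimal sketch of how the URLs could be built, not a definitive recipe. The base URL, the collection name and the helper function names are all hypothetical. It also assumes Solr 7.4 or later, where the shards.preference parameter is available. Note that shards.preference expresses a preference rather than a hard restriction, so treat it as a starting point.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration; substitute your own host and collection.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "signals_aggr"


def pull_preferred_query_url(q, base=SOLR_BASE, collection=COLLECTION):
    """Build a /select URL that asks Solr to prefer PULL replicas (option 3)."""
    params = {
        "q": q,
        # Assumes Solr 7.4+; this is a preference, not a strict filter.
        "shards.preference": "replica.type:PULL",
    }
    return f"{base}/{collection}/select?{urlencode(params)}"


def optimize_url(max_segments=1, base=SOLR_BASE, collection=COLLECTION):
    """Build an /update URL that triggers an optimize (option 2)."""
    params = {"optimize": "true", "maxSegments": max_segments}
    return f"{base}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URLs rather than calling Solr, so the sketch runs anywhere.
    print(pull_preferred_query_url("title:shoes"))
    print(optimize_url())
```

If indexing happens once a day, the optimize URL could be hit from the same job that does the indexing, right after its final commit.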

Best,
Erick

On Tue, Feb 12, 2019 at 2:42 AM Aman Tandon <amantandon...@gmail.com> wrote:
>
> Hi Erick,
>
> Any suggestions on this?
>
> Regards,
> Aman
>
> On Fri, Feb 8, 2019, 17:07 Aman Tandon <amantandon...@gmail.com wrote:
>
> > Hi Erick,
> >
> > I find this thread very relevant to the people who are facing the same
> > problem.
> >
> > In our case, we have a signals aggregation collection which is having
> > total of around 8 million records. We have Solr cloud architecture (3
> > shards and 4 replicas) and the whole size of index is of around 2.5 GB.
> >
> > We use this collection to fetch the most clicked products against a
> > query and boost in search results. Boost score is the query score on
> > aggregation collection.
> >
> > But when the query goes to different replica we get different boost
> > score for some of the keywords, hence on page refresh results ordering
> > keep on changing.
> >
> > In order to solve we tried the exactstats cache for distributed IDF and
> > on debug level I am seeing global stats merge in logs but still the
> > different scores coming on refreshing the results from aggregation
> > collection.
> >
> > Our indexing occur once a day so should we do daily optimization or
> > should we reduce merge segment count to 2/3 currently it is -1.
> >
> > What are your suggestions on this?
> >
> > Regards,
> > Aman
> >
> > On Fri, Feb 8, 2019, 00:15 Erick Erickson <erickerick...@gmail.com wrote:
> >
> >> Optimization is safe. The large segment is irrelevant, you'll
> >> lose a little parallelization, but on an index with this few
> >> documents I doubt you'll notice.
> >>
> >> As of Solr 5, optimize will respect the max segment size
> >> which defaults to 5G, but you're well under that limit.
> >>
> >> Best,
> >> Erick
> >>
> >> On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht <bishtashis...@gmail.com>
> >> wrote:
> >> >
> >> > Thanks Erick and everyone. We are checking on stats cache.
> >> >
> >> > I noticed stats skew again and optimized the index to correct the
> >> > same. As per the documents.
> >> >
> >> > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
> >> > and
> >> > https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
> >> >
> >> > wanted to check on below points considering we want stats skew to be
> >> > corrected.
> >> >
> >> > 1. When optimized single segment won't be natural merged easily. As we
> >> > might be doing manual optimize every time, what I visualize is at a
> >> > certain point in future we might be having a single large segment.
> >> > What impact this large segment is going to have?
> >> > Our index ~30k documents i.e files with content (Segment size <1Gb as
> >> > of now)
> >> >
> >> > 1. Do you recommend going for optimize in these situations? Probably
> >> > it will be done only when stats skew. Is it safe?
> >> >
> >> > Regards
> >> > Ashish
> >> >
> >> > --
> >> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html