Thanks, Erick, for your suggestions and time.

On Tue, Feb 12, 2019, 22:32 Erick Erickson <erickerick...@gmail.com> wrote:
> You really only have four options:
> 1> use exactstats. This won't guarantee precise matches, but they'll be
> closer
> 2> optimize (not particularly recommended, but if you're willing to do
> it periodically it'll have the stats match until the next updates).
> 3> use TLOG/PULL replicas and confine the requests to the PULL
> replicas. There'll _still_ be some window for mismatches,
> specifically the default is commit_interval/2
> 4> define the problem away.
>
> Best,
> Erick
>
> On Tue, Feb 12, 2019 at 2:42 AM Aman Tandon <amantandon...@gmail.com>
> wrote:
> >
> > Hi Erick,
> >
> > Any suggestions on this?
> >
> > Regards,
> > Aman
> >
> > On Fri, Feb 8, 2019, 17:07 Aman Tandon <amantandon...@gmail.com> wrote:
> >
> > > Hi Erick,
> > >
> > > I find this thread very relevant to the people who are facing the same
> > > problem.
> > >
> > > In our case, we have a signals aggregation collection with a total of
> > > around 8 million records. We have a SolrCloud architecture (3 shards
> > > and 4 replicas), and the whole index is around 2.5 GB.
> > >
> > > We use this collection to fetch the most clicked products for a query
> > > and boost them in search results. The boost score is the query score on
> > > the aggregation collection.
> > >
> > > But when the query goes to a different replica we get a different boost
> > > score for some of the keywords, hence on page refresh the ordering of
> > > results keeps changing.
> > >
> > > In order to solve this we tried the exactstats cache for distributed IDF,
> > > and at debug level I am seeing the global stats merge in the logs, but
> > > different scores are still coming on refreshing the results from the
> > > aggregation collection.
> > >
> > > Our indexing occurs once a day, so should we do a daily optimization, or
> > > should we reduce the merge segment count to 2/3? Currently it is -1.
> > >
> > > What are your suggestions on this?
> > >
> > > Regards,
> > > Aman
> > >
> > > On Fri, Feb 8, 2019, 00:15 Erick Erickson <erickerick...@gmail.com>
> > > wrote:
> > >
> > >> Optimization is safe. The large segment is irrelevant, you'll
> > >> lose a little parallelization, but on an index with this few
> > >> documents I doubt you'll notice.
> > >>
> > >> As of Solr 7.5, optimize will respect the max segment size
> > >> which defaults to 5G, but you're well under that limit.
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht <bishtashis...@gmail.com>
> > >> wrote:
> > >> >
> > >> > Thanks Erick and everyone. We are checking on the stats cache.
> > >> >
> > >> > I noticed stats skew again and optimized the index to correct it.
> > >> > As per the documents
> > >> >
> > >> > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
> > >> > and
> > >> > https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
> > >> >
> > >> > I wanted to check on the points below, considering we want the stats
> > >> > skew to be corrected.
> > >> >
> > >> > 1. Once optimized, a single segment won't be naturally merged easily.
> > >> > As we might be doing a manual optimize every time, what I visualize is
> > >> > that at a certain point in the future we might have a single large
> > >> > segment. What impact is this large segment going to have?
> > >> > Our index is ~30k documents, i.e. files with content (segment size
> > >> > <1 GB as of now).
> > >> >
> > >> > 2. Do you recommend going for optimize in these situations? Probably
> > >> > it will be done only when the stats skew. Is it safe?
> > >> >
> > >> > Regards
> > >> > Ashish
> > >> >
> > >> > --
> > >> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
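
To illustrate the stats skew discussed in this thread: a minimal sketch, assuming the default BM25 similarity, of how two replicas of the same shard can score the same term differently. Deleted documents keep counting toward term statistics until their segments are merged away, and each replica merges on its own schedule. The function name and all the numbers below are hypothetical, chosen only to show the effect; this is not Solr code.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF term in the form used by Lucene's BM25Similarity:
    ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for the SAME shard as seen by two replicas.
# Replica A has merged away its deleted documents; replica B has not,
# so its counts still include documents that are no longer live.
replica_a = {"doc_count": 30_000, "doc_freq": 1_200}
replica_b = {"doc_count": 31_500, "doc_freq": 1_350}

idf_a = bm25_idf(**replica_a)
idf_b = bm25_idf(**replica_b)

# The two values differ, so the same query term contributes a different
# score depending on which replica answers the request.
print(f"replica A idf = {idf_a:.4f}")
print(f"replica B idf = {idf_b:.4f}")
print(f"difference    = {abs(idf_a - idf_b):.4f}")
```

An optimize removes the deleted documents on every replica, which is why the scores line up again until the next round of updates.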