Hi Erick,

I find this thread very relevant to anyone who is facing the same
problem.

In our case, we have a signals aggregation collection with around 8
million records in total. We run SolrCloud (3 shards and 4 replicas), and
the total index size is around 2.5 GB.

We use this collection to fetch the most-clicked products for a query and
boost them in the search results. The boost score is the query score on the
aggregation collection.

But when the query goes to a different replica, we get a different boost
score for some of the keywords, so the ordering of results keeps changing
on page refresh.

To solve this, we tried the ExactStatsCache for distributed IDF. At debug
level I can see the global stats merge in the logs, but we still get
different scores when refreshing results from the aggregation collection.

Our indexing occurs once a day, so should we optimize daily, or should we
reduce the merge segment count to 2 or 3? Currently it is -1.
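
For the daily optimize option, this is roughly what I have in mind: a
minimal sketch that calls the update handler with optimize=true after
the daily indexing run. The base URL and the collection name
"signals_aggr" are placeholders, not our real values:

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url: str, collection: str, max_segments: int = 1) -> str:
    """Build the update-handler URL that requests an optimize."""
    params = urlencode({
        "optimize": "true",
        "maxSegments": max_segments,
        "waitSearcher": "true",
    })
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


def run_optimize(base_url: str, collection: str, max_segments: int = 1,
                 timeout: float = 3600.0) -> int:
    """Send the optimize request and return the HTTP status code."""
    url = optimize_url(base_url, collection, max_segments)
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


# Placeholder values; this only prints the URL and does not contact Solr.
print(optimize_url("http://localhost:8983/solr", "signals_aggr"))
```

Calling run_optimize(...) would be scheduled once the daily indexing job
has finished.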

What are your suggestions on this?

Regards,
Aman

On Fri, Feb 8, 2019, 00:15 Erick Erickson <erickerick...@gmail.com> wrote:

> Optimization is safe. The large segment is irrelevant; you'll
> lose a little parallelization, but on an index with this few
> documents I doubt you'll notice.
>
> As of Solr 7.5, optimize will respect the max segment size,
> which defaults to 5G, but you're well under that limit.
>
> Best,
> Erick
>
> On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht <bishtashis...@gmail.com>
> wrote:
> >
> > Thanks Erick and everyone. We are checking on the stats cache.
> >
> > I noticed stats skew again and optimized the index to correct it, as
> > per these documents:
> >
> > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
> > and
> > https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
> >
> > I wanted to check on the points below, considering we want the stats
> > skew to be corrected.
> >
> > 1. Once optimized, a single segment won't be merged naturally very
> > easily. As we might be doing a manual optimize every time, I expect
> > that at some point in the future we will have a single large
> > segment. What impact is this large segment going to have?
> > Our index is ~30k documents, i.e. files with content (segment size
> > <1 GB as of now).
> >
> > 2. Do you recommend optimizing in these situations? It would probably
> > be done only when the stats skew. Is it safe?
> >
> > Regards
> > Ashish
> >
> >
> >
> >
> >
> >
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
