Why wouldn't the performance hit happen for 8.3.1?

On Wed, 27 Oct 2021, 20:07 Dominic Humphries, <[email protected]>
wrote:

> At last, I think we've got it!
>
> Our external boost files live on an NFS volume so they can be updated once
> by a worker machine and all the followers will get the update. Which is all
> very nice.
>
> But if we source those files from the local filesystem instead of
> one mounted from the network, the performance issue goes away!
>
> I've tested this manually and it looks good; I'm now in the process of
> updating our terraform etc so the instances will be able to use local
> copies of these files. Assuming the update works, the matter will finally
> be fixed!
>
> So the reason we were seeing performance issues was that we were using
> NFS-mounted external files to update our boosts - which is probably
> edge-case enough to be why nobody else was reporting it!
>
> I'll update one last time to confirm all is well with the new images, and
> hopefully this issue can be put to bed at last.
>
> Thanks all for your help!
>
> Dominic
>
> On Tue, 26 Oct 2021 at 15:31, Dominic Humphries <[email protected]>
> wrote:
>
> > No problem, I've been trying to get my head around how it all works
> > myself!
> >
> > As per
> > https://solr.apache.org/guide/8_9/working-with-external-files-and-processes.html
> > our schema defines a field type:
> >     <fieldType name="fileboost" keyField="id" defVal="1" stored="false"
> >                indexed="false" class="solr.ExternalFileField"/>
> > which is then used to define a field:
> >     <field name="boostvalue" type="fileboost"/>
> > which pulls data from a file, external_boostvalue, living
> > in $SOLR_HOME/data
> >
> > This is used to set a boost value that increases the visibility of some
> > search results.
> >
> > Emptying this file completely removes the performance hit we see after
> > each replication, which otherwise takes several minutes to resolve. But
> > we still need the functionality, and I'm unclear on why this is an issue
> > for 8.9 when it wasn't for 8.3.
> >
> > Hope this clarifies the problem!
> >
> > Dominic
> >
> > On Mon, 25 Oct 2021 at 19:03, Charlie Hull <
> > [email protected]> wrote:
> >
> >> Hi Dominic,
> >>
> >> Could you clarify what you mean by boost files in this context? Just
> >> curious....
> >>
> >> Charlie
> >>
> >> On 25/10/2021 17:11, Dominic Humphries wrote:
> >> > Performance with the replica pulling from 8.3.1 was actually worse. And
> >> > looking at the data in the databases and the boost file contents, I'm
> >> > dubious it's a problem of incompatible boost files. I think the
> >> > performance of importing/applying the boosts really is what's
> >> > responsible for the issue we see. Not sure what else to test to verify
> >> > or disprove this..
> >> >
> >> > On Mon, 25 Oct 2021 at 14:56, Dominic Humphries <[email protected]> wrote:
> >> >
> >> >> I think I found it!
> >> >>
> >> >> I didn't realise, but we have boost files for the core I'm testing and
> >> >> the boost is applied after replication! Setting the contents of the
> >> >> files to empty completely removes the post-replication performance
> >> >> problem we were seeing.
> >> >>
> >> >> So now my question becomes "Why is boosting taking so much longer for
> >> >> the upgrade?"
> >> >>
> >> >> Since the upgrade has its own independent set of data, I'm wondering if
> >> >> it's as simple as the IDs it's trying to boost don't exist and it takes
> >> >> longer to find out an item is missing than it does to find one that
> >> >> does? I believe I can point an 8.9.0 follower at an 8.3.1 leader, that
> >> >> seems like the next logical step - if there's no performance hit when it
> >> >> has the same data as the 8.3.1 replica, then that's almost certainly the
> >> >> problem.
> >> >>
> >> >> Fingers crossed!
> >> >>
> >> >> On Sun, 24 Oct 2021 at 10:26, Deepak Goel <[email protected]> wrote:
> >> >>
> >> >>> There could be some testing and cooling happening post-replication.
> >> >>> Will have to dig a bit more into the code.
> >> >>>
> >> >>> Deepak
> >> >>>
> >> >>> On Thu, Oct 21, 2021 at 9:57 PM Dominic Humphries
> >> >>> <[email protected]> wrote:
> >> >>>
> >> >>>> One more tidbit: I just tried leaving replication off for a few hours
> >> >>>> and then triggering a "big" replication run so I could see the
> >> >>>> distinct stages.
> >> >>>>
> >> >>>>     - Beginning replication didn't cause any performance degradation.
> >> >>>>     - Several minutes of downloading the replication files saw no
> >> >>>>     degradation
> >> >>>>     - Only after downloading had completed did we start to see
> >> >>>>     performance issues in our tests
> >> >>>>     - But we saw the "number of docs/timestamp of latest file" both
> >> >>>>     jump almost immediately after downloading completed and never
> >> >>>>     move again
> >> >>>>     - But the performance degradation continued for about seven more
> >> >>>>     minutes even though replication was clearly finished at this point
> >> >>>>
> >> >>>>
> >> >>>> Is there some kind of re-indexing optimization thing that solr can
> >> >>>> run post-replication? At this point it's about my only remaining
> >> >>>> suspect..
> >> >>>>
> >>
> >>
> >
>
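
For anyone landing on this thread later: the fix described above (serve the
ExternalFileField file from local disk rather than straight off NFS) could be
sketched roughly as below. This is only an illustration, not what Dominic
actually deployed. The function names, the directory layout and the
".sync_tmp_" prefix are all made up for the example. The only things taken
from the thread are the file name external_boostvalue and its location under
$SOLR_HOME/data; the key=value line format is the one the Solr guide
describes for external files. The copy goes to a temporary file first and is
then renamed, so Solr never sees a half-written file.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Mapping


def write_boost_file(path: Path, boosts: Mapping[str, float]) -> None:
    """Write boosts as `key=value` lines, sorted by key."""
    lines = [f"{key}={value}" for key, value in sorted(boosts.items())]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def sync_external_file(source: Path, dest_dir: Path) -> Path:
    """Copy `source` (e.g. on an NFS mount) into local `dest_dir` atomically."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    final_path = dest_dir / source.name
    # The temp file lives in the destination directory so the rename stays on
    # one filesystem. Its (hypothetical) prefix deliberately does not start
    # with "external_", so it cannot be mistaken for a boost file.
    fd, tmp_name = tempfile.mkstemp(dir=dest_dir, prefix=".sync_tmp_")
    os.close(fd)
    try:
        # shutil.copy copies the contents and the permission bits.
        shutil.copy(source, tmp_name)
        # os.replace is an atomic rename when both paths share a filesystem.
        os.replace(tmp_name, final_path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        nfs_dir = Path(root) / "nfs"     # stand-in for the NFS mount
        local_dir = Path(root) / "data"  # stand-in for $SOLR_HOME/data
        nfs_dir.mkdir()
        src = nfs_dir / "external_boostvalue"
        write_boost_file(src, {"doc2": 1.5, "doc1": 2.0})
        copied = sync_external_file(src, local_dir)
        print(copied.read_text(encoding="utf-8"))
```

Something like this would be run on each follower (from cron or similar)
whenever the worker publishes a new file to the shared volume. Whether that
fits your setup depends on how the updated file gets picked up on your side.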
