Very good question, for which I currently have no answer!
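
In the meantime, for anyone who wants to check whether raw read latency on
the network mount plays a part, here is a minimal sketch in Python. It only
times reading and parsing a file of key=value lines, which is the format
ExternalFileField reads. It does not reproduce what Solr does internally when
it loads the file, so treat any numbers as a rough hint. The function names
are made up for illustration, and repeated reads may be served from the OS
page cache rather than the network.

```python
import time


def parse_boost_file(path):
    """Parse 'key=value' lines into a dict of floats.

    Blank lines and lines without '=' are skipped. This is an
    illustrative parser, not Solr's own loader.
    """
    boosts = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or "=" not in line:
                continue
            # Split on the last '=' so keys may themselves contain '='.
            key, _, value = line.rpartition("=")
            boosts[key] = float(value)
    return boosts


def time_read(path, repeats=3):
    """Return the best wall-clock time in seconds to read and parse the file."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse_boost_file(path)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling time_read() once on the NFS-mounted copy of external_boostvalue and
once on a local copy (substituting your own paths) would show whether plain
file access differs noticeably between the two.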

On Wed, 27 Oct 2021 at 17:15, Deepak Goel <[email protected]> wrote:

> Why wouldn't the performance hit happen for 8.3.1?
>
> On Wed, 27 Oct 2021, 20:07 Dominic Humphries, <[email protected]>
> wrote:
>
> > At last, I think we've got it!
> >
> > Our external boost files live on an NFS volume so they can be updated once
> > by a worker machine and all the followers will get the update. Which is all
> > very nice.
> >
> > But if we source those files from the local filesystem instead of
> > one mounted from the network, the performance issue goes away!
> >
> > I've tested this manually and it looks good; I'm now in the process of
> > updating our terraform etc so the instances will be able to use local
> > copies of these files. Assuming the update works, the matter will finally
> > be fixed!
> >
> > So the reason we were seeing performance issues was that we were using
> > NFS-mounted external files to update our boosts - which is probably
> > edge-case enough to be why nobody else was reporting it!
> >
> > I'll update one last time to confirm all is well with the new images, and
> > hopefully this issue can be put to bed at last.
> >
> > Thanks all for your help!
> >
> > Dominic
> >
> > On Tue, 26 Oct 2021 at 15:31, Dominic Humphries <[email protected]>
> > wrote:
> >
> > > No problem, I've been trying to get my head around how it all works myself!
> > >
> > > As per
> > > https://solr.apache.org/guide/8_9/working-with-external-files-and-processes.html
> > > our schema defines a field type:
> > >     <fieldType name="fileboost" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField"/>
> > > which is then used to define a field:
> > >     <field name="boostvalue" type="fileboost"/>
> > > which pulls data from a file, external_boostvalue, living
> > > in $SOLR_HOME/data
> > >
> > > This is used to set a boost value that increases the visibility of some
> > > search results.
> > >
> > > Setting this file to be empty completely removes the performance hit, which
> > > otherwise takes several minutes to resolve after each replication. But we
> > > still need the functionality, and I'm unclear on why this is an issue for
> > > 8.9 when it wasn't for 8.3.
> > >
> > > Hope this clarifies the problem!
> > >
> > > Dominic
> > >
> > > On Mon, 25 Oct 2021 at 19:03, Charlie Hull <[email protected]> wrote:
> > >
> > >> Hi Dominic,
> > >>
> > >> Could you clarify what you mean by boost files in this context? Just
> > >> curious....
> > >>
> > >> Charlie
> > >>
> > >> On 25/10/2021 17:11, Dominic Humphries wrote:
> > >> > Performance with the replica pulling from 8.3.1 was actually worse. And
> > >> > looking at the data in the databases and the boost file contents, I'm
> > >> > dubious it's a problem of incompatible boost files. I think the performance
> > >> > of importing/applying the boosts really is what's responsible for the issue
> > >> > we see. Not sure what else to test to verify or disprove this..
> > >> >
> > >> > On Mon, 25 Oct 2021 at 14:56, Dominic Humphries <[email protected]> wrote:
> > >> >
> > >> >> I think I found it!
> > >> >>
> > >> >> I didn't realise, but we have boost files for the core I'm testing and the
> > >> >> boost is applied after replication! Setting the contents of the files to
> > >> >> empty completely removes the post-replication performance problem we were
> > >> >> seeing.
> > >> >>
> > >> >> So now my question becomes "Why is boosting taking so much longer for the
> > >> >> upgrade?"
> > >> >>
> > >> >> Since the upgrade has its own independent set of data, I'm wondering if
> > >> >> it's as simple as this: the IDs it's trying to boost don't exist, and it
> > >> >> takes longer to find out an item is missing than it does to find one that
> > >> >> exists? I believe I can point an 8.9.0 follower at an 8.3.1 leader, which
> > >> >> seems like the next logical step - if there's no performance hit when it
> > >> >> has the same data as the 8.3.1 replica, then that's almost certainly the
> > >> >> problem.
> > >> >>
> > >> >> Fingers crossed!
> > >> >>
> > >> >> On Sun, 24 Oct 2021 at 10:26, Deepak Goel <[email protected]> wrote:
> > >> >>
> > >> >>> There could be some testing and cooling happening post-replication. Will
> > >> >>> have to dig a bit more into the code.
> > >> >>>
> > >> >>> Deepak
> > >> >>>
> > >> >>> On Thu, Oct 21, 2021 at 9:57 PM Dominic Humphries
> > >> >>> <[email protected]> wrote:
> > >> >>>
> > >> >>>> One more tidbit: I just tried leaving replication off for a few hours and
> > >> >>>> then triggering a "big" replication run so I could see the distinct stages.
> > >> >>>>
> > >> >>>>     - Beginning replication didn't cause any performance degradation.
> > >> >>>>     - Several minutes of downloading the replication files saw no
> > >> >>>>     degradation
> > >> >>>>     - Only after downloading had completed did we start to see performance
> > >> >>>>     issues in our tests
> > >> >>>>     - But we saw the "number of docs/timestamp of latest file" both jump
> > >> >>>>     almost immediately after downloading completed and never move again
> > >> >>>>     - But the performance degradation continued for about seven more minutes
> > >> >>>>     even though replication was clearly finished at this point
> > >> >>>>
> > >> >>>>
> > >> >>>> Is there some kind of re-indexing optimization thing that solr can run
> > >> >>>> post-replication? At this point it's about my only remaining suspect..
> > >> >>>>
> > >>
> > >
> >
>
