Why wouldn't the performance hit happen for 8.3.1?

On Wed, 27 Oct 2021, 20:07 Dominic Humphries, <[email protected]> wrote:
> At last, I think we've got it!
>
> Our external boost files live on an NFS volume so they can be updated once
> by a worker machine and all the followers will get the update. Which is all
> very nice.
>
> But if we instead source those files from the local filesystem instead of
> one mounted from the network, the performance issue goes away!
>
> I've tested this manually and it looks good; I'm now in the process of
> updating our terraform etc so the instances will be able to use local
> copies of these files. Assuming the update works, the matter will finally
> be fixed!
>
> So the reason we were seeing performance issues was that we were using
> NFS-mounted external files to update our boosts - which is probably
> edge-case enough to be why nobody else was reporting it!
>
> I'll update one last time to confirm all is well with the new images, and
> hopefully this issue can be put to bed at last.
>
> Thanks all for your help!
>
> Dominic
>
> On Tue, 26 Oct 2021 at 15:31, Dominic Humphries <[email protected]> wrote:
>
> > No problem, I've been trying to get my head around how it all works myself!
> >
> > As per
> > https://solr.apache.org/guide/8_9/working-with-external-files-and-processes.html
> > our schema defines a field type:
> > <fieldType name="fileboost" keyField="id" defVal="1" stored="false"
> > indexed="false" class="solr.ExternalFileField"/>
> > which is then used to define a field:
> > <field name="boostvalue" type="fileboost"/>
> > which pulls data from a file, external_boostvalue, living
> > in $SOLR_HOME/data
> >
> > This is used to set a boost value that increases the visibility of some
> > search results.
> >
> > Setting this file to be empty completely removes the performance hit we
> > see taking several minutes to resolve after each replication. But we do
> > need the functionality still, and I'm unclear on why this is an issue for
> > 8.9 when it wasn't for 8.3
> >
> > Hope this clarifies the problem!
> >
> > Dominic
> >
> > On Mon, 25 Oct 2021 at 19:03, Charlie Hull <[email protected]> wrote:
> >
> >> Hi Dominic,
> >>
> >> Could you clarify what you mean by boost files in this context? Just
> >> curious....
> >>
> >> Charlie
> >>
> >> On 25/10/2021 17:11, Dominic Humphries wrote:
> >> > Performance with the replica pulling from 8.3.1 was actually worse. And
> >> > looking at the data in the databases and the boost file contents, I'm
> >> > dubious it's a problem of incompatible boost files. I think the performance
> >> > of importing/applying the boosts really is what's responsible for the issue
> >> > we see. Not sure what else to test to verify or disprove this..
> >> >
> >> > On Mon, 25 Oct 2021 at 14:56, Dominic Humphries <[email protected]> wrote:
> >> >
> >> >> I think I found it!
> >> >>
> >> >> I didn't realise, but we have boost files for the core I'm testing and the
> >> >> boost is applied after replication! Setting the contents of the files to
> >> >> empty completely removes the post-replication performance problem we were
> >> >> seeing.
> >> >>
> >> >> So now my question becomes "Why is boosting taking so much longer for the
> >> >> upgrade?"
> >> >>
> >> >> Since the upgrade has its own independent set of data, I'm wondering if
> >> >> it's as simple as the IDs it's trying to boost don't exist and it takes
> >> >> longer to find out an item is missing than it does to find one that does? I
> >> >> believe I can point an 8.9.0 follower at an 8.3.1 leader, that seems like
> >> >> the next logical step - if there's no performance hit when it has the same
> >> >> data as the 8.3.1 replica, then that's almost certainly the problem.
> >> >>
> >> >> Fingers crossed!
> >> >>
> >> >> On Sun, 24 Oct 2021 at 10:26, Deepak Goel <[email protected]> wrote:
> >> >>
> >> >>> There could be some testing and cooling happening post-replication. will
> >> >>> have to dig a bit more into the code.
> >> >>>
> >> >>> Deepak
> >> >>> "The greatness of a nation can be judged by the way its animals are
> >> >>> treated - Mahatma Gandhi"
> >> >>>
> >> >>> +91 73500 12833
> >> >>> [email protected]
> >> >>>
> >> >>> Facebook: https://www.facebook.com/deicool
> >> >>> LinkedIn: www.linkedin.com/in/deicool
> >> >>>
> >> >>> "Plant a Tree, Go Green"
> >> >>>
> >> >>> Make In India : http://www.makeinindia.com/home
> >> >>>
> >> >>>
> >> >>> On Thu, Oct 21, 2021 at 9:57 PM Dominic Humphries
> >> >>> <[email protected]> wrote:
> >> >>>
> >> >>>> One more tidbit: I just tried leaving replication off for a few hours and
> >> >>>> then triggering a "big" replication run so I could see the distinct stages.
> >> >>>>
> >> >>>>    - Beginning replication didn't cause any performance degradation.
> >> >>>>    - Several minutes of downloading the replication files saw no
> >> >>>>      degradation
> >> >>>>    - Only after downloading had completed did we start to see performance
> >> >>>>      issues in our tests
> >> >>>>    - But we saw the "number of docs/timestamp of latest file" both jump
> >> >>>>      almost immediately after downloading completed and never move again
> >> >>>>    - But the performance degradation continued for about seven more minutes
> >> >>>>      even though replication was clearly finished at this point
> >> >>>>
> >> >>>>
> >> >>>> Is there some kind of re-indexing optimization thing that solr can run
> >> >>>> post-replication? At this point it's about my only remaining suspect..
> >> >>>>
> >>
> >> --
> >> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> >> <www.o19s.com>
> >> Founding member of The Search Network <https://thesearchnetwork.com/>
> >> and co-author of Searching the Enterprise
> >> <https://opensourceconnections.com/about-us/books-resources/>
> >> tel/fax: +44 (0)8700 118334
> >> mobile: +44 (0)7767 825828
> >>
> >> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> >> Amtsgericht Charlottenburg | HRB 230712 B
> >> Geschäftsführer: John M. Woodell | David E. Pugh
> >> Finanzamt: Berlin Finanzamt für Körperschaften II
> >>
> >
>
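
For anyone landing on this thread later: the workaround described above (having each follower read external_boostvalue from local disk rather than from the NFS mount) might be sketched roughly as below. This is only an illustration and not what was actually deployed. The function name sync_external_file and both example paths are hypothetical, and it assumes the shared file is still published on the network mount and just needs copying into the local data directory that Solr reads from.

```python
import os
import shutil
import tempfile


def sync_external_file(shared_path, local_dir):
    """Copy shared_path into local_dir atomically and return the local path.

    Hypothetical helper: shared_path would be the file on the NFS mount,
    local_dir the local directory Solr reads external files from.
    """
    os.makedirs(local_dir, exist_ok=True)
    target = os.path.join(local_dir, os.path.basename(shared_path))

    # Write to a temp file in the same directory so the final rename stays
    # on one filesystem and readers never see a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=local_dir, prefix=".tmp_")
    try:
        with os.fdopen(fd, "wb") as tmp, open(shared_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
        # mkstemp creates the file owner-only; relax it so another user
        # (for example the one running Solr) can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target
```

It could be run from cron or a deploy hook on each follower, for example sync_external_file("/mnt/nfs/boosts/external_boostvalue", "/var/solr/data"), where both paths are made up for the example. Writing to a temporary file and renaming it means a reader never picks up a partially copied file.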
