Personally I would index to one server and replicate the index out to the search servers on a short interval. NRT is just a synonym for replicating the index as often as needed. This would provide consistent results and would not require SolrCloud at all.
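Whichever replication setup is chosen, it may help to confirm whether the search servers really return the same documents. A minimal sketch, assuming each core can be queried directly over HTTP and that the unique key field is called "id": it adds distrib=false so the request stays on the core it hits, then reports the overlap between the id lists. The core URLs, the field name, and the helper names are hypothetical and only for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_replica_url(core_url, params):
    """Build a /select URL that is answered by this one core only."""
    query = dict(params)
    query["distrib"] = "false"  # do not fan the request out to other replicas
    return core_url.rstrip("/") + "/select?" + urllib.parse.urlencode(query)


def fetch_ids(core_url, params, id_field="id"):
    """Return the ordered list of ids one core returns (id_field is an assumption)."""
    with urllib.request.urlopen(build_replica_url(core_url, params)) as resp:
        body = json.load(resp)
    return [doc[id_field] for doc in body["response"]["docs"]]


def jaccard(ids_a, ids_b):
    """Set overlap of two result lists; 1.0 means the same documents."""
    a, b = set(ids_a), set(ids_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_replicas(results):
    """Given {replica_name: [ids]}, return the overlap for every pair of replicas."""
    names = sorted(results)
    return {
        (x, y): jaccard(results[x], results[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }
```

To use it, call fetch_ids once per core with the same query parameters (for example the vector query from the thread plus fl=id and a fixed rows value), collect the lists in a dict keyed by replica name, and pass that dict to compare_replicas. An overlap well below 1.0 for the same query suggests the replicas have diverged.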
> On Dec 18, 2025, at 11:18, Andrey Ukhanov (BLOOMBERG/ 919 3RD A)
> <[email protected]> wrote:
>
> You can try tuning "autoCommit" (and "autoSoftCommit") to make the segment
> fetching more frequent. Depending on what values those are currently set to,
> it could help. But as with any change, best to test.
>
> From: [email protected] At: 12/17/25 17:04:16 UTC-5:00
> To: [email protected]
> Cc: [email protected]
> Subject: Re: Search results consistency with vector search
>
> Thanks Andrey for your suggestion. We do need to support near-real-time
> searches, and we do have frequent index updates, so I believe TLOG replicas
> can't be used. Is there a way TLOG can support near-real-time searches, say,
> for example, by tuning commit intervals?
>
> Regards,
> Rajeswari
>
> On 12/17/25, 1:02 PM, "Andrey Ukhanov (BLOOMBERG/ 919 3RD A)"
> <[email protected]> wrote:
>
> In a Solr cloud with multiple NRT replicas the leader node will receive the
> updates and distribute them to non-leader replicas. The important, and
> relevant, aspect to highlight is that each NRT replica will then build/manage
> segments individually. That means segment structure across replicas diverges.
> The HNSW graph is created per segment. Since segment structure is different
> across replicas, it can lead to the behavior you are observing, where
> relevance results differ across replicas.
>
> To ensure consistency of results across replicas, segment structure needs to
> be the same. There are a few ways to accomplish this:
>
> 1) Use TLOG (and PULL if applicable) replication. Unlike NRT, TLOG ensures
> that segment structure is the same across replicas. More on that here -
> https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-shards-indexing.html#types-of-replicas
>
> 2) If your index is static (doesn't change very often), you can explore using
> the optimize command or re-creating replicas from the leader. More on that
> here -
> https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-update-handlers.html#commit-and-optimize-during-updates
>
> Personally I would recommend option 1.
>
> From: [email protected] At: 12/17/25 14:56:04 UTC-5:00
> To: [email protected]
> Cc: [email protected]
> Subject: Re: Search results consistency with vector search
>
> Thanks for your follow-up, we are using NRT replicas.
>
> On 12/17/25, 11:01 AM, "Andrey Ukhanov (BLOOMBERG/ 919 3RD A)"
> <[email protected]> wrote:
>
> Hi Rajeswari, what replication model are you using in Solr? NRT or TLOG/PULL?
>
> From: [email protected] At: 12/17/25 13:59:48 UTC-5:00
> To: [email protected]
> Cc: [email protected]
> Subject: Search results consistency with vector search
>
> Hi All,
>
> Noticed that the vector search results for the same query are different each
> time. Both the ordering and the records differ based on which replica the
> query hits.
>
> All the replicas have the same documents and all of them have the same
> embeddings. With the vector similarity parser with minReturn=0.8,
> minTraverse=0.8, the numFound for a specific query varies from 111 to 8,
> which is a huge variation.
>
> We are using Solr 9.9 and Lucene version 9.12.2. I believe this behavior is
> due to approximate HNSW construction in each replica.
>
> Tried with minTraverse as 0.75 instead of 0.8; this fetches more records
> (somewhere in the 800s) and the variation in numFound is less, but the
> ordering of the records, and even the records themselves, are still different
> each time.
>
> Is this expected? What can be done to get consistent results each time?
> Please share your experiences.
>
> Thanks,
> Rajeswari
