You can try tuning "autoCommit" (and "autoSoftCommit") so that the TLOG replicas fetch segments more frequently. Whether it helps depends on what those values are currently set to. As with any change, it is best to test.
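
For illustration, here is a minimal sketch of how those two settings could be changed through Solr's Config API. It only builds the JSON body of a "set-property" request. The interval values (5000 ms and 1000 ms) and the helper name build_commit_settings are placeholders of mine, not recommendations, so pick values that match your own latency and load requirements.

```python
import json


def build_commit_settings(hard_commit_ms: int, soft_commit_ms: int) -> dict:
    """Build a Config API 'set-property' body for the commit intervals.

    Both arguments are in milliseconds. The numbers passed below are
    illustrative assumptions, not tuned values.
    """
    return {
        "set-property": {
            # Hard commit interval (the "autoCommit" setting).
            "updateHandler.autoCommit.maxTime": hard_commit_ms,
            # Soft commit interval (the "autoSoftCommit" setting).
            "updateHandler.autoSoftCommit.maxTime": soft_commit_ms,
        }
    }


payload = build_commit_settings(hard_commit_ms=5000, soft_commit_ms=1000)
print(json.dumps(payload, indent=2))
```

The resulting JSON would be sent as a POST to the collection's /config endpoint (for example http://localhost:8983/solr/<collection>/config, where the host and port are assumptions). The same values can also be set in the <autoCommit> and <autoSoftCommit> elements of solrconfig.xml.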

From: [email protected] At: 12/17/25 17:04:16 UTC-5:00
To: [email protected]
Cc: [email protected]
Subject: Re: Search results consistency with vector search

Thanks, Andrey, for your suggestion.

We do need to support near-real-time searches, and we do have frequent index updates, so I believe TLOG replicas can't be used. Is there a way TLOG can support near-real-time searches, for example by tuning commit intervals?

Regards,
Rajeswari

On 12/17/25, 1:02 PM, "Andrey Ukhanov (BLOOMBERG/ 919 3RD A)" <[email protected]> wrote:

In a SolrCloud cluster with multiple NRT replicas, the leader node receives the updates and distributes them to the non-leader replicas. The important and relevant aspect to highlight is that each NRT replica then builds and manages its segments individually. That means the segment structure diverges across replicas.

The HNSW graph is created per segment. Since the segment structure differs across replicas, it can lead to the behavior you are observing, where relevance results differ across replicas.

To ensure consistency of results across replicas, the segment structure needs to be the same. There are a few ways to accomplish this:

1) Use TLOG (and PULL if applicable) replication. Unlike NRT, TLOG ensures that the segment structure is the same across replicas. More on that here - https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-shards-indexing.html#types-of-replicas

2) If your index is static (doesn't change very often), you can explore using the optimize command or re-creating replicas from the leader.
More on that here - https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-update-handlers.html#commit-and-optimize-during-updates

Personally I would recommend option 1.

From: [email protected] At: 12/17/25 14:56:04 UTC-5:00
To: [email protected]
Cc: [email protected]
Subject: Re: Search results consistency with vector search

Thanks for your follow-up. We are using NRT replicas.

On 12/17/25, 11:01 AM, "Andrey Ukhanov (BLOOMBERG/ 919 3RD A)" <[email protected]> wrote:

Hi Rajeswari, what replication model are you using in Solr? NRT or TLOG/PULL?

From: [email protected] At: 12/17/25 13:59:48 UTC-5:00
To: [email protected]
Cc: [email protected]
Subject: Search results consistency with vector search

Hi All,

Noticed that the vector search results for the same query are different each time. Both the ordering and the records returned differ depending on which replica the query hits. All the replicas have the same documents, and all of them have the same embeddings.
With the vector similarity parser and minReturn=0.8, minTraverse=0.8, the numFound for a specific query varies from 111 to 8, which is a huge variation. We are using Solr 9.9 and Lucene version 9.12.2. I believe this behavior is due to the approximate HNSW construction in each replica.

I tried minTraverse as 0.75 instead of 0.8. This fetches more records (somewhere in the 800s) and the variation in numFound is smaller, but the ordering of the records, and even the records themselves, still differ each time.

Is this expected? What can be done to get consistent results each time? Please share your experiences.

Thanks,
Rajeswari
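
One way to quantify the divergence described above is to compare the document IDs that two replicas return for the same query. The following is a minimal sketch: the function name compare_result_ids and the sample IDs are hypothetical, and it assumes the ID lists have already been collected, for example by querying each replica core directly with distrib=false.

```python
def compare_result_ids(ids_a, ids_b):
    """Summarize how two ranked lists of document IDs differ."""
    set_a, set_b = set(ids_a), set(ids_b)
    union = set_a | set_b
    # Jaccard overlap: 1.0 means both replicas returned the same set of IDs.
    jaccard = len(set_a & set_b) / len(union) if union else 1.0
    return {
        "jaccard": jaccard,
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
        # True only when the same IDs appear in the same order.
        "same_order": list(ids_a) == list(ids_b),
    }


# Hypothetical results from two replicas for the same vector query.
replica_a = ["doc1", "doc2", "doc3", "doc4"]
replica_b = ["doc2", "doc1", "doc3", "doc5"]

report = compare_result_ids(replica_a, replica_b)
print(report)
```

Running this before and after a change (such as switching replica types) gives a simple number to track instead of eyeballing numFound.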
