Hi All,

Following up on this for assistance. This is a very unusual case that is hurting the user experience for one particular type of search: searches mapped to the affected replica are timing out because these requests take more than 3 seconds.
On Wed, Jul 17, 2024 at 11:37 AM Saksham Gupta <saksham.gu...@indiamart.com> wrote:
> Hi All,
>
> We are using a SolrCloud cluster of 59 shards [1 replica for each shard]
> spread across 8 nodes. We use implicit routing for indexing and
> searching data across these shards.
>
> Upon analyzing the timeouts on Solr, we found that about 84%
> [3097/3693 timeouts on 9th July] of them came from just 1 replica,
> which is larger than the others [the other replicas contain < 5 GB of
> data each, whereas this replica contains 10 GB].
>
> 1. Has anyone faced a similar issue, and how did you mitigate it? Is
> there a way to increase the timeout for a particular replica/node?
>
> 2. Also, has anyone tried to further divide a shard's data into multiple
> shards? How can we plan this, given that there is already a logical
> separation [implicit routing] between the 59 shards, and we would be
> adding another layer of logic to subdivide the data for 1 of the shards?
>
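
To make question 1 concrete, here is a minimal sketch of a client-side workaround, not a confirmed fix. It assumes the timeouts are governed by Solr's `timeAllowed` request parameter and that each request is routed to a single shard with `_route_`. The shard name `shard_large`, the 3000 ms and 6000 ms budgets, and the helper `build_select_params` are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode

# Hypothetical values: the shard name and time budgets are illustrative only.
DEFAULT_TIME_ALLOWED_MS = 3000
SHARD_TIME_ALLOWED_MS = {"shard_large": 6000}


def build_select_params(query, shard):
    """Return URL-encoded /select parameters for a query routed to one shard."""
    # Use the larger budget only for shards listed in the override map.
    time_allowed = SHARD_TIME_ALLOWED_MS.get(shard, DEFAULT_TIME_ALLOWED_MS)
    params = {
        "q": query,
        "_route_": shard,             # with implicit routing, the target shard name
        "timeAllowed": time_allowed,  # per-request time budget in milliseconds
    }
    return urlencode(params)


if __name__ == "__main__":
    print(build_select_params("title:pump", "shard_large"))
    print(build_select_params("title:pump", "shard_07"))
```

This only changes the time budget per request on the client side. It does not make the larger replica faster, so it would be a stopgap until the shard's data can be subdivided.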