Hello, I have a use case with the following characteristics:
- High index update rate (adds/updates)
- High query rate
- Low index size (~800 MB for 2.4 million docs)
- The documents that are created at the high rate eventually "expire" and are deleted regularly at half-hour intervals

I currently have a SolrCloud setup with 1 shard and 4 replicas.
* My index updates are sent to a VIP/load balancer (which round-robins to one of the 4 Solr nodes)
* I am using an HTTP client to send the updates
* I am using a batch size of 100, with 8 to 10 threads sending the batches of updates to Solr

When I run tests to scale out the indexing rate, I see the following:
* Solr nodes go into recovery
* Updates take a really long time to complete

As I understand it, when a node receives an update:
* If it is the leader, it forwards the update to all the replicas and waits until it receives a reply from all of them before replying to the client that sent the update.
* If it is not the leader, it forwards the update to the leader, which THEN performs the steps above.

How do I go about scaling the index updates?
* As I add more replicas, will my updates get slower and slower?
* Is there a way I can configure the leader to wait for, say, only N out of M replicas?
* Should I be targeting the updates to only the leader?
* Is there any other approach I should be considering?

Thanks,
Vinay