Hi, I am indexing documents into a 10-shard collection (testcollection, with no replicas) in a Solr 6 cluster using CloudSolrClient. Looking at the Solr logs, I see a lot of peer-to-peer document distribution going on.
An example log statement is as follows:

2017-06-01 06:07:28.378 INFO (qtp1358444045-3673692) [c:testcollection s:shard8 r:core_node7 x:testcollection_shard8_replica1] o.a.s.u.p.LogUpdateProcessorFactory [testcollection_shard8_replica1] webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=http://10.199.42.29:8983/solr/testcollection_shard7_replica1/&wt=javabin&version=2}{add=[BQECDwZGTCEBHZZBBiIP (1568981383488995328), BQEBBQZB2il3wGT/0/mB (1568981383490043904), BQEBBQZFnhOJRj+m9RJC (1568981383491092480), BQEGBgZIeBE1klHS4fxk (1568981383492141056), BQEBBQZFVTmRx2VuCgfV (1568981383493189632)]} 0 25

When I went through the CloudSolrClient code on grepcode, I saw that the client itself works out which server to hit by hashing the message id and looking up the shard range information in state.json. So it is confusing to me why documents are being forwarded between peers (here from shard7 to shard8), given that there is no replication and each shard consists only of its leader.

I would like to know why this is happening and how to avoid it, or whether the log statement above means something else and I am misinterpreting it.

-- Sathyam Doraswamy