Hi Toke,

Thank you for following up. Reading back, I could certainly have explained this better. Thanks for asking again.

>> What is a cluster? Is it a fully separate SolrCloud?

Yes, by cluster I mean a fully separate SolrCloud.

>> If so, does that mean you can divide your collection into (at least) 4
>> independent parts, where the indexing flow and the clients knows which
>> cluster to use?

Yes. We can divide the documents across 4 SolrClouds, each with multiple nodes, and the clients would know which SolrCloud to index to.

>> Can it be divided further?

For the sake of maintainability and ease of configuration, we wouldn't want to go beyond 4 SolrClouds, so at this point I would say no. But I am open to ideas if you think it would be greatly advantageous.

So if we go with the 3rd configuration option, we would be indexing roughly 1 billion documents (with an analyzed 'content' field possibly containing large text) per SolrCloud.

I have also since learned of additional configuration details and updated hardware specs, so let me revise my earlier numbers. We would index with a replication factor of 2. Hence each SolrCloud would have 4 x 2 = 8 nodes and 1 billion x 2 = 2 billion documents indexed (with an analyzed 'content' field possibly containing large text). We would have up to 12 GB of heap space allocated per node. By node I mean an individual Solr instance running on a certain port.

To break down the specs, for each SolrCloud:

- 8 nodes, each with 12 GB heap for Solr
- Each node hosting 16 replicas (cores)
- 2 billion documents (replication factor = 2, so 1 billion unique documents)

Would SolrCloud scale well with this configuration under a moderate-to-heavy indexing and search load?

Additional consideration: We have 4 beefy physical servers at our disposal for this deployment. If we go with 4 SolrClouds, then we would have 4 x 8 = 32 nodes (Solr instances) running across these 4 physical servers. Do you see any issues with this configuration, or any additional considerations that I might be missing?
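
To make the arithmetic above easier to check, here is a rough back-of-the-envelope sketch in Python. The function name `sizing` and its parameter names are just mine for illustration. It assumes documents are spread evenly across nodes and cores, and that the 32 Solr instances are spread evenly across the 4 physical servers. Neither will be exactly true in practice.

```python
# Back-of-the-envelope sizing for the layout described in this mail.
# Assumption (not a measured fact): documents and Solr instances are
# distributed perfectly evenly. Real shards and servers will be uneven.

def sizing(solr_clouds=4, nodes_per_cloud=8, heap_gb_per_node=12,
           replicas_per_node=16, unique_docs_per_cloud=1_000_000_000,
           replication_factor=2, physical_servers=4):
    """Return per-server, per-node and per-core totals as a dict."""
    total_nodes = solr_clouds * nodes_per_cloud
    nodes_per_server = total_nodes // physical_servers
    indexed_docs_per_cloud = unique_docs_per_cloud * replication_factor
    cores_per_cloud = nodes_per_cloud * replicas_per_node
    return {
        "total_nodes": total_nodes,
        "nodes_per_server": nodes_per_server,
        # Solr heap only; RAM left for the OS disk cache is not modelled.
        "heap_gb_per_server": nodes_per_server * heap_gb_per_node,
        "indexed_docs_per_cloud": indexed_docs_per_cloud,
        "docs_per_node": indexed_docs_per_cloud // nodes_per_cloud,
        "cores_per_cloud": cores_per_cloud,
        "docs_per_core": indexed_docs_per_cloud // cores_per_cloud,
    }


if __name__ == "__main__":
    for key, value in sizing().items():
        print(f"{key}: {value:,}")
```

With the numbers from this mail, that works out to 8 Solr instances and 96 GB of Solr heap per physical server, about 250 million documents per node, and about 15.6 million documents per core. This only covers heap; it says nothing about the memory left over for the operating system's disk cache, which I have not specified here.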

Thanks,
Rahul

On Sat, Jun 29, 2019 at 1:13 PM Toke Eskildsen <t...@kb.dk> wrote:

> Rahul Goswami <rahul196...@gmail.com> wrote:
> > We are running Solr 7.2.1 and planning for a deployment which will grow to
> > 4 billion documents over time. We have 16 nodes at disposal. I am thinking
> > between 3 configurations:
> >
> > 1 cluster - 16 nodes
> > vs
> > 2 clusters - 8 nodes each
> > vs
> > 4 clusters - 4 nodes each
>
> You haven't got any answers. Maybe because it is a bit unclear what you're
> asking. What is a cluster? Is it a fully separate SolrCloud? If so, does
> that mean you can divide your collection into (at least) 4 independent
> parts, where the indexing flow and the clients knows which cluster to use?
> Can it be divided further?
>
> - Toke Eskildsen