Hi,

I’ve been looking into CouchDB 2’s clustering works, and have a question about 
performance characteristics of sharding.

As I understand it, when you first configure a cluster you pick the number of 
shards. Each node has to have at least one shard on it, and each shard can be 
duplicated N times, where N best practice is considered 2-3x. How many shards 
you have is decided on DB creation time, and if you need more later (because 
maxnodes = shards * replicas) you need to replicate into a new cluster.

I’m wondering if:
 - There is a recommended number of shards to use, or a recommended range to 
stay in
 - If there is any known performance characteristics that map to how many 
shards you have. e.g., how differently would a one node “cluster” perform with 
2 shards compared to 16 or even 256. Is there any harm in configuring your 
cluster with 16 shards say, even if you aren’t planning on having 16-32 nodes 
any time soon.

While replicating into a larger cluster is safe, we have a lot (thousands) of 
PouchDB clients who would have to re-download the entire changes feed to 
continue replicating. As these clients are on spotty / slow / expensive 
connections, so ideally we’d be able to get the number of shards right first 
time.

Stefan

Reply via email to