Hi, I’ve been looking into CouchDB 2’s clustering works, and have a question about performance characteristics of sharding.
As I understand it, when you first configure a cluster you pick the number of shards. Each node has to have at least one shard on it, and each shard can be duplicated N times, where N best practice is considered 2-3x. How many shards you have is decided on DB creation time, and if you need more later (because maxnodes = shards * replicas) you need to replicate into a new cluster. I’m wondering if: - There is a recommended number of shards to use, or a recommended range to stay in - If there is any known performance characteristics that map to how many shards you have. e.g., how differently would a one node “cluster” perform with 2 shards compared to 16 or even 256. Is there any harm in configuring your cluster with 16 shards say, even if you aren’t planning on having 16-32 nodes any time soon. While replicating into a larger cluster is safe, we have a lot (thousands) of PouchDB clients who would have to re-download the entire changes feed to continue replicating. As these clients are on spotty / slow / expensive connections, so ideally we’d be able to get the number of shards right first time. Stefan
