Hi All,

I’m currently dealing with a use case that runs on around 200 nodes. Due to growth of their product and the onboarding of additional data sources, we are looking at expanding that to around 700 nodes, and potentially beyond to 1000+. To that end, I have a couple of questions:
1) For those who have managed clusters at that scale, what operational challenges have you run into that you might not see when operating 100-node clusters? A couple that come to mind: version upgrades (especially major versions) become a lot riskier because a blue/green style deployment of the database is no longer feasible, and backup and restore operations seem far more error-prone for the same reason (having to do an in-place restore instead of being able to spin up a new cluster to restore to).

2) Is there a cluster size beyond which sharding across multiple clusters becomes the recommended approach?

Thanks,
Isaac