> On Jul 15, 2021, at 8:28 AM, Shaurya Gupta <shaurya.n...@gmail.com> wrote:
>
> Hi Jeff
>
> Thanks for the detailed answer.
>
> We are using Cassandra 3.11.2. I believe it would fall in the new-version
> category and, IIUC, should be able to handle the anti-entropy repair issue
> by adjusting the Merkle tree depth?
Newest 3.11, yes - but I don't think 3.11.2 has it; just go to the newest 3.11.

> The cross-DC read repair chance is disabled in our setup, so we should be
> good.
>
>> Again, in 12x3, if one replica goes down beyond the hint window, when it
>> comes up it's getting 35 copies of data, which is going to overwhelm it
>> when it streams and compacts.
>
> Thanks for pointing this out. IIUC this would also increase the bootstrap
> time when the node comes back up.

No - bootstrap should be same-DC if you're using a DC-aware replication
strategy, and it's one copy of each range anyway.

> Will it also affect the time taken for a replaced node to report UN?

No.

> AFAIK during node replacement a particular token range is streamed from
> only one of the replicas, so the time taken for replacing a node should
> remain the same?
>
>> On Wed, Jul 14, 2021 at 8:00 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> Hi,
>>
>> So, there are two places where you'll see the impact of "lots of
>> datacenters".
>>
>> On the query side, global quorum queries (and queries with cross-DC
>> probabilistic read repair) may touch more DCs and be slower, and read
>> repairs during those queries get more expensive. Your geography matters a
>> ton for latency, and your write consistency and network quality matter a
>> ton for read repairs. During the read, the coordinator will track which
>> replicas are mismatching and build mutations to bring them in sync - that
>> buildup will accumulate more data if you're very out of sync.
>>
>> The other thing you should expect is different behavior during repairs.
>> Anti-entropy repairs do pair-wise Merkle trees. If you imagine 6, 8, or
>> 12 datacenters of 3 copies each, you've got 18, 24, or 36 copies of data,
>> and each of those holds a Merkle tree. The repair coordinator will have a
>> lot more data in memory; adjusting the tree depth in newer versions, or
>> using the offheap option in 4.0, starts removing the GC pressure on the
>> coordinator in those types of topologies.
>> In older versions, using subrange repair and lots of smaller ranges will
>> avoid very deep trees and keep memory tolerable. ALSO, when you do have a
>> mismatch, you're going to stream a LOT of data. Again, in 12x3, if one
>> replica goes down beyond the hint window, when it comes up it's getting
>> 35 copies of data, which is going to overwhelm it when it streams and
>> compacts. CASSANDRA-3200 helps this in 4.0, and incremental repair helps
>> if you're running incremental repair (again, probably after
>> CASSANDRA-9143 in 4.0), but the naive approach can lead to really bad
>> surprises.
>>
>>> On Wed, Jul 14, 2021 at 7:17 AM Shaurya Gupta <shaurya.n...@gmail.com>
>>> wrote:
>>> Hi,
>>>
>>> Multiple DCs are required to maintain lower latencies for requests
>>> across the globe. I agree that it's a lot of redundant copies of data.
>>>
>>>> On Wed, Jul 14, 2021, 7:00 PM Jim Shaw <jxys...@gmail.com> wrote:
>>>> Shaurya:
>>>> What's the purpose of having so many data centers?
>>>> With RF=3 you have 3 copies of data within a data center.
>>>> If you have 3 DCs, that means 9 copies of data.
>>>> Think about the space and network bandwidth wasted on that many copies.
>>>> BTW, ours is just 2 DCs, for regional DR.
>>>>
>>>> Thanks,
>>>> Jim
>>>>
>>>>> On Wed, Jul 14, 2021 at 2:27 AM Shaurya Gupta
>>>>> <shaurya.n...@gmail.com> wrote:
>>>>> Hi
>>>>>
>>>>> Does anyone have suggestions on the maximum number of data centers a
>>>>> NetworkTopologyStrategy keyspace can have - not only technically, but
>>>>> considering performance as well?
>>>>> In each data center the RF is 3.
>>>>>
>>>>> Thanks!
>>>>> --
>>>>> Shaurya Gupta
>
> --
> Shaurya Gupta
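The replica arithmetic in the thread (12 DCs x RF=3 = 36 copies, so a node that misses the hint window can receive streams from the other 35 replicas) can be sketched as follows. This is a minimal illustration of the counting only; the function names are my own and not part of any Cassandra API.

```python
def total_copies(num_dcs: int, rf_per_dc: int) -> int:
    """Total replicas of each token range across the whole cluster,
    assuming NetworkTopologyStrategy with the same RF in every DC."""
    return num_dcs * rf_per_dc


def peers_streaming_after_outage(num_dcs: int, rf_per_dc: int) -> int:
    """If one replica is down past the hint window, each of the OTHER
    replicas may hold writes it missed, so naive anti-entropy repair can
    end up streaming (overlapping) data from all of them."""
    return total_copies(num_dcs, rf_per_dc) - 1


# Jeff's 12x3 example: 36 copies total, 35 potential stream sources.
print(total_copies(12, 3), peers_streaming_after_outage(12, 3))
# Jim's 3x3 example: 9 copies of every row.
print(total_copies(3, 3))
```

This is also why Jim's objection holds: every added DC multiplies storage and cross-DC replication traffic by another full RF.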