Thanks Reid and Jon! Yes, I will stick with one rack per DC for sure, and I will look at the vnodes problem later on.
What's the difference in terms of reliability between
A) spreading 2 datacenters across 3 AZs, and
B) having 2 datacenters in 2 separate AZs?

Best,
Sergio

On Thu, Oct 24, 2019, 7:36 AM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:

> Hey Sergio,
>
> Forgive me, but I’m at work and had to skim the info quickly.
>
> When in doubt, simplify. So 1 rack per DC. Distributed systems get
> rapidly harder to reason about the more complicated you make them. There’s
> more than enough to learn about C* without jumping into the complexity too
> soon.
>
> To deal with the unbalancing issue, pay attention to Jon Haddad’s advice
> on vnode count and how to fairly distribute tokens with a small vnode
> count. I’d rather point you to his information, as I haven’t dug into
> vnode counts and token distribution in detail; he’s got a lot more time in
> C* than I do. I come at this more as a traditional RDBMS and Java guy who
> has slowly gotten up to speed on C* over the last few years, and dealt with
> DynamoDB a lot so have lived with a lot of similarity in data modelling
> concerns. Detailed internals I only know in cases where I had reason to
> dig into C* source.
>
> There are so many knobs to turn in C* that it can be very easy to
> overthink things. Simplify where you can. Remove GC pressure wherever you
> can. Negotiate with your consumers to have data models that make sense for
> C*. If you have those three criteria foremost in mind, you’ll likely be
> fine for quite some time. And in the times where something isn’t going
> well, simpler is easier to investigate.
>
> R
>
> *From: *Sergio <lapostadiser...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Wednesday, October 23, 2019 at 3:34 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
> Hi Reid,
>
> Thank you very much for clearing up these concepts for me.
> https://community.datastax.com/comments/1133/view.html
> I posted this question on the DataStax forum about our cluster being
> unbalanced, and the reply was that the *number of racks should be a
> multiple of the replication factor*, or 1, in order to be balanced.
> I thought then that if I have 3 availability zones I should have 3 racks
> for each datacenter and not 2 (us-east-1b, us-east-1a) as I have right now
> or, in the easiest way, I should have one rack for each datacenter.
>
> Datacenter: live
> ================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns  Host ID                               Rack
> UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
> UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
> UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
> UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
> UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
> DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
> UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
> UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>
> So each rack label right now matches the availability zone, and we have 3
> Datacenters and 2 Availability Zones with 2 racks per DC, but the above is
> clearly unbalanced.
> If I have a keyspace with a replication factor = 3 and I want to minimize
> the number of nodes needed to scale the cluster up and down while keeping
> it balanced, should I consider an approach like:
>
> OPTION A)
>
> Node  DC          RACK   AZ
> 1     read        ONE    us-east-1a
> 2     read        ONE    us-east-1a
> 3     read        ONE    us-east-1a
> 4     write       ONE    us-east-1b
> 5     write       ONE    us-east-1b
> 6     write       ONE    us-east-1b
>
> OPTION B)
>
> Node  DC          RACK   AZ
> 1     read        ONE    us-east-1a
> 2     read        ONE    us-east-1a
> 3     read        ONE    us-east-1a
> 4     write       TWO    us-east-1b
> 5     write       TWO    us-east-1b
> 6     write       TWO    us-east-1b
> 7     read        ONE    us-east-1c
> 8     write       TWO    us-east-1c
> 9     read        ONE    us-east-1c
>
> Option B looks to be unbalanced and I would exclude it.
>
> OPTION C)
>
> Node  DC          RACK   AZ
> 1     read        ONE    us-east-1a
> 2     read        ONE    us-east-1b
> 3     read        ONE    us-east-1c
> 4     write       TWO    us-east-1a
> 5     write       TWO    us-east-1b
> 6     write       TWO    us-east-1c
>
> So I am thinking of A if I have the restriction of 2 AZs, but I guess that
> option C would be the best. If I have to add another DC for reads because
> we want to assign a new DC for each new microservice it would look like:
>
> OPTION EXTRA DC For Reads
>
> Node  DC          RACK   AZ
> 1     read        ONE    us-east-1a
> 2     read        ONE    us-east-1b
> 3     read        ONE    us-east-1c
> 4     write       TWO    us-east-1a
> 5     write       TWO    us-east-1b
> 6     write       TWO    us-east-1c
> 7     extra-read  THREE  us-east-1a
> 8     extra-read  THREE  us-east-1b
> 9     extra-read  THREE  us-east-1c
>
> The DC for *write* will replicate the data to the other datacenters. My
> goal is to keep the *read* machines dedicated to serving reads and the
> *write* machines dedicated to serving writes. Cassandra will handle the
> replication for me. Is there any other option that I am missing, or a
> wrong assumption?
> I am thinking that I will write a blog post about all my learnings so far.
> Thank you very much for the replies.
>
> Best,
> Sergio
>
> On Wed, Oct 23, 2019 at 10:57, Reid Pinchback <
> rpinchb...@tripadvisor.com> wrote:
>
> No, that’s not correct. The point of racks is to help you distribute the
> replicas, not further-replicate the replicas. Data centers are what do the
> latter. So for example, if you wanted to be able to ensure that you always
> had quorum if an AZ went down, then you could have two DCs where one was in
> each AZ, and use one rack in each DC. In your situation I think I’d be
> more tempted to consider that. Then if an AZ went away, you could fail
> over your traffic to the remaining DC and still be perfectly fine.
>
> For background on replicas vs racks, I believe the information you want is
> under the heading ‘NetworkTopologyStrategy’ at:
>
> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>
> That should help you better understand how replicas distribute.
>
> As mentioned before, while you can choose to do the reads in one DC,
> except for concerns about contention related to network traffic and
> connection handling, you can’t isolate reads from writes. You can
> _*mostly*_ insulate the write DC from the activity within the read DC, and
> even that isn’t an absolute because of repairs. However, your mileage may
> vary, so do what makes sense for your usage pattern.
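
To put numbers on the point above that racks distribute replicas rather than multiply them, here is a minimal arithmetic sketch. The function names are made up for illustration, and RF = 3 in every DC is an assumption; the 100 GB figure and the DC names (live, read, write) come from this thread. The rack count deliberately does not appear in either formula.

```python
def total_footprint_gb(data_gb, rf_per_dc):
    """Raw cluster-wide storage: each DC keeps `rf` copies of the data.

    The number of racks is not an input; racks only decide which nodes
    inside a DC receive those copies.
    """
    return sum(data_gb * rf for rf in rf_per_dc.values())


def per_node_gb(data_gb, rf, nodes_in_dc):
    """Average load per node within one DC, assuming an even token spread."""
    return data_gb * rf / nodes_in_dc


# One DC with RF = 3: 300 GB in total, whether it has 1, 2 or 3 racks.
one_dc = total_footprint_gb(100, {"live": 3})

# Two DCs with RF = 3 each: every DC holds its own 300 GB.
two_dcs = total_footprint_gb(100, {"read": 3, "write": 3})

print(one_dc)                  # 300
print(two_dcs)                 # 600
print(per_node_gb(100, 3, 6))  # 50.0
```

So the 900 GB estimate only applies if there are three DCs with RF = 3 each; adding racks inside a DC leaves the total unchanged.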
>
> R
>
> *From: *Sergio <lapostadiser...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Wednesday, October 23, 2019 at 12:50 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
> Hi Reid,
>
> Thanks for your reply. I really appreciate your explanation.
>
> We are in AWS and right now we are using 2 Availability Zones, not 3. We
> found our cluster really unbalanced because the keyspace has a replication
> factor = 3 and the number of racks is 2 with 2 datacenters.
> We want the writes spread across all the nodes but we wanted the reads
> isolated from the writes to keep the load on those nodes low and to be able
> to identify problems in the consumer (read) or producer (write)
> applications.
> It looks like each rack contains an entire copy of the data, so the
> information would be replicated for each rack and then for each node.
> If I am correct, if we have a keyspace with 100GB and
> Replication Factor = 3 and RACKS = 3 => 100 * 3 * 3 = 900GB
> If I had only one rack across 2 or even 3 availability zones I would save
> space and I would have 300GB only. Please correct me if I am wrong.
>
> Best,
>
> Sergio
>
> On Wed, Oct 23, 2019 at 09:21, Reid Pinchback <
> rpinchb...@tripadvisor.com> wrote:
>
> Datacenters and racks are different concepts. While they don't have to be
> associated with their historical meanings, the historical meanings probably
> provide a helpful model for understanding what you want from them.
>
> When companies own their own physical servers and have them housed
> somewhere, the questions arise on where you want to locate any particular
> server.
> It's a balancing act on things like network speed of related
> servers being able to talk to each other, versus fault-tolerance of having
> many servers not all exposed to the same risks.
>
> "Same rack" in that physical world tended to mean something like "all
> behind the same network switch and all sharing the same power bus". The
> morning after an electrical glitch fries a power bus and thus everything in
> that rack, you realize you wished you didn't have so many of the same type
> of server together. Well, they were servers. Now they are door stops.
> Badness and sadness.
>
> That's kind of the mindset to have in mind with racks in Cassandra. It's
> an artifact for you to separate servers into pools so that the disparate
> pools have hopefully somewhat independent infrastructure risks. However,
> all those servers are still doing the same kind of work, are the same
> version, etc.
>
> Datacenters are amalgams of those racks, and how similar or different they
> are from each other depends on what you want to do with them. What is true
> is that if you have N datacenters, each one of them must have enough disk
> storage to house all the data. The actual physical footprint of that data
> in each DC depends on the replication factors in play.
>
> Note that you sorta can't have "one datacenter for writes" because the
> writes will replicate across the data centers. You could definitely choose
> to have only one that takes read queries, but best to think of writing as
> being universal. One scenario you can have is where the DC not taking live
> traffic read queries is the one you use for maintenance or performance
> testing or version upgrades.
>
> One rack makes your life easier if you don't have a reason for multiple
> racks. It depends on the environment you deploy into and your fault
> tolerance goals.
> If you were in AWS and wanting to spread risk across
> availability zones, then you would likely have as many racks as AZs you
> choose to be in, because that's really the point of using multiple AZs.
>
> R
>
> On 10/23/19, 4:06 AM, "Sergio Bilello" <lapostadiser...@gmail.com> wrote:
>
> Hello guys!
>
> I was reading about
> https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
>
> I would like to understand a concept related to node load balancing.
>
> I know that Jon recommends vnodes = 4, but right now I found a cluster
> with vnodes = 256, replication factor = 3 and 2 racks. This is unbalanced
> because the number of racks is not a multiple of the replication factor.
>
> However, my plan is to move all the nodes into a single rack so that I can
> eventually scale the cluster up and down one node at a time.
>
> If I had 3 racks and wanted to keep things balanced, I would have to
> scale up 3 nodes at a time, one for each rack.
>
> If I had 3 racks, should I also have 3 different datacenters, so
> one datacenter for each rack?
>
> Can I have 2 datacenters and 3 racks? If this is possible, would one
> datacenter have more nodes than the other? Could it be a problem?
>
> I am thinking of splitting my cluster into one datacenter for reads and one
> for writes, and keeping all the nodes in the same rack so I can scale up
> one node at a time.
>
> Please correct me if I am wrong.
>
> Thanks,
>
> Sergio
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
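
The imbalance discussed throughout this thread (RF = 3 with only 2 racks of unequal size) can be explored with a toy simulation. This is a simplified sketch of rack-aware replica placement, not Cassandra's actual NetworkTopologyStrategy code: the token space, the random token assignment, the node names and all function names are assumptions made for illustration. Only the rack layout (3 nodes in us-east-1a, 5 in us-east-1b, 256 vnodes each) is taken from the nodetool status output quoted earlier.

```python
import random
from collections import defaultdict

RING_SIZE = 2**32  # toy token space, far smaller than Cassandra's real one


def build_ring(nodes, vnodes, seed=42):
    """Give every node `vnodes` random tokens; return sorted (token, node) pairs."""
    rng = random.Random(seed)
    ring = {}
    for node in nodes:
        assigned = 0
        while assigned < vnodes:
            token = rng.randrange(RING_SIZE)
            if token not in ring:  # skip the rare token collision
                ring[token] = node
                assigned += 1
    return sorted(ring.items())


def replicas_for(ring, start, rack_of, rf):
    """Walk clockwise from ring[start], preferring racks not used yet.

    Nodes from an already-used rack are held back until every rack is
    covered, then used to fill the remaining replica slots.
    """
    all_racks = set(rack_of.values())
    chosen, seen_racks, skipped = [], set(), []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node in chosen or node in skipped:
            continue  # another vnode of a node we already looked at
        if rack_of[node] in seen_racks and seen_racks != all_racks:
            skipped.append(node)
            continue
        chosen.append(node)
        seen_racks.add(rack_of[node])
        if seen_racks == all_racks:
            while skipped and len(chosen) < rf:
                chosen.append(skipped.pop(0))
        if len(chosen) >= rf:
            break
    return chosen[:rf]


def ownership(ring, rack_of, rf):
    """Fraction of the full data set stored on each node (values sum to rf)."""
    owned = defaultdict(int)
    for i, (token, _) in enumerate(ring):
        width = (token - ring[i - 1][0]) % RING_SIZE  # wraps around at i == 0
        for node in replicas_for(ring, i, rack_of, rf):
            owned[node] += width
    return {node: width / RING_SIZE for node, width in owned.items()}


# Hypothetical node names; rack sizes as in the nodetool output above.
rack_of = {f"a{i}": "us-east-1a" for i in range(3)}
rack_of.update({f"b{i}": "us-east-1b" for i in range(5)})

ring = build_ring(sorted(rack_of), vnodes=256)
owns = ownership(ring, rack_of, rf=3)

for node in sorted(owns):
    print(f"{node}  {rack_of[node]}  owns {owns[node]:.1%}")
for rack in sorted(set(rack_of.values())):
    members = [n for n in owns if rack_of[n] == rack]
    avg = sum(owns[n] for n in members) / len(members)
    print(f"{rack}: {len(members)} nodes, average ownership {avg:.1%}")
```

Under this placement rule every token range gets at least one replica in each rack, so the three us-east-1a nodes together hold at least one full copy (at least a third of the data set each, on average), while the five us-east-1b nodes share their copies five ways. Changing rack_of to a single rack, or to three equally sized racks, shows how the per-node figures even out.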