Thanks Reid!

I agree with everything you said!

Best,
Sergio

On Thu, Oct 24, 2019 at 09:25 Reid Pinchback <rpinchb...@tripadvisor.com>
wrote:

> Two different AWS AZs are in two different physical locations: physically
> separate data centers within the same AWS region.  Which means that you’re
> trying to manage the risk of an AZ going dark, so you use more than one AZ
> just in case.  The downside is that you will have some degree of network
> performance difference between AZs because of whatever WAN pipe AWS
> owns/leases to connect them.
>
>
>
> Having a DC in one AZ is easy to reason about.  The AZ is there, or it is
> not.  If you have two DCs in your cluster, and you lose an AZ, it means you
> still have a functioning cluster with one DC and you still have quorum.
> Yay, even in an outage, you know you can still do business.  You would only
> have to route any traffic normally sent to the other DC to the remaining
> one, so as long as there is resource headroom planning in how you provision
> your hardware, you’re in a safe state.
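>
> As a minimal sketch of that setup (hypothetical keyspace and DC names, one
> Cassandra DC per AZ), the keyspace replication could look like:
>
>   -- RF=3 in each DC, so either DC alone can satisfy LOCAL_QUORUM
>   -- if the other AZ goes dark
>   CREATE KEYSPACE example_ks WITH replication = {
>     'class': 'NetworkTopologyStrategy',
>     'dc_az_a': 3,
>     'dc_az_b': 3
>   };
>
> Clients then read and write at LOCAL_QUORUM and, during an AZ outage,
> switch their contact points to the surviving DC.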
>
>
>
> If you start splitting a DC across AZs without using racks to organize
> nodes on a per-AZ basis, off the top of my head I don’t know how you reason
> about your risks for losing quorum without pausing to really think through
> vnodes and token distribution and whatnot.  I’m not a fan of topologies I
> can’t reason about when paged at 3 in the morning and I’m half asleep.  I
> prefer simple until the workload motivates complex.
>
>
>
> R
>
>
>
>
>
> *From: *Sergio <lapostadiser...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Thursday, October 24, 2019 at 12:06 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
>
>
>
> Thanks Reid and Jon!
>
>
>
> Yes, I will stick with one rack per DC for sure, and I will look at the
> vnodes problem later on.
>
>
>
>
>
> What's the difference in terms of reliability between
>
> A) spreading 2 datacenters across 3 AZs, and
>
> B) having 2 datacenters in 2 separate AZs?
>
>
>
>
>
> Best,
>
>
>
> Sergio
>
>
>
> On Thu, Oct 24, 2019, 7:36 AM Reid Pinchback <rpinchb...@tripadvisor.com>
> wrote:
>
> Hey Sergio,
>
>
>
> Forgive me, but I’m at work and had to skim the info quickly.
>
>
>
> When in doubt, simplify.  So 1 rack per DC.  Distributed systems get
> rapidly harder to reason about the more complicated you make them.  There’s
> more than enough to learn about C* without jumping into the complexity too
> soon.
>
>
>
> To deal with the unbalancing issue, pay attention to Jon Haddad’s advice
> on vnode count and how to fairly distribute tokens with a small vnode
> count.  I’d rather point you to his information, as I haven’t dug into
> vnode counts and token distribution in detail; he’s got a lot more time in
> C* than I do.  I come at this more as a traditional RDBMS and Java guy who
> has slowly gotten up to speed on C* over the last few years, and I’ve dealt
> with DynamoDB a lot, so I’ve lived with a lot of similar data modelling
> concerns.  Detailed internals I only know in cases where I had reason to
> dig into C* source.
>
>
>
> There are so many knobs to turn in C* that it can be very easy to
> overthink things.  Simplify where you can.  Remove GC pressure wherever you
> can.  Negotiate with your consumers to have data models that make sense for
> C*.  If you have those three criteria foremost in mind, you’ll likely be
> fine for quite some time.  And in the times where something isn’t going
> well, simpler is easier to investigate.
>
>
> R
>
>
>
> *From: *Sergio <lapostadiser...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Wednesday, October 23, 2019 at 3:34 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
>
>
>
> Hi Reid,
>
> Thank you very much for clearing up these concepts for me.
> https://community.datastax.com/comments/1133/view.html
> I posted this question on the DataStax forum about our cluster being
> unbalanced, and the reply was that the *number of racks should be a
> multiple of the replication factor* (or 1) in order to be balanced.
> I thought then that if I have 3 availability zones I should have 3 racks
> for each datacenter, and not the 2 (us-east-1b, us-east-1a) I have right
> now; or, the easiest way, I should have a single rack per datacenter.
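>
> If I follow the reasoning: with RF = 3 and only 2 racks,
> NetworkTopologyStrategy still has to place 3 replicas per token range, so
> one of the two racks always holds 2 of the 3 replicas; with 3 racks (or a
> single rack) every rack carries an equal share, which I assume is why the
> multiple-of-RF rule keeps the cluster balanced.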
>
>
> Datacenter: live
> ================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns  Host ID                               Rack
> UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
> UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
> UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
> UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
> UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
> DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
> UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
> UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
> So each rack label right now matches the availability zone, and we have 3
> datacenters and 2 availability zones with 2 racks per DC, but the above is
> clearly unbalanced.
> If I have a keyspace with replication factor = 3, and I want to minimize
> the number of nodes needed to scale the cluster up and down while keeping
> it balanced, should I consider an approach like OPTION A below?
>
> OPTION A)
>
>   Node  DC     RACK  AZ
>   1     read   ONE   us-east-1a
>   2     read   ONE   us-east-1a
>   3     read   ONE   us-east-1a
>   4     write  ONE   us-east-1b
>   5     write  ONE   us-east-1b
>   6     write  ONE   us-east-1b
>
> OPTION B)
>
>   Node  DC     RACK  AZ
>   1     read   ONE   us-east-1a
>   2     read   ONE   us-east-1a
>   3     read   ONE   us-east-1a
>   4     write  TWO   us-east-1b
>   5     write  TWO   us-east-1b
>   6     write  TWO   us-east-1b
>   7     read   ONE   us-east-1c
>   8     write  TWO   us-east-1c
>   9     read   ONE   us-east-1c
>
> Option B looks unbalanced, and I would exclude it.
>
> OPTION C)
>
>   Node  DC     RACK  AZ
>   1     read   ONE   us-east-1a
>   2     read   ONE   us-east-1b
>   3     read   ONE   us-east-1c
>   4     write  TWO   us-east-1a
>   5     write  TWO   us-east-1b
>   6     write  TWO   us-east-1c
>
> So I am thinking of option A if I have the restriction of 2 AZs, but I
> guess that option C would be the best. If I have to add another DC for
> reads, because we want to assign a new DC to each new microservice, it
> would look like:
>
> OPTION EXTRA DC For Reads
>
>   Node  DC          RACK   AZ
>   1     read        ONE    us-east-1a
>   2     read        ONE    us-east-1b
>   3     read        ONE    us-east-1c
>   4     write       TWO    us-east-1a
>   5     write       TWO    us-east-1b
>   6     write       TWO    us-east-1c
>   7     extra-read  THREE  us-east-1a
>   8     extra-read  THREE  us-east-1b
>   9     extra-read  THREE  us-east-1c
>
> Writes to the *write* DC will be replicated to the other datacenters. My
> goal is to keep the *read* machines dedicated to serving reads and the
> *write* machines to serving writes; Cassandra will handle the replication
> for me. Is there any other option I am missing, or any wrong assumption?
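>
> As a sketch of what I have in mind (assuming a hypothetical keyspace named
> example_ks, with the DC names from the table above), the replication
> settings for the extra-DC option would be something like:
>
>   -- every DC holds a full replica set; clients pin themselves to one
>   -- DC by connecting only to its nodes at LOCAL_* consistency levels
>   ALTER KEYSPACE example_ks WITH replication = {
>     'class': 'NetworkTopologyStrategy',
>     'read': 3,
>     'write': 3,
>     'extra-read': 3
>   };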
> I am thinking of writing a blog post about all my learnings so far. Thank
> you very much for the replies!
>
> Best,
> Sergio
>
>
>
> On Wed, Oct 23, 2019 at 10:57 Reid Pinchback <rpinchb...@tripadvisor.com>
> wrote:
>
> No, that’s not correct.  The point of racks is to help you distribute the
> replicas, not further-replicate the replicas.  Data centers are what do the
> latter.  So for example, if you wanted to be able to ensure that you always
> had quorum if an AZ went down, then you could have two DCs where one was in
> each AZ, and use one rack in each DC.  In your situation I think I’d be
> more tempted to consider that.  Then if an AZ went away, you could fail
> over your traffic to the remaining DC and still be perfectly fine.
>
>
>
> For background on replicas vs racks, I believe the information you want is
> under the heading ‘NetworkTopologyStrategy’ at:
>
> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>
>
>
> That should help you better understand how replicas distribute.
>
>
>
> As mentioned before, while you can choose to do the reads in one DC,
> except for concerns about contention related to network traffic and
> connection handling, you can’t isolate reads from writes.  You can _
> *mostly*_ insulate the write DC from the activity within the read DC, and
> even that isn’t an absolute because of repairs.  However, your mileage may
> vary, so do what makes sense for your usage pattern.
>
>
>
> R
>
>
>
> *From: *Sergio <lapostadiser...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Wednesday, October 23, 2019 at 12:50 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
>
>
>
> Hi Reid,
>
> Thanks for your reply. I really appreciate your explanation.
>
> We are in AWS, and right now we are using 2 availability zones, not 3. We
> found our cluster really unbalanced because the keyspace has a replication
> factor = 3 and the number of racks is 2, with 2 datacenters.
> We want the writes spread across all the nodes, but we want the reads
> isolated from the writes, to keep the load on the read nodes low and to be
> able to tell apart problems in the consumer (read) and producer (write)
> applications.
> It looks like each rack contains an entire copy of the data, so this would
> replicate the information once per rack and then across the nodes. If I am
> correct, with a 100GB keyspace, replication factor = 3, and RACKS = 3, that
> gives 100 * 3 * 3 = 900GB.
> If I had only one rack across 2 or even 3 availability zones, I would save
> space and have only 300GB. Please correct me if I am wrong.
>
> Best,
>
> Sergio
>
>
>
> On Wed, Oct 23, 2019 at 09:21 Reid Pinchback <rpinchb...@tripadvisor.com>
> wrote:
>
> Datacenters and racks are different concepts.  While they don't have to be
> associated with their historical meanings, the historical meanings probably
> provide a helpful model for understanding what you want from them.
>
> When companies own their own physical servers and have them housed
> somewhere, questions arise about where to locate any particular server.
> It's a balancing act between things like the network speed of related
> servers talking to each other, and the fault tolerance of not having too
> many servers exposed to the same risks.
>
> "Same rack" in that physical world tended to mean something like "all
> behind the same network switch and all sharing the same power bus".  The
> morning after an electrical glitch fries a power bus and thus everything in
> that rack, you realize you wished you didn't have so many of the same type
> of server together.  Well, they were servers.  Now they are door stops.
> Badness and sadness.
>
> That's kind of the mindset to have in mind with racks in Cassandra.  It's
> an artifact for you to separate servers into pools so that the disparate
> pools have hopefully somewhat independent infrastructure risks.  However,
> all those servers are still doing the same kind of work, are the same
> version, etc.
>
> Datacenters are amalgams of those racks, and how similar or different they
> are from each other depends on what you want to do with them.  What is true
> is that if you have N datacenters, each one of them must have enough disk
> storage to house all the data.  The actual physical footprint of that data
> in each DC depends on the replication factors in play.
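>
> As a rough worked example (my numbers, purely illustrative): 100 GB of
> unique data at RF=3 occupies about 100 x 3 = 300 GB in a DC, and that holds
> whether the DC has 1 rack or 3.  Racks only decide where those 3 replicas
> land; they don't add extra copies.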
>
> Note that you sorta can't have "one datacenter for writes" because the
> writes will replicate across the data centers.  You could definitely choose
> to have only one that takes read queries, but best to think of writing as
> being universal.  One scenario you can have is where the DC not taking live
> traffic read queries is the one you use for maintenance or performance
> testing or version upgrades.
>
> One rack makes your life easier if you don't have a reason for multiple
> racks. It depends on the environment you deploy into and your fault
> tolerance goals.  If you were in AWS and wanting to spread risk across
> availability zones, then you would likely have as many racks as AZs you
> choose to be in, because that's really the point of using multiple AZs.
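>
> For instance (assuming the GossipingPropertyFileSnitch; with the Ec2Snitch
> the AZ is picked up as the rack automatically), each node's
> cassandra-rackdc.properties would map its rack to its AZ:
>
>   # one file per node; dc matches your cluster's DC name,
>   # rack matches the node's availability zone
>   dc=live
>   rack=us-east-1a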
>
> R
>
>
> On 10/23/19, 4:06 AM, "Sergio Bilello" <lapostadiser...@gmail.com> wrote:
>
>
>     Hello guys!
>
>     I was reading about
> https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
>
>     I would like to understand a concept related to the node load
> balancing.
>
>     I know that Jon recommends vnodes = 4, but right now I found a cluster
>     with vnodes = 256, replication factor = 3, and 2 racks. This is
>     unbalanced because the rack count is not a multiple of the replication
>     factor.
>
>     However, my plan is to move all the nodes into a single rack so that I
>     can eventually scale the cluster up and down one node at a time.
>
>     If I had 3 racks and wanted to keep things balanced, I should scale up
>     3 nodes at a time, one for each rack.
>
>     If I had 3 racks, should I also have 3 different datacenters, one
>     datacenter for each rack?
>
>     Can I have 2 datacenters and 3 racks? If this is possible, would one
>     datacenter have more nodes than the other? Could that be a problem?
>
>     I am thinking of splitting my cluster into one datacenter for reads and
>     one for writes, and keeping all the nodes in the same rack so I can
>     scale up one node at a time.
>
>
>
>     Please correct me if I am wrong
>
>
>
>     Thanks,
>
>
>
>     Sergio
>
>
>
>
>
>
