Hi,

I'd like to learn how to set up a Brisk cluster with HA/DR in Amazon. Last
time I tried this a few months ago, it was tricky because we had to either
set up a VPN or hack the Cassandra source to get internode communications to
work across regions. But with v 0.8's new BriskSnitch or EC2Snitch, I'm
hoping it'll be a smooth process.

Here is the setup I have in mind:

Amazon East (DC 1)
- 4 vanilla Cassandra nodes in Availability Zone A (1 replica)
- 4 vanilla Cassandra nodes in Availability Zone B (1 replica)

Amazon West (DC 2)
- 4 Brisk nodes in Availability Zone A (1 replica)

The West Brisk nodes will be reserved for MapReduce analytics. All
reads/writes will go to the 8 Amazon East nodes. There will be 3 replicas in
total. We're planning on sending reads and writes to the East nodes with a
consistency level of QUORUM. So, as long as the cluster is healthy, a write
should be committed to 2 nodes in East and then a success response will be
sent back to the client. In the background, a 3rd write will be sent with
higher latency to the West Brisk cluster.
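
The arithmetic behind that expectation, as a rough sketch. It assumes QUORUM is computed as floor(RF / 2) + 1 over the total replication factor of the keyspace; the helper name quorum and the variable names are made up for illustration:

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge a QUORUM operation."""
    return replication_factor // 2 + 1

# Replica counts per data center, matching the layout described above.
replicas_per_dc = {"DC1": 2, "DC2": 1}

total_rf = sum(replicas_per_dc.values())
needed = quorum(total_rf)

print(f"total replicas: {total_rf}, acks needed for QUORUM: {needed}")
```

This only counts acknowledgements; it says nothing about which data center the acknowledging replicas are in.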

We'll use the NetworkTopologyStrategy when we create the keyspace:

create keyspace MyCoolKeyspace
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options=[{DC1:2, DC2:1}];



Questions:

1) Must I use the Brisk binaries across all 12 nodes? Or can I use the
Cassandra 0.8 binary for the 8 nodes in Amazon East and Brisk1.0 in West?

2) Is this the correct token distribution if I plan on using
RandomPartitioner? If I remember correctly, when configuring Cassandra across
data centers, each DC gets its own ring.

Amazon East
node 0: 0 (also Cassandra seed)
node 1: 21267647932558653966460912964485513216
node 2: 42535295865117307932921825928971026432
node 3: 63802943797675961899382738893456539648
node 4: 85070591730234615865843651857942052864
node 5: 106338239662793269832304564822427566080
node 6: 127605887595351923798765477786913079296
node 7: 148873535527910577765226390751398592512

Amazon West (I added one to each of these so that there are no duplicate
tokens)
node 0: 1 (also Brisk seed)
node 1: 42535295865117307932921825928971026433
node 2: 85070591730234615865843651857942052865
node 3: 127605887595351923798765477786913079297
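
For reference, this is roughly how the tokens above can be derived. It is a sketch that assumes RandomPartitioner's token space runs from 0 to 2**127 and that each data center is balanced as its own ring; the function name ring_tokens and the offset parameter are my own, not part of any Cassandra or Brisk tool:

```python
TOKEN_SPACE = 2 ** 127  # assumed size of the RandomPartitioner token range

def ring_tokens(node_count: int, offset: int = 0) -> list[int]:
    """Evenly spaced tokens for one data center's ring.

    offset shifts every token so that two rings never share a token.
    """
    return [i * TOKEN_SPACE // node_count + offset for i in range(node_count)]

east = ring_tokens(8)            # Amazon East: 8 Cassandra nodes
west = ring_tokens(4, offset=1)  # Amazon West: 4 Brisk nodes, shifted by 1

for name, tokens in (("Amazon East", east), ("Amazon West", west)):
    print(name)
    for i, token in enumerate(tokens):
        print(f"node {i}: {token}")
```

The offset of 1 for West is only there to avoid duplicate tokens between the two rings.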

3) Should I be using the BriskSnitch or EC2Snitch? Can BriskSnitch traverse
across Amazon regions?

4) Is it correct that the Seeds will have to be defined in the YAML file
using Elastic IPs? For the listen address, can I use the Private IP on all
12 nodes?
