Thanks Anthony!  I’ve made a note to include that information in the 
documentation. You’re right.  It won’t work as intended unless that is 
configured properly.

 

I’m also favoring a couple other guidelines for Slender Cassandra:

1.       SSDs only, no spinning disks

2.       At least two cores per node

 

For AWS, I’m favoring the c3.large on Linux.  It’s available in these regions: 
US-East, US-West and US-West2.  The specifications are listed as:

·         Two (2) vCPUs

·         3.75 GiB Memory

·         Two (2) 16 GB SSDs

·         Moderate I/O

 

It’s going to be hard to beat the low cost of operating a Slender 
Cluster on demand in the cloud, and it fits a lot of the use cases well:

 

·         For under $100 a month, at current pricing for EC2 instances, you 
can operate an eighteen (18) node Slender Cluster for five (5) hours a day, ten 
(10) days a month.  That’s fine for demonstrations, teaching or experiments 
that last half a day or less (rough arithmetic after this list).

·         For under $20, you can have that Slender Cluster up all day long, up 
to ten (10) hours, for whatever demonstrations or experiments you want to run.
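
The arithmetic behind those figures, assuming roughly $0.10 to $0.11 per 
instance-hour for c3.large Linux on-demand (check the EC2 pricing page for the 
exact current rate in your region):

    18 nodes x 5 hours/day x 10 days/month = 900 instance-hours, roughly $90 to $99 a month
    18 nodes x 10 hours = 180 instance-hours, roughly $18 to $20 for the day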

 

As always, feedback is encouraged.

 

Thanks,

 

Kenneth Brotman

 

From: Anthony Grasso [mailto:anthony.gra...@gmail.com] 
Sent: Sunday, January 21, 2018 3:57 PM
To: user
Subject: Re: Slender Cassandra Cluster Project

 

Hi Kenneth,

 

Fantastic idea!

 

One thing that came to mind from my reading of the proposed setup was rack 
awareness of each node. Given that the proposed setup contains three DCs, I 
assume that each node will be made rack aware? If not, consider defining three 
racks for each DC and placing two nodes in each rack. This will ensure that any 
single rack contains at most one replica of a given piece of data.
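
For the docs, with endpoint_snitch set to GossipingPropertyFileSnitch in 
cassandra.yaml, the rack layout is just a per-node entry in 
cassandra-rackdc.properties. A rough sketch (the DC and rack names are only 
placeholders):

    # cassandra-rackdc.properties on the first two nodes of DC1
    dc=DC1
    rack=rack1

    # nodes 3-4 in DC1 would set rack=rack2, nodes 5-6 rack=rack3,
    # and DC2/DC3 follow the same pattern

Changing the snitch or moving nodes between racks on a populated cluster takes 
care, so it is worth settling the layout up front.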

 

Regards,

Anthony

 

On 17 January 2018 at 11:24, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Sure.  That takes the project from awesome to 10X awesome.  I absolutely would 
be willing to do that.  Thanks Kurt!

 

Regarding your comment on the keyspaces, I agree.  There should be a few simple 
examples one way or the other that can be duplicated and observed, and then an 
example to duplicate and play with that has a nice real-world mix, with some 
keyspaces that replicate over only a subset of DCs and some that replicate to 
all DCs.
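
For the docs, that mix could be sketched with NetworkTopologyStrategy along 
these lines (the keyspace and data center names are just placeholders):

    -- replicated to all three data centers
    CREATE KEYSPACE demo_all_dcs
      WITH replication = {'class': 'NetworkTopologyStrategy',
                          'DC1': 3, 'DC2': 3, 'DC3': 3};

    -- replicated to only two of the three data centers
    CREATE KEYSPACE demo_two_dcs
      WITH replication = {'class': 'NetworkTopologyStrategy',
                          'DC1': 3, 'DC2': 3};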

 

Kenneth Brotman 

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Tuesday, January 16, 2018 1:31 PM
To: User
Subject: Re: Slender Cassandra Cluster Project

 

Sounds like a great idea. Probably would be valuable to add to the official 
docs as an example setup if you're willing.

 

Only thing I'd add is that you should have keyspaces that replicate over only a 
subset of DCs, plus one/some replicated to all DCs.

 

On 17 Jan. 2018 03:26, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:

I’ve begun working on a reference project intended to provide guidance on 
configuring and operating a modest Cassandra cluster of about 18 nodes, 
suitable for economical study, demonstration, experimentation and testing.

 

The slender cluster would be designed to be as inexpensive as possible while 
still using real-world hardware, in order to lower the cost to those with 
limited initial resources. Sorry, no Raspberry Pis for this project.

 

There would be an on-premises version and a cloud version.  Guidance would be 
provided on configuring the cluster, on demonstrating key Cassandra behaviors, 
on file sizes and capacity to use with the Slender Cassandra Cluster, and so on.

 

Why about eighteen nodes? I tried to figure out the minimum number of nodes 
needed for Cassandra to be Cassandra.  Here were my considerations:

 

•             A user wouldn’t run Cassandra in just one data center, so at 
least two data centers.

•             A user probably would want a third data center available for 
analytics.

•             There need to be enough nodes to provide enough parallelism to 
observe Cassandra’s distributed nature.

•             The cluster should have enough nodes that one gets a sense of the 
need for cluster-wide management tools to do things like repairs, snapshots and 
cluster monitoring.

•             The cluster should be able to demonstrate RF=3 with LOCAL_QUORUM 
(see the cqlsh sketch after this list).  If replicated in all three data 
centers, one write would impact half the 18 nodes: 3 data centers x 3 nodes per 
data center = 9 of the 18 nodes.  If replicated in two of the data centers, one 
write would still impact one third of the 18 nodes: 2 DCs x 3 nodes per DC = 6 
of the 18 nodes.
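
A minimal cqlsh sketch of that demonstration, assuming a keyspace named 
slender_demo replicated at RF=3 in each of the three data centers (the 
keyspace, table and column names are placeholders):

    -- only the local DC's replicas need to acknowledge: 2 of 3
    CONSISTENCY LOCAL_QUORUM;

    CREATE TABLE IF NOT EXISTS slender_demo.events (
      id uuid PRIMARY KEY,
      payload text
    );

    INSERT INTO slender_demo.events (id, payload) VALUES (uuid(), 'demo write');
    -- the write is acknowledged by 2 of the 3 local replicas but is still
    -- sent to all 9 replicas across the three data centers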

 

So eighteen seems like the minimum number of nodes needed.  That’s six nodes in 
each of three data centers.

 

Before I get too carried away with this project, I’m looking for some feedback 
on whether this project would indeed be helpful to others. Also, should the 
project be changed in any way?

 

It’s always a pleasure to connect with the Cassandra users’ community.  Thanks 
for all the hard work, the expertise, the civil dialog.

 

Kenneth Brotman

 
