Hi,

I have been tasked with picking and setting up a database with the following 
characteristics:

• Ultra-high availability: uptime is the real requirement. Our whole platform
becomes inaccessible if it cannot read from the database, because every user
authentication needs a read. The database nodes will never be spread across
multiple networks.
• Reasonably quick access speeds
• Very low data storage: for 10 million users, we would have around 8GB of
data in total.

Having done a bit of research on Cassandra, I think the best approach for my
use case would be to replicate the data to every node, but to require only
consistency level ONE for reads. That way, if a node goes down, we can still
read from and write to the other nodes. It is not very important that the
replicas agree on every read: as long as Cassandra becomes consistent within
around 1s, there shouldn’t be an issue.
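
To put a rough number on that idea, here is a back-of-the-envelope sketch in
Python. It is not Cassandra code. It assumes that every node holds a full copy,
that nodes fail independently, and that a read at consistency level ONE succeeds
whenever at least one replica is up. The function name and the 99% per-node
uptime figure are made up for illustration.

```python
def read_availability(node_uptime: float, replicas: int) -> float:
    """Chance that at least one replica is up, assuming independent failures."""
    if not 0.0 <= node_uptime <= 1.0:
        raise ValueError("node_uptime must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # A read at ONE only fails if every replica is down at the same time.
    return 1.0 - (1.0 - node_uptime) ** replicas


if __name__ == "__main__":
    # Hypothetical per-node uptime of 99%; real figures will differ.
    for n in (1, 2, 3, 5):
        print(n, "replicas ->", read_availability(0.99, n))
```

Real failures are often correlated (shared network, power, or deploys), so this
is an optimistic upper bound rather than a prediction.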

When I go to set up the database, though, I am required to set the replication
factor to a number (1, 2, 3, etc.), so I can’t just say “ALL” and have it
replicate to all nodes. Right now, I have a 2-node cluster with replication
factor 3. Will having an RF greater than the number of nodes cause any issues?
Or is there a way to just have it copy to all nodes? Also, is there any way
that I can tune Cassandra to be more read-optimized?
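
My understanding, which may well be wrong, is that the number of replicas a
request waits for is derived from the replication factor rather than from the
number of nodes actually present: ONE needs 1, QUORUM needs floor(RF / 2) + 1,
and ALL needs RF. The Python sketch below just works through that arithmetic
for my 2-node, RF 3 setup. It is not Cassandra code, the function names are my
own, and it assumes every live node holds a replica (which should be the case
when RF is at least the cluster size).

```python
def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must respond for a given consistency level."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def can_satisfy(consistency_level: str, replication_factor: int,
                live_replicas: int) -> bool:
    """True if enough replicas are alive to meet the consistency level."""
    return live_replicas >= replicas_required(consistency_level,
                                              replication_factor)


if __name__ == "__main__":
    rf = 3  # replication factor from my current keyspace
    for live in (2, 1):  # both nodes up, then one node down
        for cl in ("ONE", "QUORUM", "ALL"):
            print(f"live={live} CL={cl}: {can_satisfy(cl, rf, live)}")
```

If that arithmetic is right, ONE keeps working with a single node up, QUORUM
needs both of my nodes, and ALL could never be met with only 2 nodes and RF 3.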

Finally, I have some misgivings about how well Cassandra fits my use case. If
anyone has an opinion on whether or not it is a good fit, I would really
appreciate your input! If this could be done with a simple SQL database and
Cassandra is overkill, please let me know.

Thanks for your input!
