We've started to use Cassandra in production and just have one node right
now. Here are the index files for one of our ColumnFamilies:


16G Jan 28 22:28 SomeIndex-5467-Index.db
196M Jan 28 22:32 SomeIndex-5487-Index.db

The first bottleneck you encounter is reads; writes are extremely
fast even with one node.

My question: is the size of the *-Index.db files the amount of RAM
that needs to be available for Cassandra to do reads fast?

What configuration options would you need to tweak besides raising
the JVM's max memory size? Are there any default settings that are
commonly missed?

Next, if you provision more nodes, will Cassandra distribute the data
in memory so I don't need a single 16 GB node? Is there anything I
need to build into my application logic to make this work correctly?
Ideally, if I had a 16 GB index, I'd want it spread across four 4 GB
nodes. Can any client connect to any one node, request info, and get
it back from whichever node has that part of the index in memory?
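
To make the question concrete, here is a rough Python sketch of the
behaviour I'm hoping for. It is only my mental model, not Cassandra's
actual code: the node names, the evenly spaced tokens, the MD5 hashing
and the Ring class are all made up for illustration.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # size of the MD5 hash space (assumed for illustration)


def token(key: str) -> int:
    """Hash a row key to an integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical ring: each node owns the key range ending at its token."""

    def __init__(self, nodes):
        # Give the nodes evenly spaced tokens so each owns an equal share.
        step = RING_SIZE // len(nodes)
        self._entries = [(i * step, name) for i, name in enumerate(nodes)]
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # First node whose token is >= the key's token, wrapping at the end.
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._entries)
        return self._entries[i][1]


# Four hypothetical nodes, each expected to hold about a quarter of the rows.
ring = Ring(["node-a", "node-b", "node-c", "node-d"])
counts = {}
for i in range(16000):
    node = ring.owner(f"row-{i}")
    counts[node] = counts.get(node, 0) + 1

for node in sorted(counts):
    print(node, counts[node])
```

In this picture a client could ask any node for a row, and that node
would work out the owner the same way and forward the request. Whether
Cassandra really behaves like this is exactly what I'm asking.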

What's the best way to do efficient reads?

Suhail
