On Sep 18, 2009, at 9:55 PM, Jonathan Ellis wrote:
On Fri, Sep 18, 2009 at 9:09 PM, Jonathan Mischo <jmis...@quagility.com> wrote:

• Multiple data center replication in the background. maybe a
multi master type thing

It already has this. It was built from the ground up for this. It's
highly tolerant to partitioning and has always available writes. All
replication is done in the background (unless you specifically set a
write to a high consistency level).

You know, it does and it doesn't. RackAwareStrategy isn't a true N+1
scaling solution. Currently, RackAwareStrategy only guarantees that it
will try to replicate data to one other data center and/or one other
rack, depending on the number of replicas specified.

Yes; that's what it's supposed to do, and it's satisfying a very real
use case: "I want my data's primary data center to be DC A, but I want
one replica in DC B in case A is completely unavailable."
Other use cases can use different Strategies. That's why they're
pluggable. It's not one-size-fits-all and it's not supposed to be.
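
To make the "pluggable" point concrete, here is a minimal sketch of what a swappable placement strategy could look like. It is written in Python for brevity rather than in Cassandra's Java, and every name in it (Node, ReplicaStrategy, RingOrderStrategy, place) is hypothetical and not taken from the Cassandra code base. It only illustrates the idea that callers depend on an interface while the placement rule varies.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node record: a name plus its data center and rack."""
    name: str
    dc: str
    rack: str


class ReplicaStrategy(ABC):
    """Hypothetical interface: choose the nodes that hold a key's replicas."""

    @abstractmethod
    def replicas(self, ring, primary_index, n):
        """Return up to n nodes, starting from the primary at primary_index."""


class RingOrderStrategy(ReplicaStrategy):
    """Topology-unaware rule: the primary plus the next n-1 nodes on the ring."""

    def replicas(self, ring, primary_index, n):
        count = min(n, len(ring))
        return [ring[(primary_index + i) % len(ring)] for i in range(count)]


def place(strategy, ring, primary_index, n):
    """Caller code sees only the interface, so strategies can be swapped."""
    return [node.name for node in strategy.replicas(ring, primary_index, n)]


# Assumed example ring: three nodes in DC A, one in DC B.
RING = [
    Node("a1", "A", "r1"),
    Node("a2", "A", "r1"),
    Node("a3", "A", "r2"),
    Node("b1", "B", "r1"),
]
```

A different use case would supply another ReplicaStrategy subclass without touching place().
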
Yeah, you're right, if N+1 is a concern, it should probably be a
separate strategy, unless we can keep the complexity virtually the
same, because of how heavily it's called. RackAwareStrategy is
perfectly fine for what it does: guaranteeing a replica in a different
DC and/or, after that, a replica in a different rack, if you configure
it to store more than 1 replica. Above 3 replicas, though, it can start
to get unbalanced, since the extra replicas are placed by just
iterating through the node list, which adds no real value for
distribution. We could probably just document that for
RackAwareStrategy.
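
The imbalance above 3 replicas is easier to see with a toy model. The sketch below is an assumption-laden illustration of the behavior as described in this message, not Cassandra's actual RackAwareStrategy code: it takes the primary, then the first node in another DC, then the first node in the primary's DC on a different rack, and fills any remaining replicas in plain ring order. Node, rack_aware_replicas, and the example ring are all made up for the illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node record: a name plus its data center and rack."""
    name: str
    dc: str
    rack: str


def rack_aware_replicas(ring, primary_index, n):
    """Toy model of the placement rule described in this thread."""
    size = len(ring)
    primary = ring[primary_index]
    # Every other node, in ring order starting just after the primary.
    others = [ring[(primary_index + i) % size] for i in range(1, size)]
    chosen = [primary]

    def take_first(predicate):
        # Add the first not-yet-chosen node that satisfies the predicate.
        for node in others:
            if node not in chosen and predicate(node):
                chosen.append(node)
                return

    if n >= 2:
        # Second replica: first node in a different data center.
        take_first(lambda x: x.dc != primary.dc)
    if n >= 3:
        # Third replica: same DC as the primary, different rack.
        take_first(lambda x: x.dc == primary.dc and x.rack != primary.rack)

    # Remaining replicas: plain ring order, with no topology awareness.
    for node in others:
        if len(chosen) >= n:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


# Assumed example ring: four nodes in DC A, two in DC B.
RING = [
    Node("a1", "A", "r1"),
    Node("a2", "A", "r1"),
    Node("a3", "A", "r2"),
    Node("a4", "A", "r2"),
    Node("b1", "B", "r1"),
    Node("b2", "B", "r1"),
]

five = rack_aware_replicas(RING, 0, 5)
print([node.name for node in five])       # ['a1', 'b1', 'a3', 'a2', 'a4']
print(Counter(node.dc for node in five))  # 4 replicas in DC A, 1 in DC B
```

With 3 replicas this toy ring gets one copy in DC B and two racks covered in DC A; with 5, both extra copies land in DC A simply because those nodes come next on the ring.
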
I know we're trying to solve for the biggest wins for effort, but, as
the Cassandra user base grows (and it will, because it fills a niche
that no other KVS or RDBMS quite fills), I think N+1 capability is
going to be something that will need to be solved for fairly soon for
widespread adoption.
-Jon