On Thu, Mar 10, 2011 at 6:06 AM, Jedd Rashbrooke <j...@visualdna.com> wrote:
> My question is whether it's considered safer to upgrade via 0.6.12
> to 0.7, or if a direct 0.6.6 -> 0.7 upgrade is safe enough?
You don't need the latest 0.6 before upgrading.

> Copying a cluster between AWS DC's:
> We have ~ 150-250GB per node, with a Replication Factor of 4.
> I ack that 0.6 -> 0.7 is necessarily STW, so in an attempt to
> minimise that outage period I was wondering if it's possible to
> drain & stop the cluster, then copy over only the 1st, 5th, 9th,
> and 13th nodes' worth of data (which should be a full copy of
> all our actual data - we are nicely partitioned, despite the
> disparity in GB per node) and have Cassandra re-populate the
> new destination 16 nodes from those four data sets. If this is
> feasible, is it likely to be more expensive (in terms of time the
> new cluster is unresponsive as it rebuilds) than just copying
> across all 16 sets of data - about 2.7TB.

I'm confused. You're trying to upgrade and add a DC at the same time?

> Chattiness / gossip traffic requirements on DC-aware:
> I haven't pondered deeply on a 7 design yet, so this question is
> even more nebulous. We're seeing growth (raw) of about 100GB
> per month on our 16 node RF4 cluster - say about 25GB of 'actual'
> data growth. We don't delete (much) data. Amazon's calculator
> suggests even 100GB in/out of a data center is modestly priced,
> but I'm cautious in case the replication traffic is particularly chatty
> or excessive. And how expensive (in terms of traffic) a compaction
> or repair would be across data centers.

Compactions are node-local.

Normal writes are optimized for the WAN (only one copy will be sent
between DCs; the recipient in the other DC will then forward it to
the other replicas there).

Repair is not yet WAN-optimized, but it is still cheap if your replicas
are close to consistent, since only merkle trees + inconsistent ranges
are sent over the network.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
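
For a rough sense of what the WAN write optimization described above means for the numbers in this thread, here is a back-of-envelope sketch. It uses only the figures quoted (about 100GB/month raw growth at RF 4). The remote DC's replication factor (REMOTE_RF) is a hypothetical value, since the thread does not give one, and the estimate ignores message overhead, reads, gossip and repair traffic.

```python
# Back-of-envelope estimate of monthly cross-DC *write* traffic.
# From the thread: ~100 GB/month raw growth on a cluster with RF = 4.
# REMOTE_RF is an assumption for illustration; the thread does not state it.

RAW_GROWTH_GB = 100.0  # raw growth per month, summed over all replicas
LOCAL_RF = 4           # replication factor of the existing cluster
REMOTE_RF = 4          # hypothetical replication factor in the second DC


def unique_growth_gb(raw_gb, rf):
    """Raw growth counts every replica, so divide by RF to get unique data."""
    return raw_gb / rf


def cross_dc_write_gb(unique_gb, remote_rf, wan_optimized=True):
    """Data crossing the WAN per month for writes only.

    With WAN-optimized writes one copy crosses between DCs and is forwarded
    locally; without it, each remote replica would receive its own copy.
    """
    copies_over_wan = 1 if wan_optimized else remote_rf
    return unique_gb * copies_over_wan


unique = unique_growth_gb(RAW_GROWTH_GB, LOCAL_RF)
optimized = cross_dc_write_gb(unique, REMOTE_RF, wan_optimized=True)
naive = cross_dc_write_gb(unique, REMOTE_RF, wan_optimized=False)

print(f"unique data growth:       {unique:.0f} GB/month")
print(f"WAN traffic (one copy):   {optimized:.0f} GB/month")
print(f"WAN traffic (per-replica): {naive:.0f} GB/month")
```

Under these assumptions the write traffic between DCs tracks the 'actual' growth figure rather than the raw one. Repair traffic comes on top of this and depends on how far the replicas have drifted apart.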