On Fri, Jan 14, 2011 at 4:29 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
> Here's some slides I did last year that have a simple explanation of RF 
> http://www.slideshare.net/mobile/aaronmorton/well-railedcassandra24112010-5901169
>
> Short version is, generally no single node contains all the data in the db.
> Normally the RF is going to be less than the number of nodes, and the higher
> the RF, the more concurrent node failures you can handle (when writing at
> Quorum).
>
> - at RF 3 you can keep reading and writing with 1 node down. If you lose a
> second node the cluster will appear to be down for a portion of the keys. The
> portion depends on the total number of nodes.
> - at RF 5 the cluster will be up for all keys if you have 2 nodes down. If
> you have 3 down the cluster will appear down for only a portion of the keys;
> again, the portion depends on the total number of nodes.
>
> It's a bit more complicated though: when I say 'node is down' I mean one of
> the nodes that the key would have been written to is down (the 3 or 5 above).
> So if you had 10 nodes, RF 5, you could have 4 nodes down and the cluster
> would still be available for all keys, so long as there are still 3 live
> "natural endpoints" for each key.
>
> Hope that helps.
>
> Aaron
>
> On 15/01/2011, at 8:52 AM, Mark Moseley <moseleym...@gmail.com> wrote:
>
>>> Perhaps the better question would be, if I have a two node cluster and
>>> I want to be able to lose one box completely and replace it (without
>>> losing the cluster), what settings would I need? Or is that an
>>> impossible scenario? In production, I'd imagine a 3 node cluster being
>>> the minimum but even there I could see each box having a full replica,
>>> but probably not beyond 3.
>>
>> Or perhaps, in the case of losing a box completely in a 2-node RF=2
>> cluster, do I need to lower the replication_factor on the still-alive
>> box, bootstrap the replaced node back in, and then change the
>> replication_factor back to 2?
>

Excellent, thanks! I'll definitely be checking those out.  I just want
to make sure I've got the hang of DR before we start deploying
Cassandra, and I'd hate to figure all this out later on with angry
customers standing over my shoulder :)
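
To check my own understanding of the quorum arithmetic described above, here is a rough Python sketch. It is only an illustration under stated assumptions: the function names are made up, and the placement rule (the owning node plus the next RF-1 nodes around the ring) is a simplification, not Cassandra's actual API or replication strategy code.

```python
from typing import List, Set


def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Replica failures a single key can absorb while QUORUM still succeeds."""
    return rf - quorum(rf)


def replicas_for(start: int, node_count: int, rf: int) -> List[int]:
    """Assumed placement: the owning node plus the next rf-1 nodes on the ring."""
    return [(start + i) % node_count for i in range(rf)]


def unavailable_ranges(node_count: int, rf: int, down: Set[int]) -> List[int]:
    """Ring positions whose keys cannot reach QUORUM with the given nodes down."""
    needed = quorum(rf)
    failed = []
    for start in range(node_count):
        live = [n for n in replicas_for(start, node_count, rf) if n not in down]
        if len(live) < needed:
            failed.append(start)
    return failed


if __name__ == "__main__":
    for rf in (2, 3, 5):
        print(rf, quorum(rf), tolerable_failures(rf))
    # 10 nodes, RF 5, 4 nodes down but spread around the ring
    print(unavailable_ranges(10, 5, {0, 2, 5, 7}))
    # Same cluster, 4 adjacent nodes down
    print(unavailable_ranges(10, 5, {0, 1, 2, 3}))
```

If I have this right, RF 3 tolerates 1 replica down and RF 5 tolerates 2, matching the bullets above. With 10 nodes and RF 5, four down nodes spread around the ring leave every key with 3 live replicas, while four adjacent down nodes do not. It also suggests that with RF=2 a quorum is both replicas, so a 2-node cluster that loses a box could only keep serving at a consistency level of ONE.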
