It sounds like you're simply throwing too much load at Cassandra. Adding more machines can help. Look at http://wiki.apache.org/cassandra/Operations for how to track metrics that will tell you how much is "too much."
Telling us more about your workload would be useful in sanity checking that hypothesis. :)

-Jonathan

On Fri, Dec 18, 2009 at 4:34 PM, Brian Burruss <bburr...@real.com> wrote:
> this time i simulated node 1 crashing, waited a few minutes, then restarted
> it. after a while node 2 OOM'ed.
>
> same 2 node cluster with RF=2, W=1, R=1. i up'ed the RAM to 6G this time.
>
> cluster contains ~126,281,657 data elements containing about 298G on one
> node's disk
>
> thx!