Sorry about that.  We eventually found that one column family had some
large/corrupt data that was causing the OOMs.

Luckily it was a pretty ephemeral data set and we were able to just
truncate it.  However, it was a guess based on some log messages about
reading a large number of tombstones on that column family.  I think we
should review this column family's design so it doesn't generate so many
tombstones.  Could that be the cause?  What else would you recommend?

Thank you in advance.

On Fri, Oct 6, 2017 at 6:33 AM Brian Spindler <brian.spind...@gmail.com>
wrote:

> Hi guys, our cluster - around 18 nodes - just started having nodes die
> and when restarting them they are dying with OOM.  How can we handle this?
>  I've tried adding a couple extra gigs on these machines to help but it's
> not helping.
>
> Help!
> -B
>
>
