Re: cascading failures due to memory

2011-06-15 Thread Sasha Dolgy
No. We upgraded to 0.8 and monitor the systems more closely. We schedule a
repair every 24 hours via cron and so far there have been no problems.
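
For anyone finding this thread later, the stagger looks roughly like the
following; hosts, paths, and times here are illustrative rather than our
exact setup:

  # node1 crontab: kick off repair at 01:00 every day
  0 1 * * * /usr/bin/nodetool -h localhost repair >> /var/log/cassandra/repair.log 2>&1
  # node2 crontab: staggered by a few hours so the repairs don't overlap
  0 5 * * * /usr/bin/nodetool -h localhost repair >> /var/log/cassandra/repair.log 2>&1
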
On Jun 15, 2011 5:44 PM, "AJ"  wrote:
> Sasha,
>
> Did you ever nail down the cause of this problem?


Re: cascading failures due to memory

2011-06-15 Thread AJ

Sasha,

Did you ever nail down the cause of this problem?

On 5/31/2011 4:01 AM, Sasha Dolgy wrote:

Hi everyone,

The four nodes I currently have deployed have all been working fine, with
not a lot of data and more reads than writes at the moment. Because I had
monitoring disabled, I didn't notice when one node's OS killed the Cassandra
process due to out-of-memory problems. 24 hours later it happened on another
node, then another, until finally none of the four nodes had Cassandra
running.
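
A quick way to confirm that it really was the kernel's OOM killer (assuming
Linux; the exact log location varies by distribution):

  # look for OOM-killer activity in the kernel ring buffer
  dmesg | grep -i -E 'out of memory|killed process'
  # or in the persistent syslog, depending on the distro
  grep -i oom-killer /var/log/syslog /var/log/messages 2>/dev/null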

When all nodes are started fresh, memory utilization is at about 21% on
each box. After 24 hours this climbs to 32%, and to 51% another 24 hours
later.

Originally I thought this might be a result of 'nodetool repair' not being
run consistently, but after adding a cron job to run it every 24 hours
(staggered between nodes), the problem of steadily increasing memory
utilization has not gone away.

I've read the operations page and also the
http://wiki.apache.org/cassandra/MemtableThresholds page. I am running the
defaults on 0.7.6-02.

Where are the best places to start in finding out why this is happening?
CF design / usage? 'nodetool cfstats' gives me some good info, and I've
already implemented some changes to one CF based on how it had ballooned
(too many rows versus not enough columns).
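
For reference, those per-CF numbers come straight from:

  # per-keyspace / per-column-family statistics, including memtable sizes
  nodetool -h localhost cfstats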

Suggestions appreciated.





Re: cascading failures due to memory

2011-06-01 Thread Jonathan Ellis
Look for GCInspector.
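
For example (the path below is the Debian package's default log location;
adjust it to wherever your system.log actually lives):

  # show the most recent GC pause reports from Cassandra's log
  grep GCInspector /var/log/cassandra/system.log | tail -20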

On Wed, Jun 1, 2011 at 2:30 PM, Sasha Dolgy  wrote:
> is there a specific string I should be looking for in the logs that
> isn't super obvious to me at the moment...



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: cascading failures due to memory

2011-06-01 Thread Sasha Dolgy
And is there anything specific that could be causing the issue between
Java SE 1.6.0_24 and 1.6.0_25? All nodes are on _24.

Up to 64% memory usage today.
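
One way to watch the JVM heap itself, rather than OS-level memory, is a
jstat loop along these lines (a sketch: it assumes the JDK's jstat is on
the PATH and that pgrep can find the Cassandra daemon by its main class):

  # sample heap occupancy and GC counters every 5 seconds
  jstat -gcutil $(pgrep -f CassandraDaemon) 5000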

-sd

On Wed, Jun 1, 2011 at 9:30 PM, Sasha Dolgy  wrote:
> is there a specific string I should be looking for in the logs that
> isn't super obvious to me at the moment...



-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: cascading failures due to memory

2011-06-01 Thread Sasha Dolgy
Is there a specific string I should be looking for in the logs? It isn't
obvious to me at the moment.

On Tue, May 31, 2011 at 8:21 PM, Jonathan Ellis  wrote:
> The place to start is with the statistics that Cassandra logs after each GC.



-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: cascading failures due to memory

2011-05-31 Thread Jonathan Ellis
The place to start is with the statistics that Cassandra logs after each GC.




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com