Max, this was a bug fixed recently in the 0.7 branch: https://issues.apache.org/jira/browse/CASSANDRA-1801
Fixed now in RC2.

-Jake

On Tue, Dec 7, 2010 at 8:11 AM, Max <cassan...@ajowa.de> wrote:
> As far as I can see, Lucandra already uses batch mutations:
>
> https://github.com/tjake/Lucandra/blob/master/src/lucandra/IndexWriter.java#L263
> https://github.com/tjake/Lucandra/blob/master/src/lucandra/CassandraUtils.java#L371
>
> IndexWriter.addDocument() merges all fields into one mutation map.
> In addition, instead of "autoCommit" (committing each document), I commit only
> every 10 documents. Where can I monitor incoming requests to Cassandra?
> WriteCount and MutationCount (monitored via jconsole) didn't change noticeably.
>
> I had problems opening the JRockit heap dump with MAT, but found "JRockit
> Mission Control" instead. Unfortunately I'm not confident using it.
>
> Here are my observations: while the HeapByteBuffer was growing (~200 MB)
> and being flushed during client inserts, the byte[] was growing permanently.
> http://oi51.tinypic.com/2uhbdp3.jpg
>
> I used the Type Graph to analyze the byte[], but I'm not sure how to
> interpret it: http://oi53.tinypic.com/y2d1i.jpg
>
> Thank you!
> Max
>
> Aaron Morton <aa...@thelastpickle.com> wrote:
>
>> Jake, or anyone else, got experience bulk loading into Lucandra?
>>
>> Or does anyone have experience with JRockit?
>>
>> Max, are you sending one document at a time into Lucene? Can you send
>> them in batches (like Solr), and if so, does that reduce the number of
>> requests going to Cassandra?
>>
>> Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError, so
>> you should be able to take a look at where all the memory is going. The
>> Riptano blog points to http://www.eclipse.org/mat/ and also see
>> http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr
>>
>> Hope that helps.
>>
>> Aaron
>>
>> On 07 Dec, 2010, at 09:17 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>>
>> Accidentally sent to me.
>>
>> Begin forwarded message:
>> From: Max <cassan...@ajowa.de>
>> Date: 07 December 2010 6:00:36 AM
>> To: Aaron Morton <aa...@thelastpickle.com>
>> Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
>>
>> Thank you both for your answers!
>> After several tests with different parameters we came to the conclusion
>> that it must be a bug. It looks very similar to:
>> https://issues.apache.org/jira/browse/CASSANDRA-1014
>>
>> For both CFs we reduced the thresholds:
>> - memtable_flush_after_mins = 60 (both CFs are in permanent use, so the
>>   other thresholds should trigger first)
>> - memtable_throughput_in_mb = 40
>> - memtable_operations_in_millions = 0.3
>> - keys_cached = 0
>> - rows_cached = 0
>> - in_memory_compaction_limit_in_mb = 64
>>
>> First we disabled caching, later we disabled compaction, and after that
>> we set:
>> commitlog_sync: batch
>> commitlog_sync_batch_window_in_ms: 1
>>
>> But our problem still appears: while inserting files with Lucandra,
>> memory usage grows slowly until an OOM crash after about 50 minutes.
>> @Peter: In our latest test we stopped writing abruptly, but Cassandra
>> didn't relax and stayed at ~90% heap usage even minutes later.
>> http://oi54.tinypic.com/2dueeix.jpg
>>
>> By our heap calculation we should need:
>> 64 MB * 2 * 3 + 1 GB = 1.4 GB
>> All recent tests were run with 3 GB. I think that should be OK for a
>> test machine. Also, the consistency level is ONE.
>>
>> But Aaron is right, Lucandra produces even more than 200 inserts/s.
>> My 200 documents per second are about 200 operations (WriteCount) on the
>> first CF and about 3000 on the second CF.
>> But even at about 120 documents/s, Cassandra crashes.
>>
>> Disk I/O, monitored with the Windows performance admin tools, is moderate
>> on both discs (the commitlog is on a separate hard disc).
>>
>> Any ideas? If it's really a bug, in my opinion it's very critical.
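A note for anyone tracking this down: the WriteCount that Max mentions can be polled outside jconsole with a small JMX client, which makes it easy to turn the counter into a rate. A minimal sketch, assuming 0.7's default JMX port of 8080; the keyspace and column family names are placeholders for the Lucandra CFs, and the exact MBean ObjectName is best confirmed in jconsole's MBeans tab:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WriteCountProbe {
    public static void main(String[] args) throws Exception {
        // Cassandra 0.7 exposes JMX on port 8080 by default (see cassandra.bat).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // ObjectName layout assumed from the 0.7 ColumnFamilies MBeans;
            // keyspace/CF below are placeholders, verify them in jconsole.
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,"
                    + "keyspace=Lucandra,columnfamily=TermInfo");
            long before = ((Number) mbs.getAttribute(cf, "WriteCount")).longValue();
            Thread.sleep(10000); // sample over ten seconds
            long after = ((Number) mbs.getAttribute(cf, "WriteCount")).longValue();
            System.out.printf("~%.1f writes/s on this CF%n", (after - before) / 10.0);
        } finally {
            jmxc.close();
        }
    }
}

Two samples a fixed interval apart give a writes/s figure per CF, which also answers Aaron's question below of how many Cassandra-level operations one Lucandra document really generates.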
>>
>> Aaron Morton <aa...@thelastpickle.com> wrote:
>>
>>> I remember you have 2 CFs, but what are the settings for:
>>>
>>> - memtable_flush_after_mins
>>> - memtable_throughput_in_mb
>>> - memtable_operations_in_millions
>>> - keys_cached
>>> - rows_cached
>>> - in_memory_compaction_limit_in_mb
>>>
>>> Can you do the JVM heap calculation here and see what it says:
>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>
>>> What consistency level are you writing at? (Checking it's not ZERO.)
>>>
>>> When you talk about 200 inserts per second, is that storing 200 documents
>>> through Lucandra or 200 requests to Cassandra? If it's the first option, I
>>> would assume it generates a lot more actual requests into Cassandra.
>>> Open up jconsole and take a look at the WriteCount values for the CFs:
>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>
>>> You could also try setting the compaction thresholds to 0 to disable
>>> compaction while you are pushing this data in, then use nodetool to
>>> compact and turn the settings back to normal. See cassandra.yaml for
>>> more info.
>>>
>>> I would have thought you could get the writes through with the setup
>>> you've described so far (even though a single 32-bit node is unusual).
>>> The best advice is to turn all the settings down (e.g. caches off,
>>> memtable flush at 64 MB, compaction disabled) and if it still fails try:
>>>
>>> - checking your I/O stats; I'm not sure about Windows, but JConsole has
>>>   some I/O stats. If your I/O cannot keep up, then your server is not
>>>   fast enough for your client load.
>>> - reducing the client load
>>>
>>> Hope that helps.
>>> Aaron
>>>
>>> On 04 Dec, 2010, at 05:23 AM, Max <cassan...@ajowa.de> wrote:
>>>
>>> Hi,
>>>
>>> we increased the heap space to 3 GB (with the JRockit VM under 32-bit
>>> Windows with 4 GB RAM), but under "heavy" inserts Cassandra still crashes
>>> with an OutOfMemory error after a GC storm.
>>>
>>> It sounds very similar to
>>> https://issues.apache.org/jira/browse/CASSANDRA-1177
>>>
>>> In our insert tests the average heap usage slowly grows up to the 3 GB
>>> limit (jconsole monitoring over 50 min: http://oi51.tinypic.com/k12gzd.jpg)
>>> and the CompactionManager queue is also constantly growing, up to about
>>> 50 jobs pending.
>>>
>>> We tried to decrease the CF memtable thresholds, but after about half a
>>> million inserts it's over.
>>>
>>> - Cassandra 0.7.0 beta 3
>>> - single node
>>> - about 200 inserts/s, ~500 bytes - 1 KB each
>>>
>>> Is there no other option besides slowing down the inserts/s?
>>>
>>> What could be an indicator to check whether a node runs stably under this
>>> volume of inserts?
>>>
>>> Thank you for your answer,
>>> Max
>>>
>>> Aaron Morton <aa...@thelastpickle.com>:
>>>
>>>> Sounds like you need to increase the heap size and/or reduce
>>>> memtable_throughput_in_mb and/or turn off the internal caches. Normally
>>>> the binary memtable thresholds only apply to bulk load operations, and
>>>> it's the per-CF memtable_* settings you want to change. I'm not familiar
>>>> with Lucandra though.
>>>>
>>>> See the section on JVM heap size here:
>>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>>
>>>> Bottom line is you will need more JVM heap memory.
>>>>
>>>> Hope that helps.
>>>> Aaron
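On Aaron's compaction suggestion above: zeroing the thresholds for the bulk load and restoring them afterwards can be scripted over the same kind of JMX connection as the probe earlier in the thread (nodetool's setcompactionthreshold does the equivalent). A rough sketch only; the writable attribute names are assumed from the 0.7 ColumnFamilyStoreMBean, so confirm them in jconsole before relying on this:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionToggle {
    public static void main(String[] args) throws Exception {
        JMXConnector jmxc = JMXConnectorFactory.connect(new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"));
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // Placeholder keyspace/CF names, as in the probe above.
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,"
                    + "keyspace=Lucandra,columnfamily=TermInfo");
            boolean enable = args.length > 0 && args[0].equals("on");
            // 0 disables minor compactions; 4/32 were the usual 0.7 defaults.
            int min = enable ? 4 : 0;
            int max = enable ? 32 : 0;
            mbs.setAttribute(cf, new Attribute("MinimumCompactionThreshold", min));
            mbs.setAttribute(cf, new Attribute("MaximumCompactionThreshold", max));
            System.out.println("Compaction thresholds set to " + min + "/" + max);
        } finally {
            jmxc.close();
        }
    }
}

Run it with "on" once the bulk load is done, then compact once via nodetool so the accumulated SSTables get merged in a single pass.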
>>>>
>>>> On 29 Nov, 2010, at 10:28 PM, cassan...@ajowa.de wrote:
>>>>
>>>> Hi community,
>>>>
>>>> during my tests I had several OOM crashes.
>>>> Some hints for tracking down the problem would be nice.
>>>>
>>>> At first, Cassandra crashed after about 45 min of the insert test script.
>>>> During the following tests the time to OOM got shorter, until it started
>>>> to crash even in "idle" mode.
>>>>
>>>> Here are the facts:
>>>> - Cassandra 0.7 beta 3
>>>> - using Lucandra to index about 3 million files of ~1 KB data
>>>> - inserting with one client into one Cassandra node at about 200 files/s
>>>> - the Cassandra data files for this keyspace grow to about 20 GB
>>>> - the keyspace contains only the two Lucandra-specific CFs
>>>>
>>>> Cluster:
>>>> - Cassandra single node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
>>>> - Java JRE 1.6.0_22
>>>> - heap space at first 1 GB, later increased to 1.3 GB
>>>>
>>>> cassandra.yaml:
>>>> default + "binary_memtable_throughput_in_mb" reduced to 128
>>>>
>>>> CFs:
>>>> default + reduced
>>>> min_compaction_threshold: 4
>>>> max_compaction_threshold: 8
>>>>
>>>> I think the problem always appears during compaction, and perhaps it is
>>>> a result of large rows (some around 170 MB).
>>>>
>>>> Are there more options we could use to get by with less memory?
>>>>
>>>> Is it a problem of compaction? And how do we avoid it?
>>>> Slower inserts? More memory?
>>>> An even lower memtable_throughput or in_memory_compaction_limit?
>>>> Continuous manual major compaction?
>>>>
>>>> I've read
>>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>>>> - row size should be fixed since 0.7, and 200 MB is still far from 2 GB
>>>> - only the key cache is used, and only lightly (3600/20000)
>>>> - after a lot of writes, Cassandra crashes even in idle mode
>>>> - the memtable size was reduced and there are only 2 CFs
>>>>
>>>> Several heap dumps in MAT show 60-99% heap usage by the compaction thread.
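For reference, the batch-mutation pattern discussed upthread works like this: one batch_mutate call carries all of a document's fields, so client round trips do not scale with the field count. A sketch against the 0.7 Thrift API; the keyspace, column family, and column names are placeholders, and the generated Thrift types shifted between the 0.7 betas, so check against your own interface files:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.Mutation;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class BatchInsert {
    public static void main(String[] args) throws Exception {
        // 0.7 uses the framed transport by default.
        TSocket socket = new TSocket("localhost", 9160);
        TFramedTransport transport = new TFramedTransport(socket);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Lucandra"); // placeholder keyspace name

        // One row, many columns, one round trip.
        List<Mutation> mutations = new ArrayList<Mutation>();
        long ts = System.currentTimeMillis() * 1000; // microsecond convention
        for (int i = 0; i < 10; i++) {
            Column col = new Column(
                    ByteBuffer.wrap(("field" + i).getBytes("UTF-8")),
                    ByteBuffer.wrap(("value" + i).getBytes("UTF-8")),
                    ts);
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(col);
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            mutations.add(m);
        }

        Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
        byCf.put("Documents", mutations); // placeholder CF name
        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        mutationMap.put(ByteBuffer.wrap("doc-1".getBytes("UTF-8")), byCf);

        client.batch_mutate(mutationMap, ConsistencyLevel.ONE);
        transport.close();
    }
}

Grouping several documents' rows into one mutation map before the call, as Max does with his commit-every-10 batching, cuts round trips the same way, though it does not by itself reduce the write volume the node has to absorb.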