Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)

Max Tue, 07 Dec 2010 09:16:18 -0800

Thank you Jake, also Aaron & Peter for your help :-)


It was the 1801 bug, solved in RC2 SVN Snapshot!

Max


Jake Luciani <jak...@gmail.com> wrote:

Max, this was a bug fixed recently in the 0.7 branch:

https://issues.apache.org/jira/browse/CASSANDRA-1801

fixed now in RC2

-Jake

On Tue, Dec 7, 2010 at 8:11 AM, Max <cassan...@ajowa.de> wrote:

As far as I can see, Lucandra already uses batch_mutations.

https://github.com/tjake/Lucandra/blob/master/src/lucandra/IndexWriter.java#L263

https://github.com/tjake/Lucandra/blob/master/src/lucandra/CassandraUtils.java#L371

IndexWriter.addDocument() merges all fields into a mutation map.
In addition, instead of "autoCommit" (commit each doc), I commit only every
10 documents. Where can I monitor incoming requests to Cassandra?
WriteCount and MutationCount (monitored by jconsole) didn't change
noticeably.
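
The batching described above (committing every 10 documents instead of after each one) can be sketched roughly as follows. This is only an illustration of the pattern, not Lucandra's code: BatchSink and BatchingWriter are hypothetical names, and the sink stands in for whatever call actually sends the accumulated mutations to Cassandra.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sink that sends one batch of documents in a single request. */
interface BatchSink<T> {
    void write(List<T> batch);
}

/** Buffers documents and hands them to the sink once batchSize have accumulated. */
final class BatchingWriter<T> {
    private final BatchSink<T> sink;
    private final int batchSize;
    private final List<T> pending = new ArrayList<>();
    private int commits = 0;

    BatchingWriter(BatchSink<T> sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Adds one document; triggers a commit when the buffer is full. */
    void add(T doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) {
            commit();
        }
    }

    /** Sends whatever is buffered; call once at the end to flush the remainder. */
    void commit() {
        if (pending.isEmpty()) {
            return;
        }
        sink.write(new ArrayList<>(pending)); // copy so the sink owns its batch
        pending.clear();
        commits++;
    }

    int commits() {
        return commits;
    }
}
```

With a batch size of 10, adding 25 documents produces two automatic commits, and a final commit() flushes the remaining five. Batching like this reduces the number of round trips, but not the number of column writes the server has to absorb.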

I had problems opening the JRockit heap dump with MAT, but found "JRockit
Mission Control" instead. Unfortunately I'm not confident using it.

Here are my observations:
While HeapByteBuffer grew (to ~200 MB) and was flushed during client inserts,
byte[] kept growing permanently.
http://oi51.tinypic.com/2uhbdp3.jpg

I used TypeGraph to analyze the byte[] but I'm not sure how to interpret it:
http://oi53.tinypic.com/y2d1i.jpg

Thank you!
Max


Aaron Morton <aa...@thelastpickle.com> wrote:

Jake, or anyone else: do you have experience bulk loading into Lucandra?

Or does anyone have experience with JRockit?

Max, are you sending one document at a time into Lucene? Can you send
them in batches (like Solr)? If so, does it reduce the
number of requests going to Cassandra?

Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError, so
you should be able to take a look at where all the memory is going. The
Riptano blog points to http://www.eclipse.org/mat/; also see
http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr

Hope that helps.

Aaron

On 07 Dec, 2010, at 09:17 AM, Aaron Morton <aa...@thelastpickle.com> wrote:

Accidentally sent to me.

Begin forwarded message:
From: Max <cassan...@ajowa.de>
Date: 07 December 2010 6:00:36 AM
To: Aaron Morton <aa...@thelastpickle.com>
Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)

Thank you both for your answer!
After several tests with different parameters we came to the conclusion
that it must be a bug.
It looks very similar to:
https://issues.apache.org/jira/browse/CASSANDRA-1014

For both CFs we reduced thresholds:
- memtable_flush_after_mins = 60 (both CFs are used permanently,
therefore other thresholds should trigger first)
- memtable_throughput_in_mb = 40
- memtable_operations_in_millions = 0.3
- keys_cached = 0
- rows_cached = 0

- in_memory_compaction_limit_in_mb = 64

First we disabled caching, later we disabled compaction, and after that we set:
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1

But our problem still appears:
While inserting files with Lucandra, memory usage grows slowly
until an OOM crash after about 50 min.
@Peter: In our latest test we stopped writing suddenly, but Cassandra
didn't relax and stayed at ~90% heap usage even minutes later.
http://oi54.tinypic.com/2dueeix.jpg

According to our heap calculation we should need:
64 MB * 2 * 3 + 1 GB = 1.4 GB
We ran all recent tests with 3 GB. I think that should be OK for a test
machine.
Also, the consistency level is ONE.
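
The heap estimate quoted above can be reproduced with a small sketch. The factor of 3, the two column families and the 1 GB base come straight from the calculation in this message. The class and method names are made up for illustration, and cacheMb is an added parameter for internal caches (zero here, since caching was disabled).

```java
/** Rule-of-thumb heap estimate as used in the thread (illustrative only). */
final class HeapEstimate {

    /**
     * memtable throughput (MB) * 3 * number of hot CFs + 1 GB base + caches.
     * The 1 GB base is taken as 1024 MB.
     */
    static long estimateMb(long memtableThroughputMb, int hotColumnFamilies, long cacheMb) {
        final long baseMb = 1024;
        return memtableThroughputMb * 3 * hotColumnFamilies + baseMb + cacheMb;
    }

    public static void main(String[] args) {
        // Figures used in the message: 64 MB, 2 CFs, no caches.
        System.out.println(estimateMb(64, 2, 0) + " MB"); // 1408 MB, about 1.4 GB
        // Same formula with the reduced memtable_throughput_in_mb = 40.
        System.out.println(estimateMb(40, 2, 0) + " MB"); // 1264 MB
    }
}
```

Either way the estimate stays well below the 3 GB heap used in the tests, which is why steadily growing heap usage pointed to a bug rather than to undersized settings.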

But Aaron is right, Lucandra produces far more than 200 inserts/s.
My 200 documents per second amount to about 200 operations (WriteCount) on
the first CF and about 3000 on the second CF.

But even with about 120 documents/s cassandra crashes.


Disk I/O, monitored with the Windows performance admin tools, is moderate
on both disks (the commitlog is on a separate hard disk).


Any ideas?
If it's really a bug, in my opinion it's very critical.



Aaron Morton <aa...@thelastpickle.com> wrote:

I remember you have 2 CFs, but what are the settings for:


- memtable_flush_after_mins
- memtable_throughput_in_mb
- memtable_operations_in_millions
- keys_cached
- rows_cached

- in_memory_compaction_limit_in_mb

Can you do the JVM Heap Calculation here and see what it says
http://wiki.apache.org/cassandra/MemtableThresholds

What Consistency Level are you writing at? (Checking it's not ZERO.)

When you talk about 200 inserts per second, is that storing 200 documents
through Lucandra or 200 requests to Cassandra? If it's the first option, I
would assume that would generate a lot more actual requests into Cassandra.

Open up jconsole and take a look at the WriteCount values for the CFs.
http://wiki.apache.org/cassandra/MemtableThresholds
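
The WriteCount value shown in jconsole can also be read programmatically over JMX. The following is a minimal sketch using the standard javax.management API. The MBean name pattern, and whatever host, port, keyspace and column family you pass in, are assumptions that should be checked against what the jconsole MBeans tab shows for your node. WriteCountProbe is a hypothetical helper name.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Reads a column family's WriteCount attribute over JMX (illustrative sketch). */
final class WriteCountProbe {

    /** Standard RMI-based JMX service URL for the given host and port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** ASSUMED MBean name pattern; confirm the exact name in jconsole. */
    static String columnFamilyBean(String keyspace, String columnFamily) {
        return "org.apache.cassandra.db:type=ColumnFamilies,keyspace=" + keyspace
                + ",columnfamily=" + columnFamily;
    }

    /** Connects, reads the attribute once, and closes the connection again. */
    static long readWriteCount(String host, int port, String keyspace, String columnFamily)
            throws Exception {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName(columnFamilyBean(keyspace, columnFamily));
            Object value = mbeans.getAttribute(name, "WriteCount");
            return ((Number) value).longValue();
        }
    }
}
```

Sampling readWriteCount twice a few seconds apart and dividing the difference by the interval gives the actual write rate per CF, which is the number to compare against the documents-per-second figure on the client side.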


You could also try setting the compaction thresholds to 0 to disable
compaction while you are pushing this data in. Then use nodetool to
compact and turn the settings back to normal. See cassandra.yaml for
more info.

I would have thought you could get the writes through with the setup
you've described so far (even though a single 32bit node is unusual).
The best advice is to turn all the settings down (e.g. caches off,
memtable flush 64MB, compaction disabled) and if it still fails try:

- checking your IO stats, not sure on windows but JConsole has some IO
stats. If your IO cannot keep up then your server is not fast enough
for your client load.
- reducing the client load

Hope that helps.
Aaron


On 04 Dec, 2010, at 05:23 AM, Max <cassan...@ajowa.de> wrote:

Hi,

we increased heap space to 3 GB (with the JRockit VM under 32-bit Windows with
4 GB RAM),
but under "heavy" inserts Cassandra still crashes with an OutOfMemory
error after a GC storm.

It sounds very similar to
https://issues.apache.org/jira/browse/CASSANDRA-1177

In our insert tests the average heap usage slowly grows up to the
3 GB limit (jconsole monitor over 50 min:
http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue
also grows constantly, up to about 50 pending jobs.

We tried to decrease CF memtable threshold but after about half a
million inserts it's over.

- Cassandra 0.7.0 beta 3
- Single Node
- about 200 inserts/s, ~500 bytes to 1 KB each


Is there no other option than slowing down the insert rate?

What could be an indicator to see if a node works stable with this
amount of inserts?

Thank you for your answer,
Max


Aaron Morton <aa...@thelastpickle.com>:

Sounds like you need to increase the heap size and/or reduce
memtable_throughput_in_mb and/or turn off the internal caches. Normally
the binary memtable thresholds only apply to bulk load operations, and it's
the per-CF memtable_* settings you want to change. I'm not familiar with
Lucandra though.

See the section on JVM Heap Size here
http://wiki.apache.org/cassandra/MemtableThresholds

Bottom line is you will need more JVM heap memory.

Hope that helps.
Aaron

On 29 Nov, 2010, at 10:28 PM, cassan...@ajowa.de wrote:

Hi community,

during my tests I had several OOM crashes.
Some hints for tracking down the problem would be nice.

At first Cassandra crashed after about 45 min of running the insert test script.
In the following tests the time to OOM got shorter, until it started to
crash even in "idle" mode.

Here are the facts:
- cassandra 0.7 beta 3
- using lucandra to index about 3 million files ~1kb data
- inserting with one client to one cassandra node with about 200 files/s
- cassandra data files for this keyspace grow up to about 20 GB
- the keyspace only contains the two lucandra specific CFs

Cluster:
- cassandra single node on windows 32bit, Xeon 2.5 GHz, 4 GB RAM
- java jre 1.6.0_22
- heap space first 1 GB, later increased to 1.3 GB

Cassandra.yaml:
default + reduced "binary_memtable_throughput_in_mb" to 128

CFs:
default + reduced
min_compaction_threshold: 4
max_compaction_threshold: 8


I think the problem always appears during compaction,
and perhaps it is a result of large rows (some about 170 MB).

Are there more options we could use to work with little memory?

Is it a compaction problem?
And how can we avoid it?
Slower inserts? More memory?
An even lower memtable_throughput or in_memory_compaction_limit?
Continuous manual major compaction?

I've read

http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
- row_size should be fixed since 0.7 and 200mb is still far away from
2gb
- only key cache is used a little bit 3600/20000
- after a lot of writes cassandra crashes even in idle mode
- memtablesize was reduced and there are only 2 CFs

Several heap dumps in MAT show the compaction thread using 60-99% of the heap.
