Max, this was a bug fixed recently in the 0.7 branch: https://issues.apache.org/jira/browse/CASSANDRA-1801
Fixed now in RC2.

-Jake

On Tue, Dec 7, 2010 at 8:11 AM, Max <cassan...@ajowa.de> wrote:
> As far as I can see, Lucandra already uses batch mutations:
>
> https://github.com/tjake/Lucandra/blob/master/src/lucandra/IndexWriter.java#L263
> https://github.com/tjake/Lucandra/blob/master/src/lucandra/CassandraUtils.java#L371
>
> IndexWriter.addDocument() merges all fields into one mutation map.
> In addition, instead of "autoCommit" (committing each document), I commit only
> every 10 documents. Where can I monitor incoming requests to Cassandra?
> WriteCount and MutationCount (monitored via jconsole) didn't change noticeably.
>
> I had problems opening the JRockit heap dump with MAT, but found "JRockit
> Mission Control" instead. Unfortunately I'm not confident using it.
>
> Here are my observations: while the HeapByteBuffer was growing (~200 MB)
> and being flushed during client inserts, the byte[] was growing permanently.
> http://oi51.tinypic.com/2uhbdp3.jpg
>
> I used the Type Graph to analyze the byte[], but I'm not sure how to
> interpret it: http://oi53.tinypic.com/y2d1i.jpg
>
> Thank you!
> Max
>
> Aaron Morton <aa...@thelastpickle.com> wrote:
>
>> Jake, or anyone else, got experience bulk loading into Lucandra?
>>
>> Or does anyone have experience with JRockit?
>>
>> Max, are you sending one document at a time into Lucene? Can you send
>> them in batches (like Solr), and if so, does that reduce the number of
>> requests going to Cassandra?
>>
>> Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError, so
>> you should be able to take a look at where all the memory is going. The
>> Riptano blog points to http://www.eclipse.org/mat/ and also see
>> http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr
>>
>> Hope that helps.
>>
>> Aaron
>>
>> On 07 Dec, 2010, at 09:17 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>>
>> Accidentally sent to me.
>>
>> Begin forwarded message:
>> From: Max <cassan...@ajowa.de>
>> Date: 07 December 2010 6:00:36 AM
>> To: Aaron Morton <aa...@thelastpickle.com>
>> Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
>>
>> Thank you both for your answers!
>> After several tests with different parameters we came to the conclusion
>> that it must be a bug. It looks very similar to:
>> https://issues.apache.org/jira/browse/CASSANDRA-1014
>>
>> For both CFs we reduced the thresholds:
>> - memtable_flush_after_mins = 60 (both CFs are in permanent use, so the
>>   other thresholds should trigger first)
>> - memtable_throughput_in_mb = 40
>> - memtable_operations_in_millions = 0.3
>> - keys_cached = 0
>> - rows_cached = 0
>> - in_memory_compaction_limit_in_mb = 64
>>
>> First we disabled caching, later we disabled compaction, and after that
>> we set:
>> commitlog_sync: batch
>> commitlog_sync_batch_window_in_ms: 1
>>
>> But our problem still appears: while inserting files with Lucandra,
>> memory usage grows slowly until an OOM crash after about 50 minutes.
>> @Peter: In our latest test we stopped writing abruptly, but Cassandra
>> didn't relax and stayed at ~90% heap usage even minutes later.
>> http://oi54.tinypic.com/2dueeix.jpg
>>
>> By our heap calculation we should need:
>> 64 MB * 2 * 3 + 1 GB = 1.4 GB
>> All recent tests were run with 3 GB. I think that should be OK for a
>> test machine. Also, the consistency level is ONE.
>>
>> But Aaron is right, Lucandra produces even more than 200 inserts/s.
>> My 200 documents per second are about 200 operations (WriteCount) on the
>> first CF and about 3000 on the second CF.
>> But even at about 120 documents/s, Cassandra crashes.
>>
>> Disk I/O, monitored with the Windows performance admin tools, is moderate
>> on both discs (the commitlog is on a separate hard disc).
>>
>> Any ideas? If it's really a bug, in my opinion it's very critical.
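A note for anyone tracking this down: the WriteCount that Max mentions can be polled outside jconsole with a small JMX client, which makes it easy to turn the counter into a rate. A minimal sketch, assuming 0.7's default JMX port of 8080; the keyspace and column family names are placeholders for the Lucandra CFs, and the exact MBean ObjectName is best confirmed in jconsole's MBeans tab:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WriteCountProbe {
    public static void main(String[] args) throws Exception {
        // Cassandra 0.7 exposes JMX on port 8080 by default (see cassandra.bat).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // ObjectName layout assumed from the 0.7 ColumnFamilies MBeans;
            // keyspace/CF below are placeholders, verify them in jconsole.
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,"
                    + "keyspace=Lucandra,columnfamily=TermInfo");
            long before = ((Number) mbs.getAttribute(cf, "WriteCount")).longValue();
            Thread.sleep(10000); // sample over ten seconds
            long after = ((Number) mbs.getAttribute(cf, "WriteCount")).longValue();
            System.out.printf("~%.1f writes/s on this CF%n", (after - before) / 10.0);
        } finally {
            jmxc.close();
        }
    }
}

Two samples a fixed interval apart give a writes/s figure per CF, which also answers Aaron's question below of how many Cassandra-level operations one Lucandra document really generates.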
>>
>> Aaron Morton <aa...@thelastpickle.com> wrote:
>>
>>> I remember you have 2 CFs, but what are the settings for:
>>>
>>> - memtable_flush_after_mins
>>> - memtable_throughput_in_mb
>>> - memtable_operations_in_millions
>>> - keys_cached
>>> - rows_cached
>>> - in_memory_compaction_limit_in_mb
>>>
>>> Can you do the JVM heap calculation here and see what it says:
>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>
>>> What consistency level are you writing at? (Checking it's not ZERO.)
>>>
>>> When you talk about 200 inserts per second, is that storing 200 documents
>>> through Lucandra or 200 requests to Cassandra? If it's the first option, I
>>> would assume it generates a lot more actual requests into Cassandra.
>>> Open up jconsole and take a look at the WriteCount values for the CFs:
>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>
>>> You could also try setting the compaction thresholds to 0 to disable
>>> compaction while you are pushing this data in, then use nodetool to
>>> compact and turn the settings back to normal. See cassandra.yaml for
>>> more info.
>>>
>>> I would have thought you could get the writes through with the setup
>>> you've described so far (even though a single 32-bit node is unusual).
>>> The best advice is to turn all the settings down (e.g. caches off,
>>> memtable flush at 64 MB, compaction disabled) and if it still fails try:
>>>
>>> - checking your I/O stats; I'm not sure about Windows, but JConsole has
>>>   some I/O stats. If your I/O cannot keep up, then your server is not
>>>   fast enough for your client load.
>>> - reducing the client load
>>>
>>> Hope that helps.
>>> Aaron
>>>
>>> On 04 Dec, 2010, at 05:23 AM, Max <cassan...@ajowa.de> wrote:
>>>
>>> Hi,
>>>
>>> we increased the heap space to 3 GB (with the JRockit VM under 32-bit
>>> Windows with 4 GB RAM), but under "heavy" inserts Cassandra still crashes
>>> with an OutOfMemory error after a GC storm.
>>>
>>> It sounds very similar to
>>> https://issues.apache.org/jira/browse/CASSANDRA-1177
>>>
>>> In our insert tests the average heap usage slowly grows up to the 3 GB
>>> limit (jconsole monitoring over 50 min: http://oi51.tinypic.com/k12gzd.jpg)
>>> and the CompactionManager queue is also constantly growing, up to about
>>> 50 jobs pending.
>>>
>>> We tried to decrease the CF memtable thresholds, but after about half a
>>> million inserts it's over.
>>>
>>> - Cassandra 0.7.0 beta 3
>>> - single node
>>> - about 200 inserts/s, ~500 bytes - 1 KB each
>>>
>>> Is there no other option besides slowing down the inserts/s?
>>>
>>> What could be an indicator to check whether a node runs stably under this
>>> volume of inserts?
>>>
>>> Thank you for your answer,
>>> Max
>>>
>>> Aaron Morton <aa...@thelastpickle.com>:
>>>
>>>> Sounds like you need to increase the heap size and/or reduce
>>>> memtable_throughput_in_mb and/or turn off the internal caches. Normally
>>>> the binary memtable thresholds only apply to bulk load operations, and
>>>> it's the per-CF memtable_* settings you want to change. I'm not familiar
>>>> with Lucandra though.
>>>>
>>>> See the section on JVM heap size here:
>>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>>
>>>> Bottom line is you will need more JVM heap memory.
>>>>
>>>> Hope that helps.
>>>> Aaron
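On Aaron's compaction suggestion above: zeroing the thresholds for the bulk load and restoring them afterwards can be scripted over the same kind of JMX connection as the probe earlier in the thread (nodetool's setcompactionthreshold does the equivalent). A rough sketch only; the writable attribute names are assumed from the 0.7 ColumnFamilyStoreMBean, so confirm them in jconsole before relying on this:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionToggle {
    public static void main(String[] args) throws Exception {
        JMXConnector jmxc = JMXConnectorFactory.connect(new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"));
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // Placeholder keyspace/CF names, as in the probe above.
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,"
                    + "keyspace=Lucandra,columnfamily=TermInfo");
            boolean enable = args.length > 0 && args[0].equals("on");
            // 0 disables minor compactions; 4/32 were the usual 0.7 defaults.
            int min = enable ? 4 : 0;
            int max = enable ? 32 : 0;
            mbs.setAttribute(cf, new Attribute("MinimumCompactionThreshold", min));
            mbs.setAttribute(cf, new Attribute("MaximumCompactionThreshold", max));
            System.out.println("Compaction thresholds set to " + min + "/" + max);
        } finally {
            jmxc.close();
        }
    }
}

Run it with "on" once the bulk load is done, then compact once via nodetool so the accumulated SSTables get merged in a single pass.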
>>>>
>>>> On 29 Nov, 2010, at 10:28 PM, cassan...@ajowa.de wrote:
>>>>
>>>> Hi community,
>>>>
>>>> during my tests I had several OOM crashes.
>>>> Some hints for tracking down the problem would be nice.
>>>>
>>>> At first, Cassandra crashed after about 45 min of the insert test script.
>>>> During the following tests the time to OOM got shorter, until it started
>>>> to crash even in "idle" mode.
>>>>
>>>> Here are the facts:
>>>> - Cassandra 0.7 beta 3
>>>> - using Lucandra to index about 3 million files of ~1 KB data
>>>> - inserting with one client into one Cassandra node at about 200 files/s
>>>> - the Cassandra data files for this keyspace grow to about 20 GB
>>>> - the keyspace contains only the two Lucandra-specific CFs
>>>>
>>>> Cluster:
>>>> - Cassandra single node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
>>>> - Java JRE 1.6.0_22
>>>> - heap space at first 1 GB, later increased to 1.3 GB
>>>>
>>>> cassandra.yaml:
>>>> default + "binary_memtable_throughput_in_mb" reduced to 128
>>>>
>>>> CFs:
>>>> default + reduced
>>>> min_compaction_threshold: 4
>>>> max_compaction_threshold: 8
>>>>
>>>> I think the problem always appears during compaction, and perhaps it is
>>>> a result of large rows (some around 170 MB).
>>>>
>>>> Are there more options we could use to get by with less memory?
>>>>
>>>> Is it a problem of compaction? And how do we avoid it?
>>>> Slower inserts? More memory?
>>>> An even lower memtable_throughput or in_memory_compaction_limit?
>>>> Continuous manual major compaction?
>>>>
>>>> I've read
>>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>>>> - row size should be fixed since 0.7, and 200 MB is still far from 2 GB
>>>> - only the key cache is used, and only lightly (3600/20000)
>>>> - after a lot of writes, Cassandra crashes even in idle mode
>>>> - the memtable size was reduced and there are only 2 CFs
>>>>
>>>> Several heap dumps in MAT show 60-99% heap usage by the compaction thread.
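For reference, the batch-mutation pattern discussed upthread works like this: one batch_mutate call carries all of a document's fields, so client round trips do not scale with the field count. A sketch against the 0.7 Thrift API; the keyspace, column family, and column names are placeholders, and the generated Thrift types shifted between the 0.7 betas, so check against your own interface files:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.Mutation;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class BatchInsert {
    public static void main(String[] args) throws Exception {
        // 0.7 uses the framed transport by default.
        TSocket socket = new TSocket("localhost", 9160);
        TFramedTransport transport = new TFramedTransport(socket);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Lucandra"); // placeholder keyspace name

        // One row, many columns, one round trip.
        List<Mutation> mutations = new ArrayList<Mutation>();
        long ts = System.currentTimeMillis() * 1000; // microsecond convention
        for (int i = 0; i < 10; i++) {
            Column col = new Column(
                    ByteBuffer.wrap(("field" + i).getBytes("UTF-8")),
                    ByteBuffer.wrap(("value" + i).getBytes("UTF-8")),
                    ts);
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(col);
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            mutations.add(m);
        }

        Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
        byCf.put("Documents", mutations); // placeholder CF name
        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        mutationMap.put(ByteBuffer.wrap("doc-1".getBytes("UTF-8")), byCf);

        client.batch_mutate(mutationMap, ConsistencyLevel.ONE);
        transport.close();
    }
}

Grouping several documents' rows into one mutation map before the call, as Max does with his commit-every-10 batching, cuts round trips the same way, though it does not by itself reduce the write volume the node has to absorb.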