Re: Cassandra shuts down; was: Cassandra crashes

2013-09-04 Thread Nate McCall
Ideally, you should get back pressure in the form of dropped messages
before you see crashes, but if turning down the heap allocation was the
only thing you did, there are other changes required (several mentioned by
Romain above are very good places to start).
A few other ideas:
- did you adjust ParNew along with heap?
- you may want to adjust SurvivorRatio and MaxTenuringThreshold in the JVM
options (change both to 4 as a starting point)
- definitely play with compaction throughput by turning it way up, since you
have the I/O capacity (see the sketch below)

These settings will mean the nodes GC and compact continuously in this
environment, but they should at least keep going.
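
As a sketch of the compaction-throughput suggestion, the snippet below bumps
the setting at runtime over JMX, which is roughly what nodetool
setcompactionthroughput does. The MBean and attribute names, the JMX port
7199 and the 64 MB/s value are assumptions for Cassandra of this era; verify
them against your version before relying on this.

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionThroughputBump {
    public static void main(String[] args) throws Exception {
        // Cassandra exposes JMX on port 7199 by default (see cassandra-env.sh).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Assumed MBean/attribute names -- the same setting that
            // nodetool setcompactionthroughput manipulates; check your version.
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            mbs.setAttribute(storageService,
                    new Attribute("CompactionThroughputMbPerSec", 64));
            System.out.println("compaction throughput set to 64 MB/s");
        } finally {
            connector.close();
        }
    }
}

A change made this way lasts only until the node restarts; to make it stick,
also set compaction_throughput_mb_per_sec in cassandra.yaml.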


On Wed, Sep 4, 2013 at 9:14 AM, Romain HARDOUIN
wrote:

> Have you tried to tweak settings like memtable_total_space_in_mb and
> flush_largest_memtables_at?
> Also, the compaction manager seems to be pretty busy, take a look at
> in_memory_compaction_limit_in_mb.
> And with SSD hardware you should modify multithreaded_compaction,
> compaction_throughput_mb_per_sec, concurrent_reads and concurrent_writes.
> Of course 2GB of RAM is low, but tweaking these settings might help you.
> Maybe some guru could confirm or refute that.
>
>
>
>
> From: Jan Algermissen
> To: user@cassandra.apache.org
> Date: 04/09/2013 12:29
> Subject: Re: Cassandra shuts down; was: Cassandra crashes
> --
>
>
>
> Romain,
>
>
> On 04.09.2013, at 11:11, Romain HARDOUIN 
> wrote:
>
> > Maybe you should include the end of Cassandra logs.
>
> There is nothing that seems interesting in cassandra.log. Below you find
> system.log.
>
> > What comes to my mind when I read your first post is the OOM killer.
> > But what you describe later is not the case.
> > Just to be sure, have you checked /var/log/messages?
>
> Nothing there, just occasional Firewall TCP rejections.
>
> Somehow I think I am simply overloading the whole cluster (see the hinted
> handoff messages in the log). Could that be due to the limited memory of
> 2GB my nodes have? IOW, not enough space to buffer up the writes before
> dumping to disk?
>
> Also, my overall write performance is actually pretty bad compared to what
> I read about C*. Before I thought it was the client doing too much work or
> the network. Turns out that's not the case.
>
> I'd expect C* to sort of just suck in my rather small amount of data -
> must be me, not using the right configuration. Oh well, I'll get there :-)
> Thanks anyhow.
>
> Jan
>
>
>
>
>


Re: Cassandra shuts down; was: Cassandra crashes

2013-09-04 Thread Romain HARDOUIN
Have you tried to tweak settings like memtable_total_space_in_mb and 
flush_largest_memtables_at?
Also, the compaction manager seems to be pretty busy, take a look at 
in_memory_compaction_limit_in_mb.
And with SSD hardware you should modify multithreaded_compaction, 
compaction_throughput_mb_per_sec, concurrent_reads and concurrent_writes.
Of course 2GB of RAM is low, but tweaking these settings might help you.
Maybe some guru could confirm or refute that.




From: Jan Algermissen
To: user@cassandra.apache.org
Date: 04/09/2013 12:29
Subject: Re: Cassandra shuts down; was: Cassandra crashes



Romain,


On 04.09.2013, at 11:11, Romain HARDOUIN  
wrote:

> Maybe you should include the end of Cassandra logs. 

There is nothing that seems interesting in cassandra.log. Below you find 
system.log.

> What comes to my mind when I read your first post is the OOM killer. 
> But what you describe later is not the case. 
> Just to be sure, have you checked /var/log/messages? 

Nothing there, just occasional Firewall TCP rejections. 

Somehow I think I am simply overloading the whole cluster (see the hinted 
handoff messages in the log). Could that be due to the limited memory of 
2GB my nodes have? IOW, not enough space to buffer up the writes before 
dumping to disk?

Also, my overall write performance is actually pretty bad compared to what 
I read about C*. Before I thought it was the client doing too much work or 
the network. Turns out that's not the case.

I'd expect C* to sort of just suck in my rather small amount of data - 
must be me, not using the right configuration. Oh well, I'll get there :-) 
Thanks anyhow.

Jan






Re: Cassandra shuts down; was: Cassandra crashes

2013-09-04 Thread Jan Algermissen
Romain,


On 04.09.2013, at 11:11, Romain HARDOUIN  wrote:

> Maybe you should include the end of Cassandra logs. 

There is nothing that seems interesting in cassandra.log. Below you find 
system.log.

> What comes to my mind when I read your first post is the OOM killer. 
> But what you describe later is not the case. 
> Just to be sure, have you checked /var/log/messages? 

Nothing there, just occasional Firewall TCP rejections. 

Somehow I think I am simply overloading the whole cluster (see the hinted 
handoff messages in the log). Could that be due to the limited memory of 2GB my 
nodes have? IOW, not enough space to buffer up the writes before dumping to 
disk?

Also, my overall write performance is actually pretty bad compared to what I 
read about C*. Before I thought it was the client doing too much work or the 
network. Turns out that's not the case.

I'd expect C* to sort of just suck in my rather small amount of data - must be 
me, not using the right configuration. Oh well, I'll get there :-) Thanks 
anyhow.

Jan

> 
> Romain 
> 




 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,057 StatusLogger.java (line 96) KeyCache                      216        936        all
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,057 StatusLogger.java (line 102) RowCache                       0          0        all    org.apache.cassandra.cache.SerializingCacheProvider
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,058 StatusLogger.java (line 109) ColumnFamily                  Memtable ops,data
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,082 StatusLogger.java (line 112) system.local                  4,52
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,083 StatusLogger.java (line 112) system.peers                  0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,083 StatusLogger.java (line 112) system.batchlog               0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,083 StatusLogger.java (line 112) system.NodeIdInfo             0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,083 StatusLogger.java (line 112) system.LocationInfo           0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,084 StatusLogger.java (line 112) system.Schema                 0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,084 StatusLogger.java (line 112) system.Migrations             0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,084 StatusLogger.java (line 112) system.schema_keyspaces       0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,084 StatusLogger.java (line 112) system.schema_columns         0,0
ERROR [FlushWriter:6] 2013-09-04 07:17:09,210 CassandraDaemon.java (line 192) Exception in thread Thread[FlushWriter:6,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.util.FastByteArrayOutputStream.expand(FastByteArrayOutputStream.java:104)
        at org.apache.cassandra.io.util.FastByteArrayOutputStream.write(FastByteArrayOutputStream.java:220)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at org.apache.cassandra.io.util.DataOutputBuffer.write(DataOutputBuffer.java:60)
        at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
        at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
        at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
        at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.serializeForSSTable(OnDiskAtom.java:62)
        at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:181)
        at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:185)
        at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:489)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:448)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,210 StatusLogger.java (line 112) system.schema_columnfamilies  0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,218 StatusLogger.java (line 112) system.IndexInfo              0,0
 INFO [ScheduledTasks:1] 2013-09-04 07:17:09,218 StatusLogger.java (line 112) system.range_xfers
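
Given the overload described above (and the OutOfMemoryError hit while
flushing a memtable), one client-side mitigation is to cap how many async
writes the loader keeps in flight. A minimal sketch against the DataStax
java-driver, assuming the 2.0-era API in which ResultSetFuture is a Guava
ListenableFuture; the class name and the cap of 128 are arbitrary choices:

import java.util.concurrent.Semaphore;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;

public class ThrottledLoader {
    // At most 128 writes in flight; acquire() blocks the loader once the
    // cluster stops keeping up, which is the back pressure we want.
    private final Semaphore inFlight = new Semaphore(128);
    private final Session session;
    private final PreparedStatement insert;

    public ThrottledLoader(Session session, PreparedStatement insert) {
        this.session = session;
        this.insert = insert;
    }

    public void write(Object... values) throws InterruptedException {
        inFlight.acquire();
        BoundStatement bound = insert.bind(values);
        ResultSetFuture future = session.executeAsync(bound);
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            @Override public void onSuccess(ResultSet rs) { inFlight.release(); }
            @Override public void onFailure(Throwable t) {
                inFlight.release();   // give the slot back, then log or retry
            }
        });
    }
}

Blocking in acquire() is what actually slows the loader down; without a cap,
executeAsync() will happily queue far more work than three 2GB nodes can
absorb.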

Re: Cassandra shuts down; was: Cassandra crashes

2013-09-04 Thread Romain HARDOUIN
Maybe you should include the end of Cassandra logs.
What comes to my mind when I read your first post is the OOM killer. 
But what you describe later is not the case.
Just to be sure, have you checked /var/log/messages?

Romain



From: Jan Algermissen
To: user@cassandra.apache.org
Date: 04/09/2013 10:52
Subject: Re: Cassandra shuts down; was: Cassandra crashes



The subject line isn't appropriate - the servers do not crash but shut
down. Since the log messages appear several lines before the end of the
log file, I only saw them afterwards. Excuse the confusion.

Jan


On 04.09.2013, at 10:44, Jan Algermissen  
wrote:

> Hi,
> 
> I have set up C* in a very limited environment: 3 VMs at digitalocean
> with 2GB RAM and 40GB SSDs, so my expectations about overall performance
> are low.
> 
> Keyspace uses a replication factor of 2.
> 
> I am loading 1.5 million rows (each 60 columns of a mix of numbers and small
> texts, 300,000 wide rows effectively) in a quite 'aggressive' way, using the
> java-driver and async update statements.
> 
> After a while of importing data, I start seeing timeouts reported by the
> driver:
> 
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
> timeout during write query at consistency ONE (1 replica were required but
> only 0 acknowledged the write
> 
> and then later, host-unavailability exceptions:
> 
> com.datastax.driver.core.exceptions.UnavailableException: Not enough
> replica available for query at consistency ONE (1 required but only 0
> alive).
> 
> Looking at the 3 hosts, I see two C*s went down - which explains that I
> still see some writes succeeding (that must be the one host left,
> satisfying the consistency level ONE).
> 
> 
> The logs tell me AFAIU that the servers shut down due to reaching the
> heap size limit.
> 
> I am irritated by the fact that the instances (it seems) shut themselves
> down instead of limiting their amount of work. I understand that I need to
> tweak the configuration and likely get more RAM, but still, I would
> actually be satisfied with reduced service (and likely more timeouts in
> the client). Right now it looks as if I would have to slow down the
> client 'artificially' to prevent the loss of hosts - does that make
> sense?
> 
> Can anyone explain whether this is intended behavior, meaning I'll just
> have to accept the self-shutdown of the hosts? Or alternatively, what data
> I should collect to investigate the cause further?
> 
> Jan
> 
> 
> 
> 
> 




Re: Cassandra shuts down; was: Cassandra crashes

2013-09-04 Thread Jan Algermissen
The subject line isn't appropriate - the servers do not crash but shut down. 
Since the log messages appear several lines before the end of the log file, I 
only saw them afterwards. Excuse the confusion.

Jan


On 04.09.2013, at 10:44, Jan Algermissen  wrote:

> Hi,
> 
> I have set up C* in a very limited environment: 3 VMs at digitalocean with 
> 2GB RAM and 40GB SSDs, so my expectations about overall performance are low.
> 
> Keyspace uses a replication factor of 2.
> 
> I am loading 1.5 million rows (each 60 columns of a mix of numbers and small 
> texts, 300,000 wide rows effectively) in a quite 'aggressive' way, using the 
> java-driver and async update statements.
> 
> After a while of importing data, I start seeing timeouts reported by the 
> driver:
> 
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout 
> during write query at consistency ONE (1 replica were required but only 0 
> acknowledged the write
> 
> and then later, host-unavailability exceptions:
> 
> com.datastax.driver.core.exceptions.UnavailableException: Not enough replica 
> available for query at consistency ONE (1 required but only 0 alive).
> 
> Looking at the 3 hosts, I see two C*s went down - which explains that I still 
> see some writes succeeding (that must be the one host left, satisfying the 
> consistency level ONE).
> 
> 
> The logs tell me AFAIU that the servers shut down due to reaching the heap 
> size limit.
> 
> I am irritated by the fact that the instances (it seems) shut themselves down 
> instead of limiting their amount of work. I understand that I need to tweak 
> the configuration and likely get more RAM, but still, I would actually be 
> satisfied with reduced service (and likely more timeouts in the client).  
> Right now it looks as if I would have to slow down the client 'artificially'  
> to prevent the loss of hosts - does that make sense?
> 
> Can anyone explain whether this is intended behavior, meaning I'll just have 
> to accept the self-shutdown of the hosts? Or alternatively, what data I 
> should collect to investigate the cause further?
> 
> Jan
> 
> 
> 
> 
>
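
On the question of slowing the client down 'artificially': the exceptions
quoted above are themselves a usable signal. A minimal sketch that retries a
write with exponential backoff when the driver reports a write timeout or
unavailability, assuming the synchronous execute path of the 2.0-era DataStax
java-driver; the attempt count and sleep values are arbitrary:

import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.exceptions.UnavailableException;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public final class BackoffWriter {
    // Retry a write a few times, sleeping twice as long after each failure,
    // so the loader backs off while the overloaded nodes catch up.
    public static void writeWithBackoff(Session session, Statement statement)
            throws InterruptedException {
        long sleepMs = 100;
        for (int attempt = 1; attempt <= 5; attempt++) {
            try {
                session.execute(statement);
                return;
            } catch (WriteTimeoutException | UnavailableException e) {
                if (attempt == 5) {
                    throw e;   // give up after five attempts
                }
                Thread.sleep(sleepMs);
                sleepMs *= 2;
            }
        }
    }
}

The driver's own RetryPolicy can do something similar, but handling it in the
loader makes the backoff, and therefore the ingest rate, explicit.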