Re: Cassandra OOM on joining existing ring
Attaching the stack dump captured from the last OOM.

Kunal

ERROR [SharedPool-Worker-6] 2015-07-10 05:12:16,862 JVMStabilityInspector.java:94 - JVM state determined to be unstable.
Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_45]
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_45]
	at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:137) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:97) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:61) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.Memtable.put(Memtable.java:192) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.index.AbstractSimplePerColumnSecondaryIndex.insert(AbstractSimplePerColumnSecondaryIndex.java:131) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.index.SecondaryIndexManager$StandardUpdater.insert(SecondaryIndexManager.java:791) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:444) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:418) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.btree.BTree.build(BTree.java:116) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.AtomicBTreeColumns.addAllWithSizeDelta(AtomicBTreeColumns.java:225) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.Memtable.put(Memtable.java:210) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:389) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:352) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:54) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.7.jar:2.1.7]
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.7.jar:2.1.7]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
ERROR [CompactionExecutor:3] 2015-07-10 05:12:16,862 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:3,1,main]
java.lang.OutOfMemoryError: Java heap space
	at java.util.ArrayDeque.doubleCapacity(ArrayDeque.java:157) ~[na:1.8.0_45]
Re: Cassandra OOM on joining existing ring
Forgot to mention: the data size is not that big - it's barely 10GB in all.

Kunal
Cassandra OOM on joining existing ring
Hi,

I have a 2-node setup on Azure (east us region) running Ubuntu Server 14.04 LTS. Both nodes have 8GB RAM.

One of the nodes (the seed node) died with an OOM - so, I am trying to add a replacement node with the same configuration. The problem is this new node also keeps dying with an OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along.

All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21, version 2.1.7.

I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB). But that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing.

Any clue as to why it's happening?

Thanks,
Kunal
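For reference, a minimal sketch of what the manual heap override in conf/cassandra-env.sh typically looks like. The 4G value is the one tried above; the HEAP_NEWSIZE figure and the auto-sizing formula are taken from the stock 2.1 script and are illustrative, not a recommendation:

```shell
# Sketch of a manual heap override in conf/cassandra-env.sh.
# The stock script auto-sizes the heap roughly as
#   max(min(1/2 * RAM, 1024MB), min(1/4 * RAM, 8192MB))
# which works out to ~2GB on an 8GB box. Uncommenting and setting
# both of these pins it instead (here, the 4GB tried in this thread):
MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="800M"   # stock guidance: ~100MB per physical CPU core for CMS
```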
Re: Cassandra OOM on joining existing ring
I'm new to cassandra. How do I find those out? - mainly, the partition params that you asked for. Others, I think I can figure out.

We don't have any large objects/blobs in the column values - it's all textual, date-time, numeric and uuid data.

We use cassandra primarily to store segmentation data - with segment type as the partition key. That is again divided into two separate column families; but they have similar structure.

Columns per row can be fairly large - each segment type is the row key, with associated user ids and timestamps as column values.

Thanks,
Kunal
Re: Cassandra OOM on joining existing ring
You, and only you, are responsible for knowing your data and data model. If columns per row or rows per partition can be large, then an 8GB system is probably too small. But the real issue is that you need to keep your partition size from getting too large. Generally, an 8GB system is okay, but only for reasonably-sized partitions, like under 10MB.

-- Jack Krupansky
Re: Cassandra OOM on joining existing ring
What does your data and data model look like - partition size, rows per partition, number of columns per row, any large values/blobs in column values? You could run fine on an 8GB system, but only if your rows and partitions are reasonably small. Any large partitions could blow you away.

-- Jack Krupansky
Re: Cassandra OOM on joining existing ring
Thanks for the quick reply.

1. I don't know what the thresholds are that I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M, and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem?

Here is the CQL schema for the daily_challenges column family:

CREATE TABLE app_10001.daily_challenges (
    segment_type text,
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY (segment_type, date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);

2. I don't know - how do I check? As I mentioned, I just installed the dsc21 package from datastax's debian repo (ver 2.1.7).

Really appreciate your help.

Thanks,
Kunal
Re: Cassandra counters
Any pointers on this? In 2.1, updating a counter in an UNLOGGED batch with a timestamp isn't safe the way other column updates are at a given consistency level (can a counter update be made idempotent with a timestamp?).

Thanks
Ajay

On 09-Jul-2015 11:47 am, Ajay <ajay.ga...@gmail.com> wrote:

Hi, What is the accuracy improvement of counters in 2.1 over 2.0? The post below mentions 2.0.x issues fixed in 2.1 and performance improvements: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters But how accurate are counters in 2.1.x, and are there any known issues in 2.1 with using an UNLOGGED batch for counter updates with a timestamp?

Thanks
Ajay
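For concreteness, a sketch of the pattern being asked about (the keyspace/table names are made up). In CQL, counter mutations must go in a COUNTER batch - counter batches are always unlogged - and Cassandra rejects USING TIMESTAMP on counter writes, so a client-supplied timestamp cannot make a counter increment idempotent the way it can for regular columns: a retried "+ 1" may double-count.

```sql
-- Hypothetical counter table and batch (cqlsh, Cassandra 2.1).
CREATE TABLE app.page_hits (page text PRIMARY KEY, hits counter);

BEGIN COUNTER BATCH
  UPDATE app.page_hits SET hits = hits + 1 WHERE page = 'home';
  UPDATE app.page_hits SET hits = hits + 1 WHERE page = 'about';
APPLY BATCH;

-- Adding "USING TIMESTAMP 1234" to either UPDATE is refused by the server,
-- since counter increments are commutative deltas, not last-write-wins cells.
```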
Re: Cassandra OOM on joining existing ring
And here is my cassandra-env.sh: https://gist.github.com/kunalg/2c092cb2450c62be9a20

Kunal

On 11 July 2015 at 00:04, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

From the jhat output, the top 10 entries for "Instance Count for All Classes (excluding platform)" show:

2088223 instances of class org.apache.cassandra.db.BufferCell
1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName
1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName
63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
503687 instances of class org.apache.cassandra.db.BufferDeletedCell
378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
101800 instances of class org.apache.cassandra.utils.concurrent.Ref
101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState
71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

At the bottom of the page, it shows: Total of 8739510 instances occupying 193607512 bytes.

JFYI.
Kunal
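As a quick sanity check, the top-10 counts above can be tallied against the reported 8,739,510-instance total (a throwaway sketch; the counts are copied verbatim from the message, including the apparently truncated IndexHelper$IndexInfo figure):

```shell
# Tally the top-10 jhat instance counts quoted above and compare them to the
# reported total of 8,739,510 instances. Cell and cell-name objects
# (BufferCell, CompoundSparseCellName, CompoundDenseCellName) dominate,
# which points at row/cell data held in memory rather than internal bookkeeping.
counts="2088223 1983245 1885974 63 503687 378206 101800 101800 90704 71123"
top10=$(echo $counts | tr ' ' '\n' | awk '{s+=$1} END {print s}')
echo "top-10 total: $top10"
awk -v t="$top10" 'BEGIN {printf "share of all instances: %.1f%%\n", 100*t/8739510}'
```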
Re: Cassandra OOM on joining existing ring
Thanks, Sebastian.

Couple of questions (I'm really new to cassandra):

1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful.

2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. So, please bear with me as I would need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server?

Thanks,
Kunal

On 10 July 2015 at 22:35, Sebastian Estevez <sebastian.este...@datastax.com> wrote:

#1 You need more information.

a) Take a look at your .hprof file (memory heap from the OOM) with an introspection tool like jhat or visualvm or java flight recorder and see what is using up your RAM.

b) How big are your large rows (use nodetool cfstats on each node)? If your data model is bad, you are going to have to re-design it no matter what.

#2 As a possible workaround, try using the G1GC collector with the settings from c* 3.0 instead of CMS. I've seen lots of success with it lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not* set the newgen size; G1 sets it dynamically:

# min and max heap sizes should be set to the same value to avoid
# stop-the-world GC pauses during resize, and so that we can lock the
# heap in memory on startup to prevent any of it from being swapped out.
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"

# Per-thread stack size.
JVM_OPTS="$JVM_OPTS -Xss256k"

# Use the Hotspot garbage-first collector.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

# Have the JVM do less remembered set work during STW, instead
# preferring concurrent GC. Reduces p99.9 latency.
JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"

# The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
# Machines with > 10 cores may need additional threads.
# Increase to <= full cores (do not count HT cores).
#JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
#JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"

# Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
# 200ms is the JVM default and lowest viable setting;
# 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"

# Do reference processing in parallel GC.
JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"

# This may help eliminate STW.
# The default in Hotspot 8u40 is 40%.
#JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"

# For workloads that do large allocations, increasing the region
# size may make things more efficient. Otherwise, let the JVM
# set this automatically.
#JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=32m"

# Make sure all memory is faulted and zeroed on startup.
# This helps prevent soft faults in containers and makes
# transparent hugepage allocation more effective.
JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"

# Biased locking does not benefit Cassandra.
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"

# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"

# Enable thread-local allocation blocks and allow the JVM to automatically
# resize them at runtime.
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"

# http://www.evanjones.ca/jvm-mmap-pause.html
JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"

On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

I upgraded my instance from 8GB to a 14GB one. Allocated 8GB to the jvm heap in cassandra-env.sh. And now, it crashes even faster with an OOM..

Earlier, with 4GB heap, I could go upto ~90% replication completion (as reported by nodetool netstats); now, with 8GB heap, I cannot even get there. I've already restarted the cassandra service 4 times with 8GB heap. No clue what's going on.. :(

Kunal
Re: Cassandra OOM on joining existing ring
1. You want to look at # of sstables in cfhistograms, or in cfstats look at:
Compacted partition maximum bytes
Maximum live cells per slice

2. No, here's the env.sh from 3.0 which should work with some tweaks: https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh
You'll at least have to modify the jamm version to what's in yours. I think it's 2.5.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
DataStax
Re: Overwhelming tombstones with LCS
On Sun, Jul 5, 2015 at 1:40 PM, Roman Tkachenko ro...@mailgunhq.com wrote:
Hey guys,
I have a table with RF=3 and LCS. The data model makes use of wide rows. A certain query run against this table times out, and tracing reveals the following error on two out of three nodes:
*Scanned over 100000 tombstones; query aborted (see tombstone_failure_threshold)*
This basically means every request with CL higher than ONE fails. I have two questions:
* How could it happen that only two out of three nodes have overwhelming tombstones? For the third node, tracing shows a sensible *Read 815 live and 837 tombstoned cells* trace.

One theory: before 2.1.6, compactions on wide rows with lots of tombstones could take forever or potentially never finish. What version of Cassandra are you on? It may be that you got lucky with one node that has been able to keep up, but the others haven't been able to.

* Anything I can do to fix those two nodes? I have already set gc_grace to 1 day and tried to make the compaction strategy more aggressive (unchecked_tombstone_compaction = true, tombstone_threshold = 0.01) to no avail - a couple of days have already passed and it still gives the same error.

You probably want major compaction, which is coming soon for LCS (https://issues.apache.org/jira/browse/CASSANDRA-7272) but not here yet. The alternative is, if you have enough time and headroom (this is going to do some pretty serious compaction, so be careful): alter your table to STCS, let it compact into one SSTable, then convert back to LCS. It's pretty heavy-handed, but as long as your gc_grace is low enough, it'll do the job. Definitely do NOT do this if you have many tombstones in single wide rows and are not on 2.1.6.

Thanks!
Roman

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
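For readers following along, the STCS round trip Dan describes above boils down to two ALTER TABLE statements; the keyspace/table name below is a placeholder, not from the thread, and you should watch nodetool compactionstats between the two steps:

```sql
-- Temporarily switch to size-tiered compaction so the table can be
-- merged down into few (ideally one) SSTables. Requires serious disk
-- headroom; table name is hypothetical.
ALTER TABLE mykeyspace.mytable
  WITH compaction = {'class': 'SizeTieredCompactionStrategy'};

-- ...wait for compactions to drain (nodetool compactionstats)...

-- Then switch back to leveled compaction.
ALTER TABLE mykeyspace.mytable
  WITH compaction = {'class': 'LeveledCompactionStrategy'};
```

Each ALTER triggers re-compaction across the cluster, so per the warning above this should only be attempted with enough time, headroom, and (per Dan) on 2.1.6+ for tombstone-heavy wide rows.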
Re: Cassandra OOM on joining existing ring
From jhat output, the top 10 entries for Instance Count for All Classes (excluding platform) show:
2088223 instances of class org.apache.cassandra.db.BufferCell
1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName
1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName
63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
503687 instances of class org.apache.cassandra.db.BufferDeletedCell
378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
101800 instances of class org.apache.cassandra.utils.concurrent.Ref
101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState
71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

At the bottom of the page, it shows: Total of 8739510 instances occupying 193607512 bytes. JFYI.
Kunal

On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:
Thanks for the quick reply.
1. I don't know what thresholds I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem?
Here is the CQL schema for the daily_challenges column family:

CREATE TABLE app_10001.daily_challenges (
    segment_type text,
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY (segment_type, date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);

2. I don't know - how do I check? As I mentioned, I just installed the dsc21 package from datastax's debian repo (ver 2.1.7). Really appreciate your help.
Thanks,
Kunal

On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com wrote:
1. You want to look at the number of sstables in cfhistograms, or in cfstats look at:
Compacted partition maximum bytes
Maximum live cells per slice
2) No, here's the env.sh from 3.0, which should work with some tweaks:
https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh
You'll at least have to modify the jamm version to what's in
yours. I think it's 2.5.

All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:
Thanks, Sebastian. Couple of questions (I'm really new to cassandra):
1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful.
2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. So, please bear with me as I will need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server?
Thanks,
Kunal

On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com wrote:
#1 You need more information.
a) Take a look at your .hprof file (the memory heap from the OOM) with an introspection tool like jhat, visualvm, or Java Flight Recorder and see what is using up your RAM.
b) How big are your large rows (use nodetool cfstats on each node)? If your data
DROP Table
My understanding is that the Cassandra file structure follows the naming convention /cassandra/data/<keyspace>/<table>. Whereas our file structure is as below: each table has multiple names, and when we drop tables and recreate them, these directories remain. Also, when we dropped the table, one node was down; when it came back, we tried to run nodetool repair, and repair kept failing, referring to the CFID error listed below.

drwxr-xr-x. 16 cass cass 4096 May 24 06:49 ../
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_by_user-e0eec95019a211e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 application_info-4dba2bf0054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_info-a0ee65d019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 configproperties-228ea2e0c13811e4aa1d4ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_activation-95d005f019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:16 user_app_permission-9fddcd62ffbe11e4a25a45259f96ec68/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_credential-86cfff1019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_info-2fa076221b1011e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 user_info-36028c00054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:15 user_info-fe1d7b101a5711e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jun 25 10:16 user_role-9ed0ca30ffbe11e4b71d09335ad2d5a9/

WARN [Thread-2579] 2015-07-02 16:02:27,523 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=218e3c90-1b0e-11e5-a34b-d7c17b3e318a
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:272) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:168) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:150) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:82) ~[apache-cassandra-2.1.2.jar:2.1.2]

Naidu Saladi
Re: Cassandra OOM on joining existing ring
I upgraded my instance from 8GB to a 14GB one and allocated 8GB to the JVM heap in cassandra-env.sh. And now, it crashes even faster with an OOM..
Earlier, with 4GB heap, I could get up to ~90% replication completion (as reported by nodetool netstats); now, with 8GB heap, I cannot even get there. I've already restarted the cassandra service 4 times with the 8GB heap. No clue what's going on.. :(
Kunal

On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com wrote:
You, and only you, are responsible for knowing your data and data model. If columns per row or rows per partition can be large, then an 8GB system is probably too small. But the real issue is that you need to keep your partition size from getting too large. Generally, an 8GB system is okay, but only for reasonably-sized partitions, like under 10MB.
-- Jack Krupansky

On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:
I'm new to cassandra. How do I find those out? - mainly, the partition params that you asked for. Others, I think I can figure out. We don't have any large objects/blobs in the column values - it's all textual, date-time, numeric and uuid data. We use cassandra to primarily store segmentation data - with segment type as partition key. That is again divided into two separate column families; but they have similar structure. Columns per row can be fairly large - each segment type as the row key and associated user ids and timestamp as column value.
Thanks,
Kunal

On 10 July 2015 at 16:36, Jack Krupansky jack.krupan...@gmail.com wrote:
What does your data and data model look like - partition size, rows per partition, number of columns per row, any large values/blobs in column values? You could run fine on an 8GB system, but only if your rows and partitions are reasonably small. Any large partitions could blow you away.
-- Jack Krupansky

On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:
Attaching the stack dump captured from the last OOM.
Kunal On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes (seed node) died with OOM - so, I am trying to add a replacement node with same configuration. The problem is this new node also keeps dying with OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along. All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21 version 2.1.7. I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB) But, that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing. Any clue as to why it's happening? Thanks, Kunal
Re: Cassandra OOM on joining existing ring
#1 You need more information.
a) Take a look at your .hprof file (the memory heap from the OOM) with an introspection tool like jhat, visualvm, or Java Flight Recorder and see what is using up your RAM.
b) How big are your large rows (use nodetool cfstats on each node)? If your data model is bad, you are going to have to re-design it no matter what.

#2 As a possible workaround, try using the G1GC collector with the settings from C* 3.0 instead of CMS. I've seen lots of success with it lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not* set the newgen size; G1 sets it dynamically:

# min and max heap sizes should be set to the same value to avoid
# stop-the-world GC pauses during resize, and so that we can lock the
# heap in memory on startup to prevent any of it from being swapped
# out.
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"

# Per-thread stack size.
JVM_OPTS="$JVM_OPTS -Xss256k"

# Use the Hotspot garbage-first collector.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

# Have the JVM do less remembered set work during STW, instead
# preferring concurrent GC. Reduces p99.9 latency.
JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"

# The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
# Machines with > 10 cores may need additional threads.
# Increase to <= full cores (do not count HT cores).
#JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
#JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"

# Main G1GC tunable: lowering the pause target will lower throughput
# and vice versa.
# 200ms is the JVM default and lowest viable setting;
# 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"

# Do reference processing in parallel GC.
JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"

# This may help eliminate STW.
# The default in Hotspot 8u40 is 40%.
#JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"

# For workloads that do large allocations, increasing the region
# size may make things more efficient. Otherwise, let the JVM
# set this automatically.
#JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=32m"

# Make sure all memory is faulted and zeroed on startup.
# This helps prevent soft faults in containers and makes
# transparent hugepage allocation more effective.
JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"

# Biased locking does not benefit Cassandra.
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"

# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"

# Enable thread-local allocation blocks and allow the JVM to automatically
# resize them at runtime.
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"

# http://www.evanjones.ca/jvm-mmap-pause.html
JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"

All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:
I upgraded my instance from 8GB to a 14GB one. Allocated 8GB to jvm heap in cassandra-env.sh. And now, it crashes even faster with an OOM..
Earlier, with 4GB heap, I could go upto ~90% replication completion (as reported by nodetool netstats); now, with 8GB heap, I cannot even get there. I've already restarted cassandra service 4 times with 8GB heap. No clue what's going on.. :( Kunal On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com wrote: You, and only you, are responsible for knowing your data and data model. If columns per row or rows per partition can be large, then an 8GB system is probably too small. But the real issue is that you need to keep your partition size from getting too large. Generally, an 8GB system is okay, but only for reasonably-sized partitions, like under 10MB. -- Jack Krupansky On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: I'm new to cassandra How do I find those out? - mainly, the partition params that you asked for. Others, I think I can figure out. We don't have any large
Re: Cassandra OOM on joining existing ring
#1
There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M.

460M is high; I like to keep my partitions under 100MB when possible. I've seen worse, though. The fix is to add something else (maybe month or week or something) into your partition key:

PRIMARY KEY ((segment_type, something_else), date, user_id, sess_id)

#2
It looks like your jamm version is 3 per your env.sh, so you're probably okay to copy the env.sh over from the C* 3.0 link I shared once you uncomment and tweak the MAX_HEAP. If there's something wrong, your node won't come up. Tail your logs.

All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
On Fri, Jul 10, 2015 at 2:44 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: And here is my cassandra-env.sh https://gist.github.com/kunalg/2c092cb2450c62be9a20 Kunal On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class org.apache.cassandra.db.BufferCell 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName 1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName 63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo 503687 instances of class org.apache.cassandra.db.BufferDeletedCell 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier 101800 instances of class org.apache.cassandra.utils.concurrent.Ref 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State 90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey At the bottom of the page, it shows: Total of 8739510 instances occupying 193607512 bytes. JFYI. Kunal On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks for quick reply. 1. I don't know what are the thresholds that I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem? 
Here is the CQL schema for the daily_challenges column family:

CREATE TABLE app_10001.daily_challenges (
    segment_type text,
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY (segment_type, date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);

2. I don't know - how do I check? As I mentioned, I just installed the dsc21 package from datastax's debian repo (ver 2.1.7). Really appreciate your help.
Thanks,
Kunal

On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com wrote:
1. You want to look at the number of sstables in cfhistograms, or in cfstats look at:
Compacted partition maximum bytes
Maximum live cells per slice
2) No, here's the env.sh from 3.0, which should work with some tweaks:
https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh
You'll at least have to modify the jamm version to what's in
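Sebastián's partition-key suggestion earlier in this thread (adding a time bucket alongside segment_type) would look roughly like the sketch below. The bucket column, its granularity, and the table name are illustrative, not from the thread:

```sql
-- Hypothetical re-keyed version of daily_challenges: a month bucket is
-- added to the partition key so a single segment_type can no longer
-- grow into one huge (~460MB) partition.
CREATE TABLE app_10001.daily_challenges_v2 (
    segment_type text,
    month text,          -- e.g. '2015-07', derived from date at write time
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY ((segment_type, month), date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC);
```

Queries then need to supply both segment_type and the month bucket (or iterate over buckets), which is the usual trade-off of this pattern.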
Re: DROP Table
#1 The cause of this problem is a CREATE TABLE statement collision. Do *not* generate tables dynamically from multiple clients, even with IF NOT EXISTS. The first thing you need to do is fix your code so that this does not happen. Just create your tables manually from cqlsh, allowing time for the schema to settle.

#2 Here's the fix:

1) *Change your code to not automatically re-create tables (even with IF NOT EXISTS).*
2) Run a rolling restart to ensure the schema matches across nodes. Run nodetool describecluster around your cluster. Check that there is only one schema version.

ON EACH NODE:
3) Check your filesystem and see if you have two directories for the table in question in the data directory.

IF THERE ARE TWO OR MORE DIRECTORIES:
4) Identify from schema_column_families which cf ID is the new one (currently in use):
cqlsh -e "select * from system.schema_column_families" | grep <table name>
5) Move the data from the old directory to the new one and remove the old directory.
6) If there are multiple old directories, repeat 5 for every old directory.
7) Run nodetool refresh.

IF THERE IS ONLY ONE DIRECTORY:
No further action is needed.

All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 12:15 PM, Saladi Naidu naidusp2...@yahoo.com wrote:
My understanding is that the Cassandra file structure follows the naming convention /cassandra/data/<keyspace>/<table>. Whereas our file structure is as below, each table has multiple names, and when we drop tables and recreate them, these directories remain. Also, when we dropped the table, one node was down; when it came back, we tried to run nodetool repair, and repair kept failing, referring to the CFID error listed below.

drwxr-xr-x. 16 cass cass 4096 May 24 06:49 ../
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_by_user-e0eec95019a211e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 application_info-4dba2bf0054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_info-a0ee65d019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 configproperties-228ea2e0c13811e4aa1d4ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_activation-95d005f019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:16 user_app_permission-9fddcd62ffbe11e4a25a45259f96ec68/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_credential-86cfff1019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_info-2fa076221b1011e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 user_info-36028c00054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:15 user_info-fe1d7b101a5711e58b954ffc8e9bfaa6/
drwxr-xr-x.
4 cass cass 4096 Jun 25 10:16 user_role-9ed0ca30ffbe11e4b71d09335ad2d5a9/

WARN [Thread-2579] 2015-07-02 16:02:27,523 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=218e3c90-1b0e-11e5-a34b-d7c17b3e318a
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:272) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:168) ~[apache-cassandra-2.1.2.jar:2.1.2] at
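For reference, the cf ID lookup in step 4 of the fix above can also be done as a direct CQL query in Cassandra 2.1 (the keyspace and table names below are illustrative). The cf_id returned should match the hex suffix of the directory currently in use on disk:

```sql
-- C* 2.1 schema tables: find the current cf ID for a table
-- (names illustrative; compare cf_id to the on-disk directory suffix).
SELECT keyspace_name, columnfamily_name, cf_id
FROM system.schema_column_families
WHERE keyspace_name = 'mykeyspace'
  AND columnfamily_name = 'user_info';
```

Any table directory whose suffix does not match the returned cf_id is a leftover from a dropped/re-created table and is the one whose SSTables should be moved before running nodetool refresh.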