Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Shravan C
We run C* with a 32 GB heap and all servers have 96 GB RAM. We use STCS. LCS is not an
option for us as we have frequent updates.


Thanks,
Shravan

From: Thakrar, Jayesh 
Sent: Friday, March 3, 2017 3:47:27 PM
To: Joaquin Casares; user@cassandra.apache.org
Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time


I had been fighting a similar battle, but am now over the hump for the most part.



Get info on the server config (e.g. memory, cpu, free memory (free -g), etc)

Run "nodetool info" on the nodes to get heap and off-heap sizes

Run "nodetool tablestats" or "nodetool tablestats ." on the 
key large tables

Essentially the purpose is to see if you really had a true OOM or was your 
machine running out of memory.



Cassandra can use offheap memory very well - so "nodetool info" will give you 
both heap and offheap.
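
As a quick illustration of that heap vs. off-heap split, here is a minimal, standalone
Java sketch (not Cassandra code) that prints the local JVM's heap, non-heap, and NIO
buffer-pool usage; native allocations made outside NIO buffers only show up in the
process RSS (free -g / top).

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.List;

// Minimal sketch (not Cassandra code): prints the local JVM's heap, non-heap,
// and NIO buffer-pool usage, the same categories "nodetool info" summarizes.
public class MemoryBreakdown {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.printf("Heap used:     %d MB%n",
                mem.getHeapMemoryUsage().getUsed() / (1024 * 1024));
        System.out.printf("Non-heap used: %d MB%n",
                mem.getNonHeapMemoryUsage().getUsed() / (1024 * 1024));

        // NIO direct/mapped buffer pools are one visible slice of off-heap usage;
        // anything allocated outside NIO only shows up in the process RSS.
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.printf("Buffer pool %-6s: %d MB%n",
                    pool.getName(), pool.getMemoryUsed() / (1024 * 1024));
        }
    }
}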



Also, what is the compaction strategy of your tables?



Personally, I have found STCS to be awful at large scale - when you have 
sstables that are 100+ GB in size.

See 
https://issues.apache.org/jira/browse/CASSANDRA-10821?focusedCommentId=15389451&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15389451



LCS seems better and should be the default (my opinion) unless you want DTCS



A good description of all three compactions is here - 
http://docs.scylladb.com/kb/compaction/
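
If you do decide to move a table off STCS, the change is per-table. Below is a minimal,
hedged sketch via the Java driver (3.x assumed), where "ks.events" and the contact point
are placeholders; the same ALTER statement can of course be run from cqlsh.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Hedged sketch: "ks.events" and the contact point are placeholders.
public class SwitchCompaction {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Switch one table from STCS to LCS; expect a burst of compaction
            // afterwards while the existing SSTables are re-leveled.
            session.execute("ALTER TABLE ks.events WITH compaction = {"
                    + " 'class': 'LeveledCompactionStrategy',"
                    + " 'sstable_size_in_mb': 160 }");
        }
    }
}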

From: Joaquin Casares 
Date: Friday, March 3, 2017 at 11:34 AM
To: 
Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time



Hello Shravan,



Typically asynchronous requests are recommended over batch statements since 
batch statements will cause more work on the coordinator node while individual 
requests, when using a TokenAwarePolicy, will hit a specific coordinator, 
perform a local disk seek, and return the requested information.



The only time batch statements are ideal is when writing to the same partition key,
even if that's across multiple tables, as long as they use the same hashing algorithm
(like murmur3).
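
For illustration, here is a hedged sketch with the DataStax Java driver (3.x assumed);
the keyspace, table, and columns are placeholders rather than your actual schema. It
contrasts independent asynchronous writes with a single-partition unlogged batch.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hedged sketch: "ks.tracking" and its columns are placeholders, not the
// poster's actual schema; DataStax Java driver 3.x is assumed.
public class AsyncVsBatch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                // Token awareness lets each request go straight to a replica
                // that owns the partition instead of an arbitrary coordinator.
                .withLoadBalancingPolicy(
                        new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
                .build();
             Session session = cluster.connect()) {

            PreparedStatement insert = session.prepare(
                    "INSERT INTO ks.tracking (comid, status_timestamp, status) VALUES (?, ?, ?)");

            // Preferred for unrelated rows: independent asynchronous writes.
            List<ResultSetFuture> futures = new ArrayList<>();
            for (int i = 0; i < 100; i++) {
                futures.add(session.executeAsync(
                        insert.bind("com-" + i, new Date(), "RECEIVED")));
            }
            futures.forEach(ResultSetFuture::getUninterruptibly);

            // Acceptable batch: every statement targets the same partition key,
            // so the coordinator applies them as a single mutation.
            BatchStatement samePartition = new BatchStatement(BatchStatement.Type.UNLOGGED);
            for (int i = 0; i < 10; i++) {
                samePartition.add(insert.bind("com-1",
                        new Date(System.currentTimeMillis() + i), "STATUS-" + i));
            }
            session.execute(samePartition);
        }
    }
}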



Could you provide a bit of insight into what the batch statement was trying to 
accomplish and how many child statements were bundled up within that batch?



Cheers,



Joaquin


Joaquin Casares

Consultant

Austin, TX



Apache Cassandra Consulting

http://www.thelastpickle.com





On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch <chall...@outlook.com> wrote:

Hello,

More than 30 Cassandra servers in the primary DC went down with the OOM exception
below. What puzzles me is the scale at which it happened (all within the same minute).
I will share some more details below.

System Log: http://pastebin.com/iPeYrWVR

GC Log: http://pastebin.com/CzNNGs0r

During the OOM I saw a lot of WARNings like the one below (these had been there for
quite some time, maybe weeks):
WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 - 
Batch of prepared statements for [keyspace.table] is of size 225455, exceeding 
specified threshold of 65536 by 159919.

Environment:
We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC (more C*
nodes, on SSD, where the apps run) and a secondary DC (geographically remote, more
like a DR to the primary) on SAS drives.
Cassandra config:

Java 1.8.0_65
Garbage Collector: G1GC
memtable_allocation_type: offheap_objects

Since this OOM I am seeing huge hint pile-ups on the majority of the nodes, and the
pending hints keep going up. I have increased HintedHandoff CoreThreads to 6 but that
did not help (I admit I tried this on only one node).

nodetool compactionstats -H
pending tasks: 3
compaction type    keyspace    table    completed    total       unit    progress
     Compaction    system      hints    28.5 GB      92.38 GB    bytes   30.85%


Appreciate your inputs here.

Thanks,

Shravan




Re: Question on configuring Apache Cassandra with LDAP

2017-03-03 Thread Daniel Kleviansky
That is in fact my linked WIP branch, and I can confirm progress stalled due to other
priorities being set!
It got to the point (in draft form) where authentication was working, but I still need
to work on extending IRoleManager.

I will say this, while I initially started using the Apache LDAP API, which is 
what’s present in that linked branch, I then started playing with UnboundID and 
Ldaptive (you can find that sandbox here: 
https://github.com/lqid/ldap-sandbox), which I found much easier to work with.
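
For what it's worth, the core check any of those libraries ends up performing is a
simple bind against the directory. Here is a minimal, hedged sketch using plain JNDI
(the server URL and DN pattern are placeholders); a custom authenticator would wrap
something like this behind Cassandra's IAuthenticator plumbing, which is the part the
WIP branch covers.

import java.util.Hashtable;
import javax.naming.AuthenticationException;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;

// Hedged sketch: validates credentials via an LDAP simple bind. The server URL
// and DN pattern are placeholders; use LDAPS/StartTLS in any real deployment.
public class LdapBindCheck {

    static boolean authenticate(String username, String password) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "uid=" + username + ",ou=people,dc=example,dc=com");
        env.put(Context.SECURITY_CREDENTIALS, password);
        try {
            // A successful bind means the directory accepted the credentials.
            new InitialDirContext(env).close();
            return true;
        } catch (AuthenticationException badCredentials) {
            return false;
        }
    }

    public static void main(String[] args) throws NamingException {
        System.out.println(authenticate("alice", "secret"));
    }
}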

Having said all that, I believe DataStax are in fact using the Apache LDAP API, so
clearly it is robust enough…
I'd love to start this back up to be honest, but the business has prioritized some
very different things.
However, this has been a good reminder that I should finish what I started, so 
thanks for that! ;)

Feel free to reach out if you have any questions for me!

Kindest regards,
Daniel

On 4 Mar 2017, 07:18 +1100, Sam Tunnicliffe wrote:
> This is something that has been discussed before and there's a JIRA open for 
> it already; it looks like progress has stalled, but you might get some 
> pointers from the linked WIP branch.
>
> https://issues.apache.org/jira/browse/CASSANDRA-12294
>
>
> > On Fri, Mar 3, 2017 at 7:45 PM, Harika Vangapelli -T (hvangape - AKRAYA INC 
> > at Cisco)  wrote:
> > > I am trying to configure Cassandra with LDAP, and I am trying to write 
> > > code to extend the functionality of 
> > > org.apache.cassandra.auth.PasswordAuthenticator and override the authenticate 
> > > method, but as the PlainTextSaslAuthenticator inner class has private scope, 
> > > I am not able to override the method.
> > >
> > > Please let us know what is the best way to implement LDAP security with 
> > > Apache Cassandra, or a workaround for this.
> > >
> > > Thanks,
> > > Harika
> > >
> > >
> > > Harika Vangapelli
> > > Engineer - IT
> > > hvang...@cisco.com
> > > Tel:
> > > Cisco Systems, Inc.
> > >
> > >
> > >
> > > United States
> > > cisco.com
> > >
> > >
>


Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Thakrar, Jayesh
I had been fighting a similar battle, but am now over the hump for the most part.

Get info on the server config (e.g. memory, cpu, free memory (free -g), etc)
Run "nodetool info" on the nodes to get heap and off-heap sizes
Run "nodetool tablestats" or "nodetool tablestats ." on the 
key large tables
Essentially the purpose is to see if you really had a true OOM or was your 
machine running out of memory.

Cassandra can use offheap memory very well - so "nodetool info" will give you 
both heap and offheap.

Also, what is the compaction strategy of your tables?

Personally, I have found STCS to be awful at large scale - when you have 
sstables that are 100+ GB in size.
See 
https://issues.apache.org/jira/browse/CASSANDRA-10821?focusedCommentId=15389451&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15389451

LCS seems better and should be the default (my opinion) unless you want DTCS

A good description of all three compactions is here - 
http://docs.scylladb.com/kb/compaction/



From: Joaquin Casares 
Date: Friday, March 3, 2017 at 11:34 AM
To: 
Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time

Hello Shravan,

Typically asynchronous requests are recommended over batch statements since 
batch statements will cause more work on the coordinator node while individual 
requests, when using a TokenAwarePolicy, will hit a specific coordinator, 
perform a local disk seek, and return the requested information.

The only time batch statements are ideal is when writing to the same partition key,
even if that's across multiple tables, as long as they use the same hashing algorithm
(like murmur3).

Could you provide a bit of insight into what the batch statement was trying to 
accomplish and how many child statements were bundled up within that batch?

Cheers,

Joaquin

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch <chall...@outlook.com> wrote:
Hello,

More than 30 Cassandra servers in the primary DC went down with the OOM exception
below. What puzzles me is the scale at which it happened (all within the same minute).
I will share some more details below.
System Log: http://pastebin.com/iPeYrWVR
GC Log: http://pastebin.com/CzNNGs0r

During the OOM I saw a lot of WARNings like the one below (these had been there for
quite some time, maybe weeks):
WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 - 
Batch of prepared statements for [keyspace.table] is of size 225455, exceeding 
specified threshold of 65536 by 159919.

Environment:
We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC (more C*
nodes, on SSD, where the apps run) and a secondary DC (geographically remote, more
like a DR to the primary) on SAS drives.
Cassandra config:

Java 1.8.0_65
Garbage Collector: G1GC
memtable_allocation_type: offheap_objects

Since this OOM I am seeing huge hint pile-ups on the majority of the nodes, and the
pending hints keep going up. I have increased HintedHandoff CoreThreads to 6 but that
did not help (I admit I tried this on only one node).
nodetool compactionstats -H
pending tasks: 3
compaction type    keyspace    table    completed    total       unit    progress
     Compaction    system      hints    28.5 GB      92.38 GB    bytes   30.85%


Appreciate your inputs here.
Thanks,
Shravan



Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Shravan C
Hi Joaquin,


We have inserts going into a tracking table. The tracking table is a simple table
[PRIMARY KEY (comid, status_timestamp)] with a few tracking attributes, sorted by
status_timestamp. From a volume perspective it is not a whole lot.
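
For reference, here is a hedged sketch of what such a table's DDL might look like,
issued through the Java driver; only the primary key comes from the description above,
while the remaining columns, the keyspace name, and the DESC clustering order are
assumptions.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Hedged sketch: only the primary key matches the description above; the other
// columns, the keyspace name, and the DESC clustering order are assumptions.
public class CreateTrackingTable {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            session.execute(
                    "CREATE TABLE IF NOT EXISTS ks.tracking ("
                    + " comid text,"
                    + " status_timestamp timestamp,"
                    + " status text,"
                    + " PRIMARY KEY (comid, status_timestamp)"
                    // Rows within a partition stay sorted by status_timestamp.
                    + ") WITH CLUSTERING ORDER BY (status_timestamp DESC)");
        }
    }
}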


Thanks,
Shravan

From: Joaquin Casares 
Sent: Friday, March 3, 2017 11:34:58 AM
To: user@cassandra.apache.org
Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time

Hello Shravan,

Typically asynchronous requests are recommended over batch statements since 
batch statements will cause more work on the coordinator node while individual 
requests, when using a TokenAwarePolicy, will hit a specific coordinator, 
perform a local disk seek, and return the requested information.

The only time batch statements are ideal is when writing to the same partition key,
even if that's across multiple tables, as long as they use the same hashing algorithm
(like murmur3).

Could you provide a bit of insight into what the batch statement was trying to 
accomplish and how many child statements were bundled up within that batch?

Cheers,

Joaquin

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch <chall...@outlook.com> wrote:
Hello,

More than 30 Cassandra servers in the primary DC went down with the OOM exception
below. What puzzles me is the scale at which it happened (all within the same minute).
I will share some more details below.

System Log: http://pastebin.com/iPeYrWVR
GC Log: http://pastebin.com/CzNNGs0r

During the OOM I saw a lot of WARNings like the one below (these had been there for
quite some time, maybe weeks):
WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 - 
Batch of prepared statements for [keyspace.table] is of size 225455, exceeding 
specified threshold of 65536 by 159919.

Environment:
We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC (more C*
nodes, on SSD, where the apps run) and a secondary DC (geographically remote, more
like a DR to the primary) on SAS drives.
Cassandra config:

Java 1.8.0_65
Garbage Collector: G1GC
memtable_allocation_type: offheap_objects

Since this OOM I am seeing huge hint pile-ups on the majority of the nodes, and the
pending hints keep going up. I have increased HintedHandoff CoreThreads to 6 but that
did not help (I admit I tried this on only one node).

nodetool compactionstats -H
pending tasks: 3
compaction type    keyspace    table    completed    total       unit    progress
     Compaction    system      hints    28.5 GB      92.38 GB    bytes   30.85%


Appreciate your inputs here.

Thanks,
Shravan



Secondary indexes don't return expected results in 2.2.3

2017-03-03 Thread Enrique Bautista
Hello everyone,

I run a 2.2.3 cluster and I've found that sometimes queries by secondary
index don't return the expected results, even though the data is in the
table. I'm sure the data is in there because queries by partition key or
without a WHERE clause do return the expected data.

The only issues I could find that sound similar to what I'm experiencing
are these:
https://issues.apache.org/jira/browse/CASSANDRA-6782
https://issues.apache.org/jira/browse/CASSANDRA-8206

In my case, there are updates with TTL involved too. However, the changelog
shows that those issues are supposedly fixed in 2.2.3, so it could be a red
herring.

Does anybody know of other related issues around this that might have been
fixed in later releases? It definitely sounds like a bug, but I haven't
been able to find anything else on the matter.
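
For anyone trying to reproduce this, here is a hedged sketch of the kind of
cross-check described above; the table, column, and index names are hypothetical,
not my actual schema.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Hedged sketch: "items", "id", and the index on "category" are hypothetical.
public class IndexConsistencyCheck {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("ks")) {

            // Ground truth: read the row directly by partition key.
            Row byKey = session.execute(
                    "SELECT id, category FROM items WHERE id = ?", "item-42").one();
            System.out.println("by partition key:    " + byKey);

            // Same row through the secondary index on "category". Getting null
            // here while the direct read succeeds reproduces the symptom above.
            Row byIndex = session.execute(
                    "SELECT id, category FROM items WHERE category = ?",
                    byKey.getString("category")).one();
            System.out.println("via secondary index: " + byIndex);
        }
    }
}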

Thanks in advance.

Kind regards,
-- 
Enrique Bautista
Software Engineer
OpenJaw Technologies



Re: Question on configuring Apache Cassandra with LDAP

2017-03-03 Thread Sam Tunnicliffe
This is something that has been discussed before and there's a JIRA open
for it already; it looks like progress has stalled, but you might get some
pointers from the linked WIP branch.

https://issues.apache.org/jira/browse/CASSANDRA-12294


On Fri, Mar 3, 2017 at 7:45 PM, Harika Vangapelli -T (hvangape - AKRAYA INC
at Cisco)  wrote:

> I am trying to configure Cassandra with LDAP, and I am trying to write
> code to extend the functionality of org.apache.cassandra.auth.
> PasswordAuthenticator and override the authenticate method, but as the
> PlainTextSaslAuthenticator inner class has private scope, I am not able to
> override the method.
>
>
>
> Please let us know what is the best way to implement LDAP security with
> Apache Cassandra, or a workaround for this.
>
>
>
> Thanks,
>
> Harika
>
>
>
>
>
>
> *Harika Vangapelli*
>
> Engineer - IT
>
> hvang...@cisco.com
>
> Tel:
>
> *Cisco Systems, Inc.*
>
>
>
>
> United States
> cisco.com
>
>
>
>
>
>


Question on configuring Apache Cassandra with LDAP

2017-03-03 Thread Harika Vangapelli -T (hvangape - AKRAYA INC at Cisco)
I am trying to configure Cassandra with LDAP, and I am trying to write code
to extend the functionality of org.apache.cassandra.auth.PasswordAuthenticator
and override the authenticate method, but as the PlainTextSaslAuthenticator
inner class has private scope, I am not able to override the method.

Please let us know what is the best way to implement LDAP security with Apache
Cassandra, or a workaround for this.

Thanks,
Harika




Harika Vangapelli
Engineer - IT
hvang...@cisco.com
Tel:

Cisco Systems, Inc.



United States
cisco.com






Re: question of keyspace that just disappeared

2017-03-03 Thread George Webster
I think it does on drop keyspace. We had a recent enough snapshot so it
wasn't a big deal to recover. However, we didn't have a snapshot for when
the keyspace disappeared.

@Romain: I believe you are correct about reliability. We just had a repair
--full fail and CPU lock-up on one of the nodes at 100%. This occurred on a
fairly new keyspace that only has writes. We are also now receiving a very
high percentage of read timeouts. ... might be time to rebuild the cluster.



On Fri, Mar 3, 2017 at 2:34 PM, Edward Capriolo 
wrote:

>
> On Fri, Mar 3, 2017 at 7:56 AM, Romain Hardouin 
> wrote:
>
>> I suspect a lack of 3.x reliability. Cassandra could have given up with
>> dropped messages, but not with a "drop keyspace". I mean, I have already seen
>> some spark jobs with too many executors that produce a high load average on a
>> DC. I saw a C* node with a 1 min. load avg of 140 that can still have a P99
>> read latency at 40ms. But I never saw a disappearing keyspace. There are
>> old tickets regarding C* 1.x but as far as I remember it was due to a
>> create/drop/create keyspace.
>>
>>
>> On Friday, 3 March 2017 at 13:44, George Webster wrote:
>>
>>
>> Thank you for your reply and good to know about the debug statement. I
>> haven't
>>
>> We never dropped or re-created the keyspace before. We haven't even
>> performed writes to that keyspace in months. I also checked the permissions
>> of Apache, that user had read only access.
>>
>> Unfortunately, I reverted from a backend recently. I cannot say for sure
>> anymore if I saw something in system before the revert.
>>
>> Anyway, hopefully it was just a fluke. We have some crazy ML libraries
>> running on it; maybe Cassandra just gave up? Ohh well, Cassandra is a
>> champ and we haven't really had issues with it before.
>>
>> On Thu, Mar 2, 2017 at 6:51 PM, Romain Hardouin 
>> wrote:
>>
>> Did you inspect system tables to see if there is some traces of your
>> keyspace? Did you ever drop and re-create this keyspace before that?
>>
>> Lines in debug appear because fd interval is > 2 seconds (logs are in
>> nanoseconds). You can override intervals via the -Dcassandra.fd_initial_value_ms
>> and -Dcassandra.fd_max_interval_ms properties. Are you sure you didn't
>> have these lines in debug logs before? I used to see them a lot prior to
>> increasing intervals to 4 seconds.
>>
>> Best,
>>
>> Romain
>>
>> On Tuesday, 28 February 2017 at 18:25, George Webster wrote:
>>
>>
>> Hey Cassandra Users,
>>
>> We recently encountered an issue where a keyspace just disappeared. I was
>> curious if anyone has had this occur before and can provide some insight.
>>
>> We are using cassandra 3.10. 2 DCs  3 nodes each.
>> The data was still located in the storage folder but is not located
>> inside Cassandra
>>
>> I searched the logs for any hints of error or commands being executed
>> that could have caused a loss of a keyspace. Unfortunately I found nothing.
>> In the logs the only unusual issue i saw was a series of read timeouts that
>> occurred right around when the keyspace went away. Since then I see
>> numerous entries in debug log as the following:
>>
>> DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 -
>> Ignoring interval time of 2155674599 for /x.x.x..12
>> DEBUG [GossipStage:1] 2017-02-28 18:14:16,580 FailureDetector.java:457 -
>> Ignoring interval time of 2945213745 for /x.x.x.81
>> DEBUG [GossipStage:1] 2017-02-28 18:14:19,590 FailureDetector.java:457 -
>> Ignoring interval time of 2006530862 for /x.x.x..69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:27,434 FailureDetector.java:457 -
>> Ignoring interval time of 3441841231 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:29,588 FailureDetector.java:457 -
>> Ignoring interval time of 2153964846 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:33,582 FailureDetector.java:457 -
>> Ignoring interval time of 2588593281 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:37,588 FailureDetector.java:457 -
>> Ignoring interval time of 2005305693 for /x.x.x.69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:38,592 FailureDetector.java:457 -
>> Ignoring interval time of 2009244850 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:43,584 FailureDetector.java:457 -
>> Ignoring interval time of 2149192677 for /x.x.x.69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:45,605 FailureDetector.java:457 -
>> Ignoring interval time of 2021180918 for /x.x.x.85
>> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
>> Ignoring interval time of 2436026101 for /x.x.x.81
>> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
>> Ignoring interval time of 2436187894 for /x.x.x.82
>>
>> During the time of the disappearing keyspace we had two concurrent
>> activities:
>> 1) Running a Spark job (via HDP 2.5.3 in Yarn) that was performing a
>> countByKey. It was using the keyspace that disappeared. The operation
>> crashed.
>> 2) We created a new keyspace to test out a schema. Only "fanc

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Joaquin Casares
Hello Shravan,

Typically asynchronous requests are recommended over batch statements since
batch statements will cause more work on the coordinator node while
individual requests, when using a TokenAwarePolicy, will hit a specific
coordinator, perform a local disk seek, and return the requested
information.

The only time batch statements are ideal is when writing to the same
partition key, even if that's across multiple tables, as long as they use the
same hashing algorithm (like murmur3).

Could you provide a bit of insight into what the batch statement was trying
to accomplish and how many child statements were bundled up within that
batch?

Cheers,

Joaquin

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch  wrote:

> Hello,
>
> More than 30 Cassandra servers in the primary DC went down with the OOM
> exception below. What puzzles me is the scale at which it happened (all within
> the same minute). I will share some more details below.
>
> System Log: http://pastebin.com/iPeYrWVR
> GC Log: http://pastebin.com/CzNNGs0r
>
> During the OOM I saw a lot of WARNings like
> the one below (these had been there for quite some time, maybe weeks):
> *WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209
> BatchStatement.java:252 - Batch of prepared statements for [keyspace.table]
> is of size 225455, exceeding specified threshold of 65536 by 159919.*
>
> Environment:
> We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC (more
> C* nodes, on SSD, where the apps run) and a secondary DC (geographically remote,
> more like a DR to the primary) on SAS drives.
> Cassandra config:
>
> Java 1.8.0_65
> Garbage Collector: G1GC
> memtable_allocation_type: offheap_objects
>
> Since this OOM I am seeing huge hint pile-ups on the majority of the nodes and
> the pending hints keep going up. I have increased HintedHandoff
> CoreThreads to 6 but that did not help (I admit I tried this on only one
> node).
>
> nodetool compactionstats -H
> pending tasks: 3
> compaction type    keyspace    table    completed    total       unit    progress
>      Compaction    system      hints    28.5 GB      92.38 GB    bytes   30.85%
>
>
> Appreciate your inputs here.
>
> Thanks,
> Shravan
>


OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Shravan Ch
Hello,

More than 30 Cassandra servers in the primary DC went down with the OOM exception
below. What puzzles me is the scale at which it happened (all within the same minute).
I will share some more details below.

System Log: http://pastebin.com/iPeYrWVR
GC Log: http://pastebin.com/CzNNGs0r

During the OOM I saw a lot of WARNings like the one below (these had been there for
quite some time, maybe weeks):
WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 - 
Batch of prepared statements for [keyspace.table] is of size 225455, exceeding 
specified threshold of 65536 by 159919.

Environment:
We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC (more C*
nodes, on SSD, where the apps run) and a secondary DC (geographically remote, more
like a DR to the primary) on SAS drives.
Cassandra config:

Java 1.8.0_65
Garbage Collector: G1GC
memtable_allocation_type: offheap_objects

Since this OOM I am seeing huge hint pile-ups on the majority of the nodes, and the
pending hints keep going up. I have increased HintedHandoff CoreThreads to 6 but that
did not help (I admit I tried this on only one node).

nodetool compactionstats -H
pending tasks: 3
compaction type    keyspace    table    completed    total       unit    progress
     Compaction    system      hints    28.5 GB      92.38 GB    bytes   30.85%


Appreciate your inputs here.

Thanks,
Shravan


New command line client for cassandra-reaper

2017-03-03 Thread Vincent Rischmann
Hi,



I'm using cassandra-reaper
(https://github.com/thelastpickle/cassandra-reaper) to manage repairs of
my Cassandra clusters, probably like a bunch of other people.


When I started using it (it was still the version from the spotify
repository) the UI didn't work well, and the Python cli client was slow
to use because we had to use Docker to run it.


It was frustrating for me so over a couple of weeks I wrote
https://github.com/vrischmann/happyreaper which is another CLI client.


It doesn't do much more than spreaper (there are some client-side
filters that spreaper doesn't have, I think); the main benefit is that
it's a self-contained binary that doesn't need a Python environment.


I'm announcing it here in case it's of interest to anyone. If anyone has
feedback feel free to share.


Vincent.

 


Re: question of keyspace that just disappeared

2017-03-03 Thread Edward Capriolo
On Fri, Mar 3, 2017 at 7:56 AM, Romain Hardouin  wrote:

> I suspect a lack of 3.x reliability. Cassandra could have given up with
> dropped messages, but not with a "drop keyspace". I mean, I have already seen
> some spark jobs with too many executors that produce a high load average on a
> DC. I saw a C* node with a 1 min. load avg of 140 that can still have a P99
> read latency at 40ms. But I never saw a disappearing keyspace. There are
> old tickets regarding C* 1.x but as far as I remember it was due to a
> create/drop/create keyspace.
>
>
> On Friday, 3 March 2017 at 13:44, George Webster wrote:
>
>
> Thank you for your reply and good to know about the debug statement. I
> haven't
>
> We never dropped or re-created the keyspace before. We haven't even
> performed writes to that keyspace in months. I also checked the permissions
> of Apache, that user had read only access.
>
> Unfortunately, I reverted from a backend recently. I cannot say for sure
> anymore if I saw something in system before the revert.
>
> Anyway, hopefully it was just a fluke. We have some crazy ML libraries
> running on it; maybe Cassandra just gave up? Ohh well, Cassandra is a
> champ and we haven't really had issues with it before.
>
> On Thu, Mar 2, 2017 at 6:51 PM, Romain Hardouin 
> wrote:
>
> Did you inspect system tables to see if there is some traces of your
> keyspace? Did you ever drop and re-create this keyspace before that?
>
> Lines in debug appear because fd interval is > 2 seconds (logs are in
> nanoseconds). You can override intervals via the -Dcassandra.fd_initial_value_ms
> and -Dcassandra.fd_max_interval_ms properties. Are you sure you didn't
> have these lines in debug logs before? I used to see them a lot prior to
> increasing intervals to 4 seconds.
>
> Best,
>
> Romain
>
> On Tuesday, 28 February 2017 at 18:25, George Webster wrote:
>
>
> Hey Cassandra Users,
>
> We recently encountered an issue where a keyspace just disappeared. I was
> curious if anyone has had this occur before and can provide some insight.
>
> We are using cassandra 3.10. 2 DCs  3 nodes each.
> The data was still located in the storage folder but is not located inside
> Cassandra
>
> I searched the logs for any hints of error or commands being executed that
> could have caused a loss of a keyspace. Unfortunately I found nothing. In
> the logs the only unusual issue i saw was a series of read timeouts that
> occurred right around when the keyspace went away. Since then I see
> numerous entries in debug log as the following:
>
> DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 -
> Ignoring interval time of 2155674599 for /x.x.x..12
> DEBUG [GossipStage:1] 2017-02-28 18:14:16,580 FailureDetector.java:457 -
> Ignoring interval time of 2945213745 for /x.x.x.81
> DEBUG [GossipStage:1] 2017-02-28 18:14:19,590 FailureDetector.java:457 -
> Ignoring interval time of 2006530862 for /x.x.x..69
> DEBUG [GossipStage:1] 2017-02-28 18:14:27,434 FailureDetector.java:457 -
> Ignoring interval time of 3441841231 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:29,588 FailureDetector.java:457 -
> Ignoring interval time of 2153964846 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:33,582 FailureDetector.java:457 -
> Ignoring interval time of 2588593281 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:37,588 FailureDetector.java:457 -
> Ignoring interval time of 2005305693 for /x.x.x.69
> DEBUG [GossipStage:1] 2017-02-28 18:14:38,592 FailureDetector.java:457 -
> Ignoring interval time of 2009244850 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:43,584 FailureDetector.java:457 -
> Ignoring interval time of 2149192677 for /x.x.x.69
> DEBUG [GossipStage:1] 2017-02-28 18:14:45,605 FailureDetector.java:457 -
> Ignoring interval time of 2021180918 for /x.x.x.85
> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
> Ignoring interval time of 2436026101 for /x.x.x.81
> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
> Ignoring interval time of 2436187894 for /x.x.x.82
>
> During the time of the disappearing keyspace we had two concurrent
> activities:
> 1) Running a Spark job (via HDP 2.5.3 in Yarn) that was performing a
> countByKey. It was using the keyspace that disappeared. The operation
> crashed.
> 2) We created a new keyspace to test out a schema. The only "fancy" thing in
> that keyspace is a few materialized view tables. Data was being loaded into
> that keyspace during the crash. The load process was extracting information
> and then just writing to Cassandra.
>
> Any ideas? Anyone seen this before?
>
> Thanks,
> George
>
>
>
>
>
>
Cassandra takes snapshots for certain events. Does this extend to drop
keyspace commands? Maybe it should.


Re: question of keyspace that just disappeared

2017-03-03 Thread Romain Hardouin
I suspect a lack of 3.x reliability. Cassandra could have given up with dropped
messages, but not with a "drop keyspace". I mean, I have already seen some spark jobs
with too many executors that produce a high load average on a DC. I saw a C*
node with a 1 min. load avg of 140 that can still have a P99 read latency at
40ms. But I never saw a disappearing keyspace. There are old tickets regarding
C* 1.x but as far as I remember it was due to a create/drop/create keyspace.

On Friday, 3 March 2017 at 13:44, George Webster wrote:

Thank you for your reply and good to know about the debug statement. I haven't

We never dropped or re-created the keyspace before. We haven't even performed
writes to that keyspace in months. I also checked the permissions of Apache;
that user had read-only access.

Unfortunately, I reverted from a backend recently. I cannot say for sure
anymore if I saw something in system before the revert.

Anyway, hopefully it was just a fluke. We have some crazy ML libraries running
on it; maybe Cassandra just gave up? Ohh well, Cassandra is a champ and we
haven't really had issues with it before.
On Thu, Mar 2, 2017 at 6:51 PM, Romain Hardouin  wrote:

Did you inspect system tables to see if there are some traces of your keyspace?
Did you ever drop and re-create this keyspace before that?
Lines in debug appear because fd interval is > 2 seconds (logs are in
nanoseconds). You can override intervals via the -Dcassandra.fd_initial_value_ms
and -Dcassandra.fd_max_interval_ms properties. Are you sure you didn't have
these lines in debug logs before? I used to see them a lot prior to increasing
intervals to 4 seconds.
Best,
Romain

On Tuesday, 28 February 2017 at 18:25, George Webster wrote:

Hey Cassandra Users,

We recently encountered an issue where a keyspace just disappeared. I was
curious if anyone has had this occur before and can provide some insight.

We are using cassandra 3.10, 2 DCs, 3 nodes each. The data was still located in
the storage folder but is not located inside Cassandra.

I searched the logs for any hints of error or commands being executed that
could have caused a loss of a keyspace. Unfortunately I found nothing. In the
logs the only unusual issue I saw was a series of read timeouts that occurred
right around when the keyspace went away. Since then I see numerous entries in
the debug log such as the following:
DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 - Ignoring interval time of 2155674599 for /x.x.x..12
DEBUG [GossipStage:1] 2017-02-28 18:14:16,580 FailureDetector.java:457 - Ignoring interval time of 2945213745 for /x.x.x.81
DEBUG [GossipStage:1] 2017-02-28 18:14:19,590 FailureDetector.java:457 - Ignoring interval time of 2006530862 for /x.x.x..69
DEBUG [GossipStage:1] 2017-02-28 18:14:27,434 FailureDetector.java:457 - Ignoring interval time of 3441841231 for /x.x.x.82
DEBUG [GossipStage:1] 2017-02-28 18:14:29,588 FailureDetector.java:457 - Ignoring interval time of 2153964846 for /x.x.x.82
DEBUG [GossipStage:1] 2017-02-28 18:14:33,582 FailureDetector.java:457 - Ignoring interval time of 2588593281 for /x.x.x.82
DEBUG [GossipStage:1] 2017-02-28 18:14:37,588 FailureDetector.java:457 - Ignoring interval time of 2005305693 for /x.x.x.69
DEBUG [GossipStage:1] 2017-02-28 18:14:38,592 FailureDetector.java:457 - Ignoring interval time of 2009244850 for /x.x.x.82
DEBUG [GossipStage:1] 2017-02-28 18:14:43,584 FailureDetector.java:457 - Ignoring interval time of 2149192677 for /x.x.x.69
DEBUG [GossipStage:1] 2017-02-28 18:14:45,605 FailureDetector.java:457 - Ignoring interval time of 2021180918 for /x.x.x.85
DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 - Ignoring interval time of 2436026101 for /x.x.x.81
DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 - Ignoring interval time of 2436187894 for /x.x.x.82
During the time of the disappearing keyspace we had two concurrent activities:
1) Running a Spark job (via HDP 2.5.3 in Yarn) that was performing a countByKey.
It was using the keyspace that disappeared. The operation crashed.
2) We created a new keyspace to test out a schema. The only "fancy" thing in that
keyspace is a few materialized view tables. Data was being loaded into that
keyspace during the crash. The load process was extracting information and then
just writing to Cassandra.
Any ideas? Anyone seen this before?

Thanks,
George

   



   

Re: question of keyspace that just disappeared

2017-03-03 Thread George Webster
Thank you for your reply and good to know about the debug statement. I
haven't

We never dropped or re-created the keyspace before. We haven't even
performed writes to that keyspace in months. I also checked the permissions
of Apache, that user had read only access.

Unfortunately, I reverted from a backend recently. I cannot say for sure
anymore if I saw something in system before the revert.

Anyway, hopefully it was just a fluke. We have some crazy ML libraries
running on it; maybe Cassandra just gave up? Ohh well, Cassandra is a
champ and we haven't really had issues with it before.

On Thu, Mar 2, 2017 at 6:51 PM, Romain Hardouin  wrote:

> Did you inspect system tables to see if there are some traces of your
> keyspace? Did you ever drop and re-create this keyspace before that?
>
> Lines in debug appear because fd interval is > 2 seconds (logs are in
> nanoseconds). You can override intervals via the -Dcassandra.fd_initial_value_ms
> and -Dcassandra.fd_max_interval_ms properties. Are you sure you didn't have
> these lines in debug logs before? I used to see them a lot prior to
> increasing intervals to 4 seconds.
>
> Best,
>
> Romain
>
> On Tuesday, 28 February 2017 at 18:25, George Webster wrote:
>
>
> Hey Cassandra Users,
>
> We recently encountered an issue where a keyspace just disappeared. I was
> curious if anyone has had this occur before and can provide some insight.
>
> We are using cassandra 3.10. 2 DCs  3 nodes each.
> The data was still located in the storage folder but is not located inside
> Cassandra
>
> I searched the logs for any hints of error or commands being executed that
> could have caused a loss of a keyspace. Unfortunately I found nothing. In
> the logs the only unusual issue i saw was a series of read timeouts that
> occurred right around when the keyspace went away. Since then I see
> numerous entries in debug log as the following:
>
> DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 -
> Ignoring interval time of 2155674599 for /x.x.x..12
> DEBUG [GossipStage:1] 2017-02-28 18:14:16,580 FailureDetector.java:457 -
> Ignoring interval time of 2945213745 for /x.x.x.81
> DEBUG [GossipStage:1] 2017-02-28 18:14:19,590 FailureDetector.java:457 -
> Ignoring interval time of 2006530862 for /x.x.x..69
> DEBUG [GossipStage:1] 2017-02-28 18:14:27,434 FailureDetector.java:457 -
> Ignoring interval time of 3441841231 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:29,588 FailureDetector.java:457 -
> Ignoring interval time of 2153964846 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:33,582 FailureDetector.java:457 -
> Ignoring interval time of 2588593281 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:37,588 FailureDetector.java:457 -
> Ignoring interval time of 2005305693 for /x.x.x.69
> DEBUG [GossipStage:1] 2017-02-28 18:14:38,592 FailureDetector.java:457 -
> Ignoring interval time of 2009244850 for /x.x.x.82
> DEBUG [GossipStage:1] 2017-02-28 18:14:43,584 FailureDetector.java:457 -
> Ignoring interval time of 2149192677 for /x.x.x.69
> DEBUG [GossipStage:1] 2017-02-28 18:14:45,605 FailureDetector.java:457 -
> Ignoring interval time of 2021180918 for /x.x.x.85
> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
> Ignoring interval time of 2436026101 for /x.x.x.81
> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 -
> Ignoring interval time of 2436187894 for /x.x.x.82
>
> During the time of the disappearing keyspace we had two concurrent
> activities:
> 1) Running a Spark job (via HDP 2.5.3 in Yarn) that was performing a
> countByKey. It was using the keyspace that disappeared. The operation
> crashed.
> 2) We created a new keyspace to test out a schema. The only "fancy" thing in
> that keyspace is a few materialized view tables. Data was being loaded into
> that keyspace during the crash. The load process was extracting information
> and then just writing to Cassandra.
>
> Any ideas? Anyone seen this before?
>
> Thanks,
> George
>
>
>


Re: Attached profiled data but need help understanding it

2017-03-03 Thread Romain Hardouin
Also, I should have mentioned that it would be a good idea to spawn your three
benchmark instances in the same AZ, then try with one instance in each AZ to see
how network latency affects your LWT rate. The lowest latency is achievable with
three instances in the same placement group, of course, but that's kinda dangerous
for production.
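
To make the LWT cost concrete, here is a hedged sketch of a single conditional insert
with the Java driver; "ks.locks" is a placeholder table (e.g. CREATE TABLE ks.locks
(name text PRIMARY KEY, owner text)).

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

// Hedged sketch: times one conditional insert; "ks.locks" is a placeholder table.
public class LwtLatencyProbe {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            PreparedStatement lwt = session.prepare(
                    "INSERT INTO ks.locks (name, owner) VALUES (?, ?) IF NOT EXISTS");
            lwt.setSerialConsistencyLevel(ConsistencyLevel.SERIAL);

            long start = System.nanoTime();
            ResultSet rs = session.execute(lwt.bind("job-1", "worker-a"));
            long micros = (System.nanoTime() - start) / 1000;

            // wasApplied() is false when another proposer already owns the row.
            // The latency includes the extra Paxos round trips between replicas,
            // which is why instance placement matters so much for LWT throughput.
            System.out.printf("applied=%s latency=%d us%n", rs.wasApplied(), micros);
        }
    }
}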