[jira] [Commented] (CASSANDRA-12572) Indexed values are limited to 64Kb

2016-09-01 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455759#comment-15455759
 ] 

Jose Martinez Poblete commented on CASSANDRA-12572:
---

[~slebresne] This is no longer an issue on C* 3.0.3.
Thanks for your help!

> Indexed values are limited to 64Kb
> --
>
> Key: CASSANDRA-12572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12572
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.1.14.1346 
> Cassandra 3.0.8.1284
>Reporter: Jose Martinez Poblete
>Priority: Minor
>
> Currently, a query is bound to a 64Kb text limit.
> In some edge scenarios, the query text could go over that limit.
> Can we make that a configurable parameter?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12572) CQL query size bound to 64Kb

2016-08-30 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-12572:
-

 Summary: CQL query size bound to 64Kb
 Key: CASSANDRA-12572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12572
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Cassandra 2.1.14.1346 
Cassandra 3.0.8.1284
Reporter: Jose Martinez Poblete


Currently, a query is bound to a 64Kb text limit.
In some edge scenarios, the query text could go over that limit.
Can we make that a configurable parameter?
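As an illustration of where a hard 64Kb ceiling typically comes from (this framing is an assumption for illustration, not stated in the ticket): a length field serialized as an unsigned 16-bit short cannot describe more than 65535 bytes, so making the limit configurable would also mean changing the serialization. A minimal Python sketch of such a length-prefixed encoding:

```python
import struct

def encode_with_short_length(payload: bytes) -> bytes:
    """Prefix a payload with an unsigned 16-bit big-endian length.
    A 2-byte length field is the classic source of 64Kb (65535-byte)
    limits: it simply cannot represent a larger size."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload of %d bytes exceeds the 65535-byte "
                         "maximum of an unsigned short length" % len(payload))
    return struct.pack(">H", len(payload)) + payload

encode_with_short_length(b"x" * 65535)    # largest payload that fits
# encode_with_short_length(b"x" * 65536)  # would raise ValueError
```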





[jira] [Created] (CASSANDRA-12079) CQLSH to retrieve column names from data file header

2016-06-23 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-12079:
-

 Summary: CQLSH to retrieve column names from data file header
 Key: CASSANDRA-12079
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12079
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Cassandra 2.1.14.1346
Reporter: Jose Martinez Poblete


Suppose we have a table with 3 columns.
Then the data is copied to a delimited file with a *HEADER*:

{noformat}
cqlsh> create KEYSPACE my_keyspace WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': 1 } AND durable_writes = 'true';
cqlsh> use my_keyspace ;
cqlsh:my_keyspace> CREATE TABLE my_table ( col1 int PRIMARY KEY, col2 text, 
col3 text );
cqlsh:my_keyspace> insert INTO my_table (col1, col2) VALUES ( 1, '1st row') ;
cqlsh:my_keyspace> insert INTO my_table (col1, col2) VALUES ( 2, '2nd row') ;
cqlsh:my_keyspace> insert INTO my_table (col1, col2) VALUES ( 3, '3rd row') ;
cqlsh:my_keyspace> COPY my_keyspace.my_table ( col1, col2 ) TO 'my_table.dat' 
WITH DELIMITER = '|' AND HEADER = true ;
Reading options from the command line: {'header': 'true', 'delimiter': '|'}
Using 3 child processes

Starting copy of my_keyspace.my_table with columns ['col1', 'col2'].
Processed: 3 rows; Rate:  10 rows/s; Avg. rate:   4 rows/s
3 rows exported to 1 files in 0.861 seconds.
{noformat}

This will create a file with these contents:

{noformat}
col1|col2
3|3rd row
2|2nd row
1|1st row
{noformat}

Then we create another table with the same DDL:

{noformat}
cqlsh:my_keyspace> CREATE TABLE my_table2 ( col1 int PRIMARY KEY, col2 text, 
col3 text );
{noformat}

A restore from the recently created delimited file *with header* WILL FAIL, 
because no columns were specified on the COPY command: cqlsh expects all of 
the table's columns to be present in the delimited file, even though we have a 
header row and the HEADER option was specified.

{noformat}
cqlsh:my_keyspace> COPY my_table2 FROM 'my_table.dat' WITH DELIMITER = '|' AND 
HEADER = true ;
Reading options from the command line: {'header': 'true', 'delimiter': '|'}
Using 3 child processes

Starting copy of my_keyspace.my_table2 with columns ['col1', 'col2', 'col3'].
Failed to import 3 rows: ParseError - Invalid row length 2 should be 3,  given 
up without retries
Failed to process 3 rows; failed rows written to 
import_my_keyspace_my_table2.err
Processed: 3 rows; Rate:   5 rows/s; Avg. rate:   7 rows/s
3 rows imported from 1 files in 0.442 seconds (0 skipped).
{noformat}

Provided that *HEADER = true*, it would be very handy if CQLSH looked into the 
*header row* and retrieved the column names, so they do not have to be entered 
manually on the COPY command - especially when there is a significant number of 
columns.

{noformat}
cqlsh:my_keyspace> COPY my_table2 (col1, col2) FROM 'my_table.dat' WITH 
DELIMITER = '|' AND HEADER = true ;
Reading options from the command line: {'header': 'true', 'delimiter': '|'}
Using 3 child processes

Starting copy of my_keyspace.my_table2 with columns ['col1', 'col2'].
Processed: 3 rows; Rate:   3 rows/s; Avg. rate:   4 rows/s
3 rows imported from 1 files in 0.708 seconds (0 skipped).
cqlsh:my_keyspace> select * from my_table2;

 col1 | col2| col3
--+-+--
1 | 1st row | null
2 | 2nd row | null
3 | 3rd row | null

(3 rows)
cqlsh:my_keyspace> 
{noformat}
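The manual workaround above (typing the column list by hand) can be scripted; a hypothetical helper like the following - nothing here is part of cqlsh, and the file name and delimiter are just taken from the example - would read the header row and build the column list for the COPY command:

```python
import csv

def copy_columns_from_header(path: str, delimiter: str = "|") -> str:
    """Read the header row of a delimited export and return the
    parenthesized column list for a cqlsh COPY ... FROM command."""
    with open(path, newline="") as f:
        header = next(csv.reader(f, delimiter=delimiter))
    return "(" + ", ".join(col.strip() for col in header) + ")"

# With the my_table.dat from the example (header line "col1|col2"),
# this would produce "(col1, col2)" for use in:
#   COPY my_table2 (col1, col2) FROM 'my_table.dat' ...
```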






[jira] [Created] (CASSANDRA-11941) Add text option for cassandra-stress

2016-06-01 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-11941:
-

 Summary: Add text option for cassandra-stress
 Key: CASSANDRA-11941
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11941
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: C* 2.1.13
Reporter: Jose Martinez Poblete


Currently, we are able to specify a fixed length for a text field in a YAML 
file as follows:

{noformat}
  - name: text_column
size: fixed(100) 
{noformat}

That would fill our column with random characters:

{noformat}
cqlsh:stresscql> select text_column from stresscql.msgindex limit 2;

 text_column
---
  
D[\x04C[HA([o\rVae$-\x02wfC$\x00X)U\x11\x15o,zEG\\tsw)\x0b-}c\4\x15D\x0f\x1e{h[y7\x11(DIL\x12*\x01\x1fU:bRN:_T\x10\x7feN;NS\x19j?>K.q\x01dcB\x00t-nj!3;zsM1y
 
ITb\x1bC4\x14>\x18R8\x14>M\x027|Oh\x007?\n\x164N'|ox9mBFM3\x16hq\x06}K.\x1aZM4MG$\r7X"\x0c\t\x1fX~Z3\x04~Q\x17$\x0eB4[xUc.X\x0e\x1fQ?:\x7fa\x0bl\x0b\x11Ug\x12TKP-;gv#)\F

(2 rows)

{noformat}

For some test cases, that would be OK.

But for other cases, we would like the option to take words from a source like 
/usr/share/dict/words to make up the content, up to - or slightly less than - 
the number of bytes specified for the column, for testing data models that 
require actual words separated by spaces.
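A rough sketch of the proposed generator - the word-list path and the exact packing rule are illustrative assumptions, not part of cassandra-stress:

```python
import random

def words_up_to(size_bytes: int, dictionary: list) -> str:
    """Join random dictionary words with single spaces, stopping at or
    just under size_bytes (measured in UTF-8 bytes), as proposed for
    generating realistic text column values."""
    out = []
    used = 0
    while True:
        w = random.choice(dictionary)
        extra = len(w.encode()) + (1 if out else 0)  # +1 for the space
        if used + extra > size_bytes:
            break  # next word would overshoot the byte budget
        out.append(w)
        used += extra
    return " ".join(out)

# e.g. with the system word list:
# with open("/usr/share/dict/words") as f:
#     dictionary = [w.strip() for w in f if w.strip()]
# value = words_up_to(100, dictionary)
```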





[jira] [Created] (CASSANDRA-10999) Implement mechanism in CQL to fetch a range of bytes from a blob object

2016-01-11 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-10999:
-

 Summary: Implement mechanism in CQL to fetch a range of bytes from 
a blob object
 Key: CASSANDRA-10999
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10999
 Project: Cassandra
  Issue Type: Improvement
  Components: CQL
 Environment: Cassandra 2.1.11
Reporter: Jose Martinez Poblete


We are using Cassandra as a backing store for IMAP.
IMAP has a byte-range fetch feature, which mail clients use for previews, 
especially of large objects.

For cases where we have large objects stored in the database, we would like to 
retrieve a byte subset rather than the whole object (which could be a blob, or 
binary encoded as text).

Could a feature like this be implemented?
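Without server-side support, the only option is client-side slicing, which still transfers the whole object over the wire - exactly the cost a byte-range fetch would avoid. A minimal sketch of that client-side emulation:

```python
def fetch_byte_range(blob: bytes, start: int, length: int) -> bytes:
    """Client-side emulation of an IMAP-style byte-range fetch.
    Note the whole blob has already been read from the database and
    crossed the network before this slice happens."""
    return blob[start:start + length]

# Preview only the first 4 KB of a ~1 MB object:
preview = fetch_byte_range(b"A" * 1_000_000, 0, 4096)
```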





[jira] [Created] (CASSANDRA-10797) Bootstrap new node fails with OOM when streaming nodes contains thousands of sstables

2015-12-01 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-10797:
-

 Summary: Bootstrap new node fails with OOM when streaming nodes 
contains thousands of sstables
 Key: CASSANDRA-10797
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10797
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
 Environment: Cassandra 2.1.8.621 w/G1GC
Reporter: Jose Martinez Poblete
 Attachments: 112415_system.log, Heapdump_OOM.zip

When adding a new node to an existing DC, it runs OOM after 25-45 minutes.
Upon heap dump review, it is found the sending nodes are streaming thousands 
of sstables, which in turn blows up the bootstrapping node's heap:

{noformat}
ERROR [RMI Scheduler(0)] 2015-11-24 10:10:44,585 JVMStabilityInspector.java:94 
- JVM state determined to be unstable.  Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
ERROR [STREAM-IN-/173.36.28.148] 2015-11-24 10:10:44,585 StreamSession.java:502 
- [Stream #0bb13f50-92cb-11e5-bc8d-f53b7528ffb4] Streaming error occurred
java.lang.IllegalStateException: Shutdown in progress
at 
java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82) 
~[na:1.8.0_65]
at java.lang.Runtime.removeShutdownHook(Runtime.java:239) ~[na:1.8.0_65]
at 
org.apache.cassandra.service.StorageService.removeShutdownHook(StorageService.java:747)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.utils.JVMStabilityInspector$Killer.killCurrentJVM(JVMStabilityInspector.java:95)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:64)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:66)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
 ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
ERROR [RMI TCP Connection(idle)] 2015-11-24 10:10:44,585 
JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting 
forcefully due to:
java.lang.OutOfMemoryError: Java heap space
ERROR [OptionalTasks:1] 2015-11-24 10:10:44,585 CassandraDaemon.java:223 - 
Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.IllegalStateException: Shutdown in progress
{noformat}

Attached is the Eclipse MAT report as a zipped web page







[jira] [Updated] (CASSANDRA-10797) Bootstrap new node fails with OOM when streaming nodes contains thousands of sstables

2015-12-01 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-10797:
--
Attachment: Screen Shot 2015-12-01 at 7.34.40 PM.png

> Bootstrap new node fails with OOM when streaming nodes contains thousands of 
> sstables
> -
>
> Key: CASSANDRA-10797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10797
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Cassandra 2.1.8.621 w/G1GC
>Reporter: Jose Martinez Poblete
> Attachments: 112415_system.log, Heapdump_OOM.zip, Screen Shot 
> 2015-12-01 at 7.34.40 PM.png
>
>
> When adding a new node to an existing DC, it runs OOM after 25-45 minutes.
> Upon heap dump review, it is found the sending nodes are streaming thousands 
> of sstables, which in turn blows up the bootstrapping node's heap:
> {noformat}
> ERROR [RMI Scheduler(0)] 2015-11-24 10:10:44,585 
> JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting 
> forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
> ERROR [STREAM-IN-/173.36.28.148] 2015-11-24 10:10:44,585 
> StreamSession.java:502 - [Stream #0bb13f50-92cb-11e5-bc8d-f53b7528ffb4] 
> Streaming error occurred
> java.lang.IllegalStateException: Shutdown in progress
> at 
> java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82) 
> ~[na:1.8.0_65]
> at java.lang.Runtime.removeShutdownHook(Runtime.java:239) 
> ~[na:1.8.0_65]
> at 
> org.apache.cassandra.service.StorageService.removeShutdownHook(StorageService.java:747)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector$Killer.killCurrentJVM(JVMStabilityInspector.java:95)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:64)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:66)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
>  ~[cassandra-all-2.1.8.621.jar:2.1.8.621]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
> ERROR [RMI TCP Connection(idle)] 2015-11-24 10:10:44,585 
> JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting 
> forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
> ERROR [OptionalTasks:1] 2015-11-24 10:10:44,585 CassandraDaemon.java:223 - 
> Exception in thread Thread[OptionalTasks:1,5,main]
> java.lang.IllegalStateException: Shutdown in progress
> {noformat}
> Attached is the Eclipse MAT report as a zipped web page





[jira] [Commented] (CASSANDRA-10401) json2sstable fails with NPE

2015-10-02 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941287#comment-14941287
 ] 

Jose Martinez Poblete commented on CASSANDRA-10401:
---

[~mambocab]
I tested as per your suggestions on Cassandra 2.1.9.791 and that seemed to work 
- THANK YOU!

{noformat}
-bash-4.1$ ls -la 
/mnt/ephemeral/cassandra/data/t20041/jira_10401-3572e1b069e5b05ce73685229767/t20041-jira_10401-ka-1-Data.db
ls: cannot access 
/mnt/ephemeral/cassandra/data/t20041/jira_10401-3572e1b069e5b05ce73685229767/t20041-jira_10401-ka-1-Data.db:
 No such file or directory
-bash-4.1$ json2sstable -K t20041 -c jira_10401 
/tmp/t20041-jira_10401-ka-1-Data.json 
/mnt/ephemeral/cassandra/data/t20041/jira_10401-3572e1b069e5b05ce73685229767/t20041-jira_10401-ka-1-Data.db
Importing 1 keys...
1 keys imported successfully.
-bash-4.1$ nodetool refresh t20041 jira_10401
-bash-4.1$ cqlsh -k t20041
Connected to DSE_4.8.0 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.9.791 | DSE 4.8.0 | CQL spec 3.2.0 | Native 
protocol v3]
Use HELP for help.
cqlsh:t20041> select * from jira_10401 ;

 col1 | col2 | col3 | col4
--+--+--+--
 This is col1 | This is col2 | This is col3 | This is col4

(1 rows)
cqlsh:t20041> exit;
-bash-4.1$ 
{noformat}

> json2sstable fails with NPE
> ---
>
> Key: CASSANDRA-10401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10401
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra 2.1.8.621
>Reporter: Jose Martinez Poblete
>
> We have the following table...
> {noformat}
> CREATE TABLE keyspace_name.table_name (
> col1 text,
> col2 text,
> col3 text,
> col4 text,
> PRIMARY KEY ((col1, col2), col3)
> ) WITH CLUSTERING ORDER BY (col3 ASC)
> {noformat}
> And the following JSON in a file created with the sstable2json tool:
> {noformat}
> [
> {"key": "This is col1:This is col2,
>  "cells": [["This is col3:","",1443217787319002],
>["This is col3:"col4","This is col4",1443217787319002]]}
> ]
> {noformat}
> Let's say we deleted that record from the DB and wanted to bring it back.
> If we try to create an sstable from this data in a json file named 
> test_file.json, we get an NPE:
> {noformat}
> -bash-4.1$ json2sstable -K elp -c table_name-3264cbe063c211e5bc34e746786b7b29 
> test_file.json  
> /var/lib/cassandra/data/keyspace_name/table_name-3264cbe063c211e5bc34e746786b7b29/keyspace_name-table_name-ka-1-Data.db
> Importing 1 keys...
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
>   at 
> org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:316)
>   at 
> org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:287)
>   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:514)
> ERROR: null
> -bash-4.1$
> {noformat}





[jira] [Commented] (CASSANDRA-10401) Improve json2sstable error reporting on nonexistent column

2015-10-02 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941304#comment-14941304
 ] 

Jose Martinez Poblete commented on CASSANDRA-10401:
---

Perhaps the command could come up with a better error message?
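One way a tool can fail with a useful message instead of an NPE is to validate the dump before importing - a hypothetical pre-check, not part of json2sstable. The unbalanced quotes in the pasted sstable2json output would then surface as a parse error with a line and column:

```python
import json

def check_sstable_json(path: str) -> str:
    """Return 'ok' if the dump parses as JSON; otherwise a
    human-readable parse error with position, instead of an
    opaque NullPointerException from the importer."""
    try:
        with open(path) as f:
            json.load(f)
        return "ok"
    except json.JSONDecodeError as e:
        return "invalid JSON at line %d, column %d: %s" % (e.lineno, e.colno, e.msg)
```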

> Improve json2sstable error reporting on nonexistent column
> --
>
> Key: CASSANDRA-10401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10401
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra 2.1.8.621
>Reporter: Jose Martinez Poblete
>
> We have the following table...
> {noformat}
> CREATE TABLE keyspace_name.table_name (
> col1 text,
> col2 text,
> col3 text,
> col4 text,
> PRIMARY KEY ((col1, col2), col3)
> ) WITH CLUSTERING ORDER BY (col3 ASC)
> {noformat}
> And the following JSON in a file created with the sstable2json tool:
> {noformat}
> [
> {"key": "This is col1:This is col2,
>  "cells": [["This is col3:","",1443217787319002],
>["This is col3:"col4","This is col4",1443217787319002]]}
> ]
> {noformat}
> Let's say we deleted that record from the DB and wanted to bring it back.
> If we try to create an sstable from this data in a json file named 
> test_file.json, we get an NPE:
> {noformat}
> -bash-4.1$ json2sstable -K elp -c table_name-3264cbe063c211e5bc34e746786b7b29 
> test_file.json  
> /var/lib/cassandra/data/keyspace_name/table_name-3264cbe063c211e5bc34e746786b7b29/keyspace_name-table_name-ka-1-Data.db
> Importing 1 keys...
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
>   at 
> org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:316)
>   at 
> org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:287)
>   at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:514)
> ERROR: null
> -bash-4.1$
> {noformat}





[jira] [Updated] (CASSANDRA-10401) json2sstable fails with NPE

2015-09-28 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-10401:
--
Description: 
We have the following table...

{noformat}
CREATE TABLE keyspace_name.table_name (
col1 text,
col2 text,
col3 text,
col4 text,
PRIMARY KEY ((col1, col2), col3)
) WITH CLUSTERING ORDER BY (col3 ASC)
{noformat}

And the following JSON in a file created with the sstable2json tool:

{noformat}
[
{"key": "This is col1:This is col2,
 "cells": [["This is col3:","",1443217787319002],
   ["This is col3:"col4","This is col4",1443217787319002]]}
]
{noformat}

Let's say we deleted that record from the DB and wanted to bring it back.
If we try to create an sstable from this data in a json file named 
test_file.json, we get an NPE:

{noformat}
-bash-4.1$ json2sstable -K elp -c table_name-3264cbe063c211e5bc34e746786b7b29 
test_file.json  
/var/lib/cassandra/data/keyspace_name/table_name-3264cbe063c211e5bc34e746786b7b29/keyspace_name-table_name-ka-1-Data.db
Importing 1 keys...
java.lang.NullPointerException
at 
org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
at 
org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:316)
at 
org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:287)
at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:514)
ERROR: null
-bash-4.1$
{noformat}

  was:
We have the following table...

{noformat}
CREATE TABLE elp.document (
business_area_ct text,
business_id text,
document_id text,
access_level_ct text,
annotation_tx text,
author_nm text,
business_id_type_ct text,
cms_id text,
direction_ct text,
document_code_id uuid,
file_metadata_map_nm map,
last_mod_ts timestamp,
last_mod_user_id text,
official_document_ts timestamp,
repository_logical_package_no int,
repository_offset_no int,
repository_package_handle_user_id text,
repository_package_nm text,
repository_package_sequence_no int,
repository_procedural_nm text,
review_complete_in boolean,
source_system_document_id text,
source_system_nm text,
status_cd text,
vendor_nm text,
PRIMARY KEY ((business_area_ct, business_id), document_id)
) WITH CLUSTERING ORDER BY (document_id ASC)
{noformat}

And the following JSON in a file created with the sstable2json tool:

{noformat}
[
{"key": "This is business_area_ct:This is business_id",
 "cells": [["This is document_id:","",1443217787319002],
   ["This is document_id:author_nm","This is 
autor_nm",1443217787319002]]}
]
{noformat}

Let's say we deleted that record from the DB and wanted to bring it back.
If we try to create an sstable from this json file, we get an NPE:

{noformat}
-bash-4.1$ json2sstable -K elp -c document-3264cbe063c211e5bc34e746786b7b29 
test2.json  
/var/lib/cassandra/data/elp/document-3264cbe063c211e5bc34e746786b7b29/elp-document-ka-1-Data.db
Importing 1 keys...
java.lang.NullPointerException
at 
org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
at 
org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:316)
at 
org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:287)
at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:514)
ERROR: null
-bash-4.1$
{noformat}


> json2sstable fails with NPE
> ---
>
> Key: CASSANDRA-10401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10401
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra 2.1.8.621
>Reporter: Jose Martinez Poblete
>
> We have the following table...
> {noformat}
> CREATE TABLE keyspace_name.table_name (
> col1 text,
> col2 text,
> col3 text,
> col4 text,
> PRIMARY KEY ((col1, col2), col3)
> ) WITH CLUSTERING ORDER BY (col3 ASC)
> {noformat}
> And the following JSON in a file created with the sstable2json tool:
> {noformat}
> [
> {"key": "This is col1:This is col2,
>  "cells": [["This is col3:","",1443217787319002],
>["This is col3:"col4","This is col4",1443217787319002]]}
> ]
> {noformat}
> Let's say we deleted that record from the DB and wanted to bring it back.
> If we try to create an sstable from this data in a json file named 
> test_file.json, we get an NPE:
> {noformat}
> -bash-4.1$ json2sstable -K elp -c table_name-3264cbe063c211e5bc34e746786b7b29 
> test_file.json  
> /var/lib/cassandra/data/keyspace_name/table_name-3264cbe063c211e5bc34e746786b7b29/keyspace_name-table_name-ka-1-Data.db
> Importing 1 keys...
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
>   at 
> 

[jira] [Created] (CASSANDRA-10401) json2sstable fails with NPE

2015-09-26 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-10401:
-

 Summary: json2sstable fails with NPE
 Key: CASSANDRA-10401
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10401
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Cassandra 2.1.8.621
Reporter: Jose Martinez Poblete


We have the following table...

{noformat}
CREATE TABLE elp.document (
business_area_ct text,
business_id text,
document_id text,
access_level_ct text,
annotation_tx text,
author_nm text,
business_id_type_ct text,
cms_id text,
direction_ct text,
document_code_id uuid,
file_metadata_map_nm map,
last_mod_ts timestamp,
last_mod_user_id text,
official_document_ts timestamp,
repository_logical_package_no int,
repository_offset_no int,
repository_package_handle_user_id text,
repository_package_nm text,
repository_package_sequence_no int,
repository_procedural_nm text,
review_complete_in boolean,
source_system_document_id text,
source_system_nm text,
status_cd text,
vendor_nm text,
PRIMARY KEY ((business_area_ct, business_id), document_id)
) WITH CLUSTERING ORDER BY (document_id ASC)
{noformat}

And the following JSON in a file created with the sstable2json tool:

{noformat}
[
{"key": "This is business_area_ct:This is business_id",
 "cells": [["This is document_id:","",1443217787319002],
   ["This is document_id:author_nm","This is 
autor_nm",1443217787319002]]}
]
{noformat}

Let's say we deleted that record from the DB and wanted to bring it back.
If we try to create an sstable from this json file, we get an NPE:

{noformat}
-bash-4.1$ json2sstable -K elp -c document-3264cbe063c211e5bc34e746786b7b29 
test2.json  
/var/lib/cassandra/data/elp/document-3264cbe063c211e5bc34e746786b7b29/elp-document-ka-1-Data.db
Importing 1 keys...
java.lang.NullPointerException
at 
org.apache.cassandra.tools.SSTableImport.getKeyValidator(SSTableImport.java:442)
at 
org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:316)
at 
org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:287)
at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:514)
ERROR: null
-bash-4.1$
{noformat}





[jira] [Created] (CASSANDRA-10165) Query fails when batch_size_warn_threshold_in_kb is not set on cassandra.yaml

2015-08-24 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-10165:
-

 Summary: Query fails when batch_size_warn_threshold_in_kb is not 
set on cassandra.yaml
 Key: CASSANDRA-10165
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10165
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: C* 2.1.5
Reporter: Jose Martinez Poblete


Jobs failed with the following error:

{noformat}
ERROR [SharedPool-Worker-1] 2015-08-21 18:06:42,759  ErrorMessage.java:244 - 
Unexpected exception during request
java.lang.NullPointerException: null
at 
org.apache.cassandra.config.DatabaseDescriptor.getBatchSizeWarnThreshold(DatabaseDescriptor.java:855)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.verifyBatchSize(BatchStatement.java:239)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:311)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:296)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:282)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:503) 
~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$BatchStatementExecution.execute(DseQueryHandler.java:327)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$Operation.executeWithTiming(DseQueryHandler.java:223)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$Operation.executeWithAuditLogging(DseQueryHandler.java:259)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler.processBatch(DseQueryHandler.java:110)
 ~[dse.jar:4.7.0]
at 
org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:215)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_75]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[cassandra-all-2.1.5.469.jar:2.1.5.469]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{noformat}

It turns out there was no entry for *batch_size_warn_threshold_in_kb* in 
cassandra.yaml.

Once we set that parameter in the file, the error went away.

Can we please have C* assume a default of 64Kb for this setting, without 
prejudice to the job, if it is not specified in the yaml file?
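The requested behavior, sketched in Python for illustration - the key name is from the ticket, and 64 is the reporter's suggested default, not a confirmed Cassandra default:

```python
def batch_size_warn_threshold_kb(config: dict) -> int:
    """Fall back to a default when the key is absent from the parsed
    YAML, instead of propagating None into a NullPointerException-style
    crash at query time."""
    value = config.get("batch_size_warn_threshold_in_kb")
    return 64 if value is None else int(value)

batch_size_warn_threshold_kb({})                                      # -> 64
batch_size_warn_threshold_kb({"batch_size_warn_threshold_in_kb": 5})  # -> 5
```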





[jira] [Updated] (CASSANDRA-10165) Query fails when batch_size_warn_threshold_in_kb is not set on cassandra.yaml

2015-08-24 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-10165:
--
Description: 
Jobs failed with the following error:

{noformat}
ERROR [SharedPool-Worker-1] 2015-08-21 18:06:42,759  ErrorMessage.java:244 - 
Unexpected exception during request
java.lang.NullPointerException: null
at 
org.apache.cassandra.config.DatabaseDescriptor.getBatchSizeWarnThreshold(DatabaseDescriptor.java:855)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.verifyBatchSize(BatchStatement.java:239)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:311)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:296)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:282)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:503) 
~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$BatchStatementExecution.execute(DseQueryHandler.java:327)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$Operation.executeWithTiming(DseQueryHandler.java:223)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler$Operation.executeWithAuditLogging(DseQueryHandler.java:259)
 ~[dse.jar:4.7.0]
at 
com.datastax.bdp.cassandra.cql3.DseQueryHandler.processBatch(DseQueryHandler.java:110)
 ~[dse.jar:4.7.0]
at 
org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:215)
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_75]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [cassandra-all-2.1.5.469.jar:2.1.5.469]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[cassandra-all-2.1.5.469.jar:2.1.5.469]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{noformat}

It turns out there was no entry for *batch_size_warn_threshold_in_kb* in 
cassandra.yaml

Once we set that parameter in the file, the error went away

Can we please have C* fall back to the default for this setting, without 
failing the job, if it's not specified in the yaml file?
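A defensive fallback along these lines would avoid the NPE. This is only an illustrative sketch (the class and field names are hypothetical, not Cassandra's actual DatabaseDescriptor code); 5 is the value shipped in recent cassandra.yaml templates for batch_size_warn_threshold_in_kb:

```java
// Illustrative sketch only: fall back to a default when the key is absent
// from cassandra.yaml, instead of leaving the field null and throwing a
// NullPointerException at query time. Names are hypothetical.
public class BatchConfig {
    static final int DEFAULT_BATCH_SIZE_WARN_THRESHOLD_KB = 5; // yaml template default

    // null when batch_size_warn_threshold_in_kb is missing from the yaml
    Integer batchSizeWarnThresholdInKb;

    int getBatchSizeWarnThresholdInKb() {
        return batchSizeWarnThresholdInKb != null
                ? batchSizeWarnThresholdInKb
                : DEFAULT_BATCH_SIZE_WARN_THRESHOLD_KB;
    }
}
```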


[jira] [Commented] (CASSANDRA-9294) Streaming errors should log the root cause

2015-05-04 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526938#comment-14526938
 ] 

Jose Martinez Poblete commented on CASSANDRA-9294:
--

It would be very useful to have this instead of having to enable TCP logging 
after the fact

 Streaming errors should log the root cause
 --

 Key: CASSANDRA-9294
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9294
 Project: Cassandra
  Issue Type: Bug
Reporter: Brandon Williams
Assignee: Yuki Morishita
 Fix For: 2.0.x


 Currently, when a streaming error occurs all you get is something like:
 {noformat}
 java.util.concurrent.ExecutionException: 
 org.apache.cassandra.streaming.StreamException: Stream failed
 {noformat}
 Instead, we should log the root cause. Was the connection reset by peer, did 
 it time out, etc.?
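A sketch of what the request amounts to, assuming the usual Java cause chain (names here are illustrative, not Cassandra's streaming code): walk Throwable.getCause() to the end and log that alongside the wrapper.

```java
// Sketch: unwrap the cause chain so the log line shows the root cause
// (e.g. "Connection reset by peer") rather than only "Stream failed".
public class RootCause {
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // getCause() returns null at the end of the chain; also guard
        // against a throwable listing itself as its own cause.
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }
}
```

A caller could then log both, e.g. `logger.error("Stream failed", e)` plus `logger.error("Root cause: {}", rootCause(e).toString())`.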



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9119) Nodetool rebuild creates an additional rebuild session even if there is one already running

2015-04-03 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-9119:


 Summary: Nodetool rebuild creates an additional rebuild session 
even if there is one already running
 Key: CASSANDRA-9119
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9119
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.12.200
Cassandra 2.1.2.98
Reporter: Jose Martinez Poblete
 Fix For: 2.0.15, 2.1.5


If a nodetool rebuild session is started and the shell session ends for 
whatever reason, running nodetool rebuild again will spawn a second rebuild 
file stream

{noformat}
DC2-S1-100-29:~ # ps aux | grep nodetool 
root 10304 0.0 0.0 4532 560 pts/3 S+ 05:23 0:00 grep nodetool 
dds-user 20946 0.0 0.0 21180 1880 ? S 04:39 0:00 /bin/sh 
/usr/share/dse/resources/cassandra/bin/nodetool rebuild group10   <-- there is 
only one rebuild running

DC2-S1-100-29:~ # nodetool netstats | grep -v /var/local/cassandra 
Mode: NORMAL 
Rebuild 818307b0-d9ba-11e4-8d4c-7bce93ffad70 -- does this represent one 
rebuild? 
/10.96.100.22 
Receiving 63 files, 221542605741 bytes total 
/10.96.100.26 
Receiving 48 files, 47712285610 bytes total 
/10.96.100.25 
/10.96.100.23 
Receiving 57 files, 127515362783 bytes total 
/10.96.100.27 
/10.96.100.24 
Rebuild 7bf9fcd0-d9bb-11e4-8d4c-7bce93ffad70 --- does this represent a 
second rebuild? 
/10.96.100.25 
/10.96.100.26 
Receiving 56 files, 47717905924 bytes total 
/10.96.100.24 
/10.96.100.22 
Receiving 61 files, 221558642440 bytes total 
/10.96.100.23 
Receiving 62 files, 127528841272 bytes total 
/10.96.100.27 
Read Repair Statistics: 
Attempted: 0 
Mismatch (Blocking): 0 
Mismatch (Background): 0 
Pool Name Active Pending Completed 
Commands n/a 0 2151322 
Responses n/a 0 3343981
{noformat}
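Until rebuild itself guards against this, a wrapper along these lines could refuse to start a second session. `rebuild_guard` is a hypothetical helper, and matching on the process listing is only a sketch:

```shell
# Sketch: check a process listing for an existing "nodetool rebuild"
# before starting another one. Reads the listing on stdin so it can be
# fed from `ps aux` (or exercised with canned input).
rebuild_guard() {
    if grep -q "nodetool rebuild"; then
        echo "rebuild already in progress"
        return 1
    fi
    echo "no rebuild running"
    return 0
}

# Hypothetical usage:
#   ps aux | rebuild_guard && nodetool rebuild group10
```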





[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-17 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215590#comment-14215590
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

Thanks [~jbellis]

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3 node cluster 
 Their dataset is less than 1Tb, and during data load one of the nodes enters a 
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
 Their application waits 1 minute before a retry when a timeout is returned
 This is what we find on their heapdumps:
 {noformat}
 Class Name                                                  | Shallow Heap | Retained Heap | Percentage
 -------------------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
 |- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |           48 | 8,086,073,320 | 96.66%

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-14 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212436#comment-14212436
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

OK, set write_request_timeout_in_ms: 2000
Also changed the disks to RAID0, and we are able to load a 10Gb file in 19 secs 
(~1.8Tb/hr)
Asked him to use executeAsync for his loads
Still, we are seeing these:

{noformat}
 INFO [ScheduledTasks:1] 2014-11-13 13:51:29,696 MessagingService.java (line 
875) 949 MUTATION messages dropped in last 5000ms
 INFO [ScheduledTasks:1] 2014-11-13 13:52:49,939 MessagingService.java (line 
875) 1378 MUTATION messages dropped in last 5000ms
 INFO [ScheduledTasks:1] 2014-11-13 13:53:17,215 MessagingService.java (line 
875) 2 MUTATION messages dropped in last 5000ms
 INFO [ScheduledTasks:1] 2014-11-13 13:54:10,277 MessagingService.java (line 
875) 1 MUTATION messages dropped in last 5000ms 
{noformat}

Perhaps raise memtable_flush_queue_size and un-throttle compaction to force a 
faster disk flush?
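The executeAsync advice above only helps if the number of in-flight writes is bounded; an unbounded async loader can flood the cluster and cause exactly these MUTATION drops. A generic sketch of the pattern (the `execute` function stands in for the driver's Session.executeAsync; all names here are illustrative, not DataStax driver code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Sketch: cap the number of concurrently pending async writes with a
// semaphore; acquire a permit before issuing each write and release it
// when the write completes (successfully or not).
public class BoundedLoader {
    private final Semaphore inFlight;

    public BoundedLoader(int maxInFlight) {
        this.inFlight = new Semaphore(maxInFlight);
    }

    public <T> CompletableFuture<T> submit(Function<String, CompletableFuture<T>> execute,
                                           String statement) throws InterruptedException {
        inFlight.acquire();                        // blocks once maxInFlight writes are pending
        CompletableFuture<T> future = execute.apply(statement);
        future.whenComplete((result, error) -> inFlight.release());
        return future;
    }
}
```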


[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-11-14 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Environment: 
 DSE version: 4.5.0 Cassandra 2.0.8


  was:
 
{noformat}
 INFO [main] 2014-09-16 14:23:14,621 DseDaemon.java (line 368) DSE version: 
4.5.0
 INFO [main] 2014-09-16 14:23:14,622 DseDaemon.java (line 369) Hadoop version: 
1.0.4.13
 INFO [main] 2014-09-16 14:23:14,627 DseDaemon.java (line 370) Hive version: 
0.12.0.3
 INFO [main] 2014-09-16 14:23:14,628 DseDaemon.java (line 371) Pig version: 
0.10.1
 INFO [main] 2014-09-16 14:23:14,629 DseDaemon.java (line 372) Solr version: 
4.6.0.2.4
 INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 373) Sqoop version: 
1.4.4.14.1
 INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 374) Mahout version: 
0.8
 INFO [main] 2014-09-16 14:23:14,631 DseDaemon.java (line 375) Appender 
version: 3.0.2
 INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 376) Spark version: 
0.9.1
 INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 377) Shark version: 
0.9.1.1
 INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 160) JVM 
vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51
 INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 188) Heap size: 
6316621824/6316621824
{noformat}


 nodetool repair triggers OOM
 

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment:  DSE version: 4.5.0 Cassandra 2.0.8
Reporter: Jose Martinez Poblete
Assignee: Jimmy Mårdell
 Fix For: 2.0.11, 2.1.1

 Attachments: 7983-v1.patch, gc.log.0, nbcqa-chc-a01_systemlog.tar.Z, 
 nbcqa-chc-a03_systemlog.tar.Z, system.log


 Customer has a 3 node cluster with 500Mb data on each node
 {noformat}
 [cassandra@nbcqa-chc-a02 ~]$ nodetool status
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: CH2
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  AddressLoad   Tokens  Owns   Host ID  
  Rack
 UN  162.150.4.234  255.26 MB  256 33.2%  
 4ad1b6a8-8759-4920-b54a-f059126900df  RAC1
 UN  162.150.4.235  318.37 MB  256 32.6%  
 3eb0ec58-4b81-442e-bee5-4c91da447f38  RAC1
 UN  162.150.4.167  243.7 MB   256 34.2%  
 5b2c1900-bf03-41c1-bb4e-82df1655b8d8  RAC1
 [cassandra@nbcqa-chc-a02 ~]$
 {noformat}
 After we run the repair command, the system runs into OOM after some 45 minutes
 Nothing else is running
 {noformat}
 [cassandra@nbcqa-chc-a02 ~]$ date
 Fri Sep 19 15:55:33 UTC 2014
 [cassandra@nbcqa-chc-a02 ~]$ nodetool repair -st -9220354588320251877 -et 
 -9220354588320251873
 Sep 19, 2014 4:06:08 PM ClientCommunicatorAdmin Checker-run
 WARNING: Failed to check the connection: java.net.SocketTimeoutException: 
 Read timed out
 {noformat}
 Here is when we run OOM
 {noformat}
 ERROR [ReadStage:28914] 2014-09-19 16:34:50,381 CassandraDaemon.java (line 
 199) Exception in thread Thread[ReadStage:28914,5,main]
 java.lang.OutOfMemoryError: Java heap space
 at 
 org.apache.cassandra.io.util.RandomAccessReader.init(RandomAccessReader.java:69)
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(CompressedRandomAccessReader.java:76)
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
 at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
 at 
 org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:42)
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
 at 

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-12 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208102#comment-14208102
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

Can't do JBOD right now as per CASSANDRA-7386 but will try RAID0


[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-12 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208209#comment-14208209
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

[~jbellis]  Looking at the C* yaml file, it seems range_request_timeout_in_ms was 
changed to 10X 
We will set it back to 1

{noformat}
read_request_timeout_in_ms: 1
range_request_timeout_in_ms: 10  
write_request_timeout_in_ms: 1
cas_contention_timeout_in_ms: 1000
truncate_request_timeout_in_ms: 6
request_timeout_in_ms: 1
{noformat}
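For reference, the stock 2.0-era cassandra.yaml ships with these timeout values (taken from the default template; worth double-checking against the yaml bundled with the exact release):

```
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 2000
cas_contention_timeout_in_ms: 1000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
```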



[jira] [Created] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-8295:


 Summary: Cassandra runs OOM @ 
java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 w/Cassandra 2.0.11.82

{noformat} 
 INFO 10:36:21,991 Logging initialized
 INFO 10:36:22,016 DSE version: 4.5.3
 INFO 10:36:22,016 Hadoop version: 1.0.4.13
 INFO 10:36:22,017 Hive version: 0.12.0.5
 INFO 10:36:22,017 Pig version: 0.10.1
 INFO 10:36:22,018 Solr version: 4.6.0.2.8
 INFO 10:36:22,019 Sqoop version: 1.4.4.14.2
 INFO 10:36:22,019 Mahout version: 0.8
 INFO 10:36:22,020 Appender version: 3.0.2
 INFO 10:36:22,020 Spark version: 0.9.1
 INFO 10:36:22,021 Shark version: 0.9.1.4
{noformat}
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
system.tgz.1, system.tgz.2, system.tgz.3

Customer runs a 3 node cluster 
Their dataset is less than 1Tb, and during data load one of the nodes enters a 
GC death spiral:

{noformat}
 INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) GC 
for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max is 
8375238656
...
 INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; max is 
8375238656
 INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) GC 
for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; max 
is 8375238656
{noformat} 

Their application waits 1 minute before a retry when a timeout is returned

This is what we find on their heapdumps:

{noformat}
Class Name                                                  | Shallow Heap | Retained Heap | Percentage
-------------------------------------------------------------------------------------------------------
org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
|- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |           48 | 8,086,073,320 | 96.66%


[jira] [Updated] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-8295:
-
Environment: 
DSE 4.5.3 Cassandra 2.0.11.82


  was:
Cassandra 2.0.11.82



 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3 node cluster 
 Their dataset is less than 1Tb and during data load, one of the nodes enter a 
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
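The GC pattern above can be quantified: each GCInspector line reports used and max heap bytes, and by the final entry the node is pinned near the top of an 8 GB heap while CMS pause totals explode. A minimal parsing sketch, assuming the GCInspector line format shown above (the regex and helper name are my own):

```python
# Sketch: parse GCInspector lines and report heap utilization (used / max)
# per entry, to show how close each CMS cycle leaves the heap to its ceiling.
import re

GC_LINE = re.compile(
    r"GC for (?P<collector>\S+): (?P<ms>\d+) ms for (?P<n>\d+) collections?, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def heap_utilization(log_text):
    """Yield (collector, pause_ms, used/max fraction) for each GCInspector line."""
    for m in GC_LINE.finditer(log_text):
        yield m["collector"], int(m["ms"]), int(m["used"]) / int(m["max"])

sample = ("GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, "
          "8347631560 used; max is 8375238656")
collector, pause, util = next(heap_utilization(sample))
print(f"{collector}: {pause} ms, heap {util:.1%} full")
# → ConcurrentMarkSweep: 2681100 ms, heap 99.7% full
```

Run over the whole log excerpt, utilization climbs from roughly 20% to 99.7% while pauses grow from seconds to tens of minutes: collections are reclaiming almost nothing.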
 Their application waits 1 minute before retrying when a timeout is returned.
 This is what we find in their heap dumps:
 {noformat}
 Class Name                                                  | Shallow Heap | Retained Heap | Percentage
 -------------------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
 |- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |           48 | 8,086,073,320 | 96.66%
 |  |-

[jira] [Updated] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-8295:
-
Environment: 
Cassandra 2.0.11.82


  was:
DSE 4.5.3 w/Cassandra 2.0.11.82

{noformat} 
 INFO 10:36:21,991 Logging initialized
 INFO 10:36:22,016 DSE version: 4.5.3
 INFO 10:36:22,016 Hadoop version: 1.0.4.13
 INFO 10:36:22,017 Hive version: 0.12.0.5
 INFO 10:36:22,017 Pig version: 0.10.1
 INFO 10:36:22,018 Solr version: 4.6.0.2.8
 INFO 10:36:22,019 Sqoop version: 1.4.4.14.2
 INFO 10:36:22,019 Mahout version: 0.8
 INFO 10:36:22,020 Appender version: 3.0.2
 INFO 10:36:22,020 Spark version: 0.9.1
 INFO 10:36:22,021 Shark version: 0.9.1.4
{noformat}


 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
 Their application waits 1 minute before retrying when a timeout is returned.
 This is what we find in their heap dumps:
 {noformat}
 Class Name                                     | Shallow Heap | Retained Heap | Percentage
 ------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80 |

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207203#comment-14207203
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

More info from MAT

{noformat}
Class Name                                                  Objects     Shallow Heap
java.nio.HeapByteBuffer                                  73,845,620    3,544,589,760
edu.stanford.ppl.concurrent.SnapTreeMap$Node             34,614,044    1,661,474,112
byte[]                                                    3,969,475    1,510,362,528
org.apache.cassandra.db.Column                           34,614,043    1,107,649,376
edu.stanford.ppl.concurrent.CopyOnWriteManager$COWEpoch     411,924       39,544,704
java.nio.ByteBuffer[]                                       823,848       30,913,568
long[]                                                      411,924       22,819,304
edu.stanford.ppl.concurrent.SnapTreeMap$RootHolder          411,924       19,772,352
org.apache.cassandra.db.RangeTombstoneList                  411,924       16,476,960
int[]                                                       411,924       15,456,784
edu.stanford.ppl.concurrent.CopyOnWriteManager$Latch        411,924       13,181,568
edu.stanford.ppl.concurrent.SnapTreeMap                     411,924       13,181,568
java.util.concurrent.atomic.AtomicReference                 823,848       13,181,568
java.util.concurrent.ConcurrentSkipListMap$Node             411,929        9,886,296
org.apache.cassandra.db.DecoratedKey                        411,928        9,886,272
java.lang.Long                                              411,928        9,886,272
org.apache.cassandra.db.AtomicSortedColumns                 411,924        9,886,176
org.apache.cassandra.db.AtomicSortedColumns$Holder          411,924        9,886,176
org.apache.cassandra.db.DeletionInfo                        411,924        9,886,176
org.apache.cassandra.dht.LongToken                          411,928        6,590,848
edu.stanford.ppl.concurrent.SnapTreeMap$COWMgr              411,924        6,590,784
java.util.concurrent.ConcurrentSkipListMap$Index            207,065        4,969,560
java.util.concurrent.ConcurrentSkipListMap$HeadIndex             16              512
org.apache.cassandra.db.DeletedColumn                             1               32

Total: 24 entries                                       155,076,837    8,086,073,256
{noformat}
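A quick back-of-the-envelope on the histogram above shows the heap is consumed by tens of millions of tiny objects rather than a few large ones: the shallow sizes work out to 48 bytes per HeapByteBuffer and SnapTreeMap$Node, and 32 bytes per Column. A sketch of the arithmetic, with the counts and shallow-heap totals copied from the table:

```python
# Sketch: per-object average shallow size from the MAT histogram above.
# (count, shallow_heap_bytes) pairs for the three largest contributors;
# HeapByteBuffer and SnapTreeMap$Node average 48 B, Column averages 32 B.
histogram = {
    "java.nio.HeapByteBuffer": (73_845_620, 3_544_589_760),
    "edu.stanford.ppl.concurrent.SnapTreeMap$Node": (34_614_044, 1_661_474_112),
    "org.apache.cassandra.db.Column": (34_614_043, 1_107_649_376),
}
for cls, (count, shallow) in histogram.items():
    print(f"{cls}: {shallow / count:.0f} bytes/object on average")
```

At roughly 34.6 million live columns, per-column object overhead dominates the retained 8 GB, which is consistent with the ConcurrentSkipListMap/SnapTree memtable structures at the root of the dump.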

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java 

[jira] [Issue Comment Deleted] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-8295:
-
Comment: was deleted

(was: More info from MAT
[deleted comment; verbatim duplicate of the MAT histogram in the preceding comment])

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for 

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207409#comment-14207409
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

Customer uses 10k RPM rotational disks on these nodes.
Initially, the Cassandra data volume was configured as RAID 10, but that will be 
changed to RAID 0.

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
 Their application waits 1 minute before retrying when a timeout is returned.
 This is what we find in their heap dumps:
 {noformat}
 Class Name                                                  | Shallow Heap | Retained Heap | Percentage
 -------------------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
 |- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207420#comment-14207420
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

I'm not sure the disk can get full; they have lots of disk space.
Attaching the df -h output soon.

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
 Their application waits 1 minute before retrying when a timeout is returned.
 This is what we find in their heap dumps:
 {noformat}
 Class Name                                                  | Shallow Heap | Retained Heap | Percentage
 -------------------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
 |- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207423#comment-14207423
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

Here is the disk usage on the three nodes:

{noformat}
alln01-ats-cas1 
Filesystem Size Used Avail Use% Mounted on 
/dev/mapper/vg01-lv_root 50G 9.3G 38G 20% / 
tmpfs 95G 0 95G 0% /dev/shm 
/dev/sda1 485M 40M 421M 9% /boot 
/dev/mapper/vg01-lv_home 20G 172M 19G 1% /home 
/dev/mapper/vg01-lv_tmp 20G 173M 19G 1% /tmp 
/dev/mapper/vg01-lv_var 20G 19G 17M 100% /var 
/dev/mapper/vg01-lv_varlog 20G 6.5G 13G 35% /var/log 
/dev/mapper/vg01-lv_vartmp 20G 172M 19G 1% /var/tmp 
/dev/mapper/vgcommit-lvcommit 825G 4.2G 779G 1% /cassandra/commitlog 
/dev/mapper/vgcache-lvcache 825G 290G 493G 37% /cassandra/saved_caches 
/dev/md1 4.1T 415G 3.5T 11% /cassandra/data 
alln01-ats-cas2 
Filesystem Size Used Avail Use% Mounted on 
/dev/mapper/vg01-lv_root 50G 4.6G 43G 10% / 
tmpfs 95G 0 95G 0% /dev/shm 
/dev/sda1 485M 40M 421M 9% /boot 
/dev/mapper/vg01-lv_home 20G 172M 19G 1% /home 
/dev/mapper/vg01-lv_tmp 20G 173M 19G 1% /tmp 
/dev/mapper/vg01-lv_var 20G 736M 18G 4% /var 
/dev/mapper/vg01-lv_varlog 20G 9.8G 9.0G 53% /var/log 
/dev/mapper/vg01-lv_vartmp 20G 172M 19G 1% /var/tmp 
/dev/mapper/vgcommit-lvcommit 825G 5.0G 778G 1% /cassandra/commitlog 
/dev/mapper/vgcache-lvcache 825G 271G 512G 35% /cassandra/saved_caches 
/dev/md1 4.1T 366G 3.5T 10% /cassandra/data 
alln01-ats-cas3 
Filesystem Size Used Avail Use% Mounted on 
/dev/mapper/vg01-lv_root 50G 5.1G 42G 11% / 
tmpfs 95G 0 95G 0% /dev/shm 
/dev/sda1 485M 40M 421M 9% /boot 
/dev/mapper/vg01-lv_home 20G 172M 19G 1% /home 
/dev/mapper/vg01-lv_tmp 20G 198M 19G 2% /tmp 
/dev/mapper/vg01-lv_var 20G 14G 5.6G 71% /var 
/dev/mapper/vg01-lv_varlog 20G 16G 3.4G 82% /var/log 
/dev/mapper/vg01-lv_vartmp 20G 172M 19G 1% /var/tmp 
/dev/mapper/vgcommit-lvcommit 825G 6.8G 776G 1% /cassandra/commitlog 
/dev/mapper/vgcache-lvcache 825G 264G 519G 34% /cassandra/saved_caches 
/dev/md1 4.1T 334G 3.5T 9% /cassandra/data
{noformat}
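One detail worth flagging in the listing above: /var on alln01-ats-cas1 is at 100% even though the commitlog and data volumes are nearly empty. A small sketch that scans df output for filesystems at or over a usage threshold (the function name and threshold are my own):

```python
# Sketch: flag filesystems at/over a usage threshold in `df -h`-style output.
def full_filesystems(df_output, threshold=90):
    """Return (mount_point, use_percent) for lines at or above `threshold`."""
    flagged = []
    for line in df_output.splitlines():
        parts = line.split()
        # Data rows have exactly 6 fields; the header row has 7 and is skipped.
        if len(parts) == 6 and parts[4].endswith("%"):
            use = int(parts[4].rstrip("%"))
            if use >= threshold:
                flagged.append((parts[5], use))
    return flagged

sample = """Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg01-lv_var 20G 19G 17M 100% /var
/dev/md1 4.1T 415G 3.5T 11% /cassandra/data"""
print(full_filesystems(sample))  # [('/var', 100)]
```

A full /var can matter here because logs and JVM heap dumps typically land there, so it is worth ruling out even when /cassandra/data has terabytes free.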

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 

[jira] [Commented] (CASSANDRA-8295) Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex

2014-11-11 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207611#comment-14207611
 ] 

Jose Martinez Poblete commented on CASSANDRA-8295:
--

There has been 500-700 GB of free space without further cleanup, other than a 
restart. Upgraded to DSE 4.5.3 (C* 2.0.11) to avoid having to set a limit on the 
commit log size.
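For reference, the commit log cap this comment alludes to is the commitlog_total_space_in_mb setting in cassandra.yaml; a minimal illustrative fragment (the value shown is an example, not a recommendation):

```yaml
# cassandra.yaml -- cap total commit log space on disk; when the cap is
# exceeded, Cassandra flushes the oldest memtables so segments can be freed.
commitlog_total_space_in_mb: 8192   # example value
```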

 Cassandra runs OOM @ java.util.concurrent.ConcurrentSkipListMap$HeadIndex
 -

 Key: CASSANDRA-8295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8295
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.3 Cassandra 2.0.11.82
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas3.cassandra.yaml, output.tgz, system.tgz, 
 system.tgz.1, system.tgz.2, system.tgz.3


 Customer runs a 3-node cluster.
 Their dataset is less than 1 TB, and during data load one of the nodes enters a
 GC death spiral:
 {noformat}
  INFO [ScheduledTasks:1] 2014-11-07 23:31:08,094 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 3348 ms for 2 collections, 1658268944 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,486 GCInspector.java (line 116) 
 GC for ParNew: 442 ms for 2 collections, 6079570032 used; max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:40:58,487 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 7351 ms for 2 collections, 6084678280 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:01,836 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 603 ms for 1 collections, 7132546096 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:09,626 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 761 ms for 1 collections, 7286946984 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:15,265 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 703 ms for 1 collections, 7251213520 used; max is 
 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:25,027 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 1205 ms for 1 collections, 6507586104 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:41,374 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 13835 ms for 3 collections, 6514187192 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-07 23:41:54,137 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 6834 ms for 2 collections, 6521656200 used; max 
 is 8375238656
 ...
  INFO [ScheduledTasks:1] 2014-11-08 12:13:11,086 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 43967 ms for 2 collections, 8368777672 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:14,151 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 63968 ms for 3 collections, 8369623824 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:14:55,643 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 41307 ms for 2 collections, 8370115376 used; max 
 is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 12:20:06,197 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 309634 ms for 15 collections, 8374994928 used; 
 max is 8375238656
  INFO [ScheduledTasks:1] 2014-11-08 13:07:33,617 GCInspector.java (line 116) 
 GC for ConcurrentMarkSweep: 2681100 ms for 143 collections, 8347631560 used; 
 max is 8375238656
 {noformat} 
 Their application waits 1 minute before retrying when a timeout is returned.
 This is what we find in their heap dumps:
 {noformat}
 Class Name                                                  | Shallow Heap | Retained Heap | Percentage
 -------------------------------------------------------------------------------------------------------
 org.apache.cassandra.db.Memtable @ 0x773f52f80              |           72 | 8,086,073,504 | 96.66%
 |- java.util.concurrent.ConcurrentSkipListMap @ 0x724508fe8 |

[jira] [Created] (CASSANDRA-8269) Large number of system hints & other CF's cause heap to fill and run OOM

2014-11-06 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-8269:


 Summary: Large number of system hints & other CF's cause heap to fill and run OOM
 Key: CASSANDRA-8269
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8269
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.0 with Apache Cassandra 2.0.5
Reporter: Jose Martinez Poblete
 Attachments: alln01-ats-cas2-java_1414110068_Leak_Suspects.zip, 
system.log

A 3 node cluster with a large number of sstables for system.hints and 3 other
user tables was regularly going down with OOM; the system log shows the
following:

{noformat}
ERROR [OptionalTasks:1] 2014-10-23 18:51:29,052 CassandraDaemon.java (line 199) 
Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.OutOfMemoryError: Java heap space
at 
org.apache.cassandra.io.sstable.IndexHelper$IndexInfo.deserialize(IndexHelper.java:187)
at 
org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:122)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.computeNext(SSTableScanner.java:229)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.computeNext(SSTableScanner.java:203)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at 
org.apache.cassandra.io.sstable.SSTableScanner.hasNext(SSTableScanner.java:183)
at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.init(MergeIterator.java:87)
at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
at 
org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:74)
at 
org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1586)
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1709)
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1643)
at 
org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:513)
at 
org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:91)
at 
org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:173)
at 
org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:75)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{noformat}
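The MergeIterator$ManyToOne frames above come from an N-way merge over one scanner per sstable. A minimal Python analogue (a sketch, not Cassandra's actual implementation) shows why the current element of every input — here a fully deserialized RowIndexEntry — stays pinned in memory for the lifetime of the merge:

```python
import heapq

def many_to_one(iterators, key=lambda x: x):
    """Merge sorted iterators into one sorted stream.

    One 'candidate' element per input is always held in the heap, so if
    each candidate is large (e.g. a wide row's index entry), memory use
    grows with the number of open scanners regardless of output progress.
    """
    heap = []
    for i, it in enumerate(map(iter, iterators)):
        first = next(it, None)
        if first is not None:
            # Tuple tie-breaks on the unique index i, never on the iterator.
            heap.append((key(first), i, first, it))
    heapq.heapify(heap)
    while heap:
        _, i, value, it = heapq.heappop(heap)
        yield value
        nxt = next(it, None)
        if nxt is not None:
            heapq.heappush(heap, (key(nxt), i, nxt, it))

merged = list(many_to_one([[1, 4, 7], [2, 3, 9], [5]]))  # -> [1, 2, 3, 4, 5, 7, 9]
```

This mirrors why a huge system.hints CF hurts: scheduleAllDeliveries opens a range slice across all sstables at once, and every scanner's candidate row index is resident simultaneously.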

A heap dump shows the following:

{noformat}
Class Name                                                                             | Shallow Heap | Retained Heap | Percentage
------------------------------------------------------------------------------------------------------------------------------------
java.lang.Thread @ 0x67b292138  OptionalTasks:1 Thread                                 |          104 | 4,901,485,768 | 58.60%
|- org.apache.cassandra.utils.MergeIterator$ManyToOne @ 0x7b9dc4ad8                    |           40 | 4,900,817,312 | 58.59%
|  |- java.util.ArrayList @ 0x6f05f15f0                                                |           24 |   403,635,848 |  4.83%
|  |- org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator @ 0x7b5fe7078 |           40 |    29,669,312 |  0.35%
|  |  |- org.apache.cassandra.db.RowIndexEntry$IndexedEntry @ 0x7b7caaa28              |           32 |    26,770,264 |  0.32%
|  |  |- org.apache.cassandra.db.RowIndexEntry$IndexedEntry @ 0x7b7f6e670              |           32 |     2,898,864 |  0.03%
|  |  |  '- java.util.ArrayList @ 0x7b7caaae0                                          |           24 |     2,898,832 |  0.03%
|  |  |     '- java.lang.Object[12283] @ 0x7b7caaaf8                                   |       49,152 |     2,898,808 |  0.03%
|  |  |        |- org.apache.cassandra.io.sstable.IndexHelper$IndexInfo @ 0x7b7cb6af8  |           40 |           232 |  0.00%
|  |  |        |- org.apache.cassandra.io.sstable.IndexHelper$IndexInfo @ 0x7b7cb6be0  |

[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-10-30 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Description: 
Customer has a 3 node cluster with 500Mb data on each node

{noformat}
[cassandra@nbcqa-chc-a02 ~]$ nodetool status
Note: Ownership information does not include topology; for complete 
information, specify a keyspace
Datacenter: CH2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  162.150.4.234  255.26 MB  256     33.2%  4ad1b6a8-8759-4920-b54a-f059126900df  RAC1
UN  162.150.4.235  318.37 MB  256     32.6%  3eb0ec58-4b81-442e-bee5-4c91da447f38  RAC1
UN  162.150.4.167  243.7 MB   256     34.2%  5b2c1900-bf03-41c1-bb4e-82df1655b8d8  RAC1
[cassandra@nbcqa-chc-a02 ~]$
{noformat}

After we run the repair command, the system runs into OOM after some 45 minutes
Nothing else is running

{noformat}
[cassandra@nbcqa-chc-a02 ~]$ date
Fri Sep 19 15:55:33 UTC 2014
[cassandra@nbcqa-chc-a02 ~]$ nodetool repair -st -9220354588320251877 -et -9220354588320251873
Sep 19, 2014 4:06:08 PM ClientCommunicatorAdmin Checker-run
WARNING: Failed to check the connection: java.net.SocketTimeoutException: Read timed out
{noformat}

Here is when we run OOM

{noformat}
ERROR [ReadStage:28914] 2014-09-19 16:34:50,381 CassandraDaemon.java (line 199) 
Exception in thread Thread[ReadStage:28914,5,main]
java.lang.OutOfMemoryError: Java heap space
at 
org.apache.cassandra.io.util.RandomAccessReader.init(RandomAccessReader.java:69)
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(CompressedRandomAccessReader.java:76)
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
at 
org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
at 
org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
at 
org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
at 
org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:42)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
at 
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1363)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1927)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
{noformat}
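The -st/-et flags used above repair one narrow token range at a time; a common way to keep repair memory-bounded is to walk the ring in small slices. A sketch of computing those slices (assuming the full Murmur3Partitioner range; the slice count is arbitrary):

```python
MIN_TOKEN = -(2**63)      # Murmur3Partitioner minimum token
MAX_TOKEN = 2**63 - 1     # Murmur3Partitioner maximum token

def token_subranges(n_slices):
    """Split the full token ring into n_slices contiguous (start, end]
    ranges usable as `nodetool repair -st <start> -et <end>`."""
    span = (MAX_TOKEN - MIN_TOKEN) // n_slices
    edges = [MIN_TOKEN + i * span for i in range(n_slices)] + [MAX_TOKEN]
    return list(zip(edges[:-1], edges[1:]))
```

Each slice is then repaired in its own invocation, so a failure or OOM affects only a small portion of the ring rather than the whole repair session.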

The Cassandra process pegs 1 of the 8 CPUs at 100%

{noformat}
top - 16:50:07 up 11 days,  2:01,  2 users,  load average: 0.54, 0.60, 0.65
Tasks: 175 total,   1 running, 174 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.7%us,  0.3%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.3%us,  0.3%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.3%st
Cpu6  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16332056k total, 16167212k used,   164844k free,   149956k buffers
Swap:0k total,0k used,0k free,  8360056k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 2161 cassandr  20   0 11.5g 6.9g 227m S 107.8 44.0 281:59.49 java
 9942 root  20   0  106m 2320 1344 S  1.0  0.0   0:00.03 dhclient-script
28969 opscente  20   0 4479m 188m  

[jira] [Commented] (CASSANDRA-8015) nodetool exception for users with read only permissions on jmx authentication

2014-10-22 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14179997#comment-14179997
 ] 

Jose Martinez Poblete commented on CASSANDRA-8015:
--

So is it because nodetool status invokes some function that is considered 
readwrite that we get this failure?
Is there any way we can get around that?
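For context: the stack trace in this ticket fails inside MBeanServerFileAccessController.checkWrite, because nodetool status reaches the effectiveOwnership MBean through invoke(), which the file-based controller treats as a write operation. A sketch of the jmxremote.access file that would unblock the user (file path is deployment-specific; the `jose` entry matches the username used in this ticket):

```properties
# jmxremote.access — set via -Dcom.sun.management.jmxremote.access.file
monitorRole  readonly
controlRole  readwrite
jose         readwrite   # invoke() on MBeans requires readwrite under the file-based controller
```

The trade-off is that readwrite also permits state-changing JMX operations, so this grants more than pure monitoring access.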

 nodetool exception for users with read only permissions on jmx authentication 
 --

 Key: CASSANDRA-8015
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8015
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.8.39
Reporter: Jose Martinez Poblete
Assignee: Joshua McKenzie
Priority: Minor

 nodetool will throw an exception for a read-only user when JMX authentication is 
 enabled.
 {noformat}
 [automaton@i-0212b8098 ~]$ nodetool -u jose -pw JoseManuel status
 Exception in thread "main" java.lang.SecurityException: Access denied! Invalid access level for requested MBeanServer operation.
 at 
 com.sun.jmx.remote.security.MBeanServerFileAccessController.checkAccess(MBeanServerFileAccessController.java:344)
 at 
 com.sun.jmx.remote.security.MBeanServerFileAccessController.checkWrite(MBeanServerFileAccessController.java:240)
 at 
 com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:466)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
 at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
 at java.security.AccessController.doPrivileged(Native Method)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1427)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
 at sun.rmi.transport.Transport$1.run(Transport.java:177)
 at sun.rmi.transport.Transport$1.run(Transport.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
 at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
 at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
 at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 at 
 sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
 at 
 sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
 at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
 at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
 Source)
 at 
 javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
 at 
 javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
 at com.sun.proxy.$Proxy0.effectiveOwnership(Unknown Source)
 at 
 org.apache.cassandra.tools.NodeProbe.effectiveOwnership(NodeProbe.java:335)
 at 
 org.apache.cassandra.tools.NodeCmd$ClusterStatus.print(NodeCmd.java:480)
 at 
 org.apache.cassandra.tools.NodeCmd.printClusterStatus(NodeCmd.java:590)
 at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1263)
 [automaton@i-0212b8098 ~]$ dse -v
 4.5.1
 [automaton@i-0212b8098 ~]$ cqlsh -u jose -p JoseManuel 
 Connected to Spark at localhost:9160.
 [cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
 Use HELP for help.
 cqlsh> exit;
 [automaton@i-0212b8098 ~]$ 
 {noformat}
 Nodetool runs fine for cassandra user:
 {noformat}
 [automaton@i-0212b8098 ~]$ nodetool -u cassandra -pw cassandra status
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: Cassandra
 =

[jira] [Created] (CASSANDRA-8015) nodetool exception for users with read only permissions on jmx authentication

2014-09-29 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-8015:


 Summary: nodetool exception for users with read only permissions 
on jmx authentication 
 Key: CASSANDRA-8015
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8015
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.8.39
Reporter: Jose Martinez Poblete


nodetool will throw an exception for a read-only user when JMX authentication is 
enabled.

{noformat}
[automaton@i-0212b8098 ~]$ nodetool -u jose -pw JoseManuel status
Exception in thread "main" java.lang.SecurityException: Access denied! Invalid access level for requested MBeanServer operation.
at 
com.sun.jmx.remote.security.MBeanServerFileAccessController.checkAccess(MBeanServerFileAccessController.java:344)
at 
com.sun.jmx.remote.security.MBeanServerFileAccessController.checkWrite(MBeanServerFileAccessController.java:240)
at 
com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:466)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at java.security.AccessController.doPrivileged(Native Method)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1427)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
at 
sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
at 
sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
Source)
at 
javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
at 
javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
at com.sun.proxy.$Proxy0.effectiveOwnership(Unknown Source)
at 
org.apache.cassandra.tools.NodeProbe.effectiveOwnership(NodeProbe.java:335)
at 
org.apache.cassandra.tools.NodeCmd$ClusterStatus.print(NodeCmd.java:480)
at 
org.apache.cassandra.tools.NodeCmd.printClusterStatus(NodeCmd.java:590)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1263)
[automaton@i-0212b8098 ~]$ dse -v
4.5.1
[automaton@i-0212b8098 ~]$ cqlsh -u jose -p JoseManuel 
Connected to Spark at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> exit;
[automaton@i-0212b8098 ~]$ 
{noformat}


Nodetool runs fine for cassandra user:

{noformat}
[automaton@i-0212b8098 ~]$ nodetool -u cassandra -pw cassandra status
Note: Ownership information does not include topology; for complete 
information, specify a keyspace
Datacenter: Cassandra
=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Owns    Host ID                               Token                 Rack
UN  10.240.11.164  771.93 KB  100.0%  ae672795-bd73-4f53-a371-1a35c8df28a1  -9223372036854775808  rack1
[automaton@i-0212b8098 ~]$
{noformat}

JMX authentication is enabled as described [here | 
https://support.datastax.com/entries/43692547-Step-by-step-instructions-for-securing-JMX-authentication-for-nodetool-utility-OpsCenter-and-JConsol]




[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-22 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Attachment: nbcqa-chc-a03_systemlog.tar.Z
nbcqa-chc-a01_systemlog.tar.Z

The rest of the system logs

 nodetool repair triggers OOM
 

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment:  
 {noformat}
  INFO [main] 2014-09-16 14:23:14,621 DseDaemon.java (line 368) DSE version: 
 4.5.0
  INFO [main] 2014-09-16 14:23:14,622 DseDaemon.java (line 369) Hadoop 
 version: 1.0.4.13
  INFO [main] 2014-09-16 14:23:14,627 DseDaemon.java (line 370) Hive version: 
 0.12.0.3
  INFO [main] 2014-09-16 14:23:14,628 DseDaemon.java (line 371) Pig version: 
 0.10.1
  INFO [main] 2014-09-16 14:23:14,629 DseDaemon.java (line 372) Solr version: 
 4.6.0.2.4
  INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 373) Sqoop version: 
 1.4.4.14.1
  INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 374) Mahout 
 version: 0.8
  INFO [main] 2014-09-16 14:23:14,631 DseDaemon.java (line 375) Appender 
 version: 3.0.2
  INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 376) Spark version: 
 0.9.1
  INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 377) Shark version: 
 0.9.1.1
  INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 160) JVM 
 vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51
  INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 188) Heap 
 size: 6316621824/6316621824
 {noformat}
Reporter: Jose Martinez Poblete
 Attachments: gc.log.0, nbcqa-chc-a01_systemlog.tar.Z, 
 nbcqa-chc-a03_systemlog.tar.Z, system.log


 Customer has a 3 node cluster with 500Mb data on each node
 {noformat}
 [cassandra@nbcqa-chc-a02 ~]$ nodetool status
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: CH2
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address        Load       Tokens  Owns   Host ID                               Rack
 UN  162.150.4.234  255.26 MB  256     33.2%  4ad1b6a8-8759-4920-b54a-f059126900df  RAC1
 UN  162.150.4.235  318.37 MB  256     32.6%  3eb0ec58-4b81-442e-bee5-4c91da447f38  RAC1
 UN  162.150.4.167  243.7 MB   256     34.2%  5b2c1900-bf03-41c1-bb4e-82df1655b8d8  RAC1
 [cassandra@nbcqa-chc-a02 ~]$
 {noformat}
 After we run the repair command, the system runs into OOM after some 45 minutes
 Nothing else is running
 {noformat}
 [cassandra@nbcqa-chc-a02 ~]$ date
 Fri Sep 19 15:55:33 UTC 2014
 [cassandra@nbcqa-chc-a02 ~]$ nodetool repair -st -9220354588320251877 -et -9220354588320251873
 Sep 19, 2014 4:06:08 PM ClientCommunicatorAdmin Checker-run
 WARNING: Failed to check the connection: java.net.SocketTimeoutException: Read timed out
 {noformat}
 Here is when we run OOM
 {noformat}
 ERROR [ReadStage:28914] 2014-09-19 16:34:50,381 CassandraDaemon.java (line 
 199) Exception in thread Thread[ReadStage:28914,5,main]
 java.lang.OutOfMemoryError: Java heap space
 at 
 org.apache.cassandra.io.util.RandomAccessReader.init(RandomAccessReader.java:69)
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(CompressedRandomAccessReader.java:76)
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
 at 
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
 at 
 org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:42)
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
 at 

[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-22 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Attachment: (was: nbcqa-chc-a01_systemlog.tar.Z)

 nodetool repair triggers OOM
 

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core

[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-22 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Attachment: (was: nbcqa-chc-a03_systemlog.tar.Z)

 nodetool repair triggers OOM
 

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core

[jira] [Issue Comment Deleted] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-22 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Comment: was deleted

(was: The rest of the system logs)

 nodetool repair triggers OOM
 ----------------------------

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jose Martinez Poblete
 Attachments: gc.log.0, nbcqa-chc-a01_systemlog.tar.Z, nbcqa-chc-a03_systemlog.tar.Z, system.log

[jira] [Updated] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-21 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7983:
-
Attachment: nbcqa-chc-a01_systemlog.tar.Z
nbcqa-chc-a03_systemlog.tar.Z

System logs from the other 2 nodes

 nodetool repair triggers OOM
 ----------------------------

 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jose Martinez Poblete
 Attachments: gc.log.0, nbcqa-chc-a01_systemlog.tar.Z, nbcqa-chc-a03_systemlog.tar.Z, system.log

[jira] [Created] (CASSANDRA-7983) nodetool repair triggers OOM

2014-09-20 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-7983:


 Summary: nodetool repair triggers OOM
 Key: CASSANDRA-7983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7983
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment:  
{noformat}
 INFO [main] 2014-09-16 14:23:14,621 DseDaemon.java (line 368) DSE version: 
4.5.0
 INFO [main] 2014-09-16 14:23:14,622 DseDaemon.java (line 369) Hadoop version: 
1.0.4.13
 INFO [main] 2014-09-16 14:23:14,627 DseDaemon.java (line 370) Hive version: 
0.12.0.3
 INFO [main] 2014-09-16 14:23:14,628 DseDaemon.java (line 371) Pig version: 
0.10.1
 INFO [main] 2014-09-16 14:23:14,629 DseDaemon.java (line 372) Solr version: 
4.6.0.2.4
 INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 373) Sqoop version: 
1.4.4.14.1
 INFO [main] 2014-09-16 14:23:14,630 DseDaemon.java (line 374) Mahout version: 
0.8
 INFO [main] 2014-09-16 14:23:14,631 DseDaemon.java (line 375) Appender 
version: 3.0.2
 INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 376) Spark version: 
0.9.1
 INFO [main] 2014-09-16 14:23:14,632 DseDaemon.java (line 377) Shark version: 
0.9.1.1
 INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 160) JVM 
vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51
 INFO [main] 2014-09-16 14:23:20,270 CassandraDaemon.java (line 188) Heap size: 
6316621824/6316621824
{noformat}
Reporter: Jose Martinez Poblete
 Attachments: gc.log.0, system.log

Customer has a 3-node cluster with ~500 MB of data on each node

{noformat}
[cassandra@nbcqa-chc-a02 ~]$ nodetool status
Note: Ownership information does not include topology; for complete 
information, specify a keyspace
Datacenter: CH2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  162.150.4.234  255.26 MB  256     33.2%  4ad1b6a8-8759-4920-b54a-f059126900df  RAC1
UN  162.150.4.235  318.37 MB  256     32.6%  3eb0ec58-4b81-442e-bee5-4c91da447f38  RAC1
UN  162.150.4.167  243.7 MB   256     34.2%  5b2c1900-bf03-41c1-bb4e-82df1655b8d8  RAC1
[cassandra@nbcqa-chc-a02 ~]$
{noformat}

After we run the repair command, the system runs into OOM after some 45 minutes.
Nothing else is running.

{noformat}
[cassandra@nbcqa-chc-a02 ~]$ date
Fri Sep 19 15:55:33 UTC 2014
[cassandra@nbcqa-chc-a02 ~]$ nodetool repair -st -9220354588320251877 -et 
-9220354588320251873
Sep 19, 2014 4:06:08 PM ClientCommunicatorAdmin Checker-run
WARNING: Failed to check the connection: java.net.SocketTimeoutException: Read 
timed out
{noformat}

Here is where we run into OOM

{noformat}
ERROR [ReadStage:28914] 2014-09-19 16:34:50,381 CassandraDaemon.java (line 199) Exception in thread Thread[ReadStage:28914,5,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:69)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:57)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1363)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1927)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at 
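The `-st`/`-et` invocation above points at the usual mitigation for repairs that OOM: repair the ring in many small token slices instead of one pass. As a sketch only (not from the ticket; the slice count is arbitrary and the token bounds assume the Murmur3 partitioner), the slices can be computed like this:

```python
# Sketch: split the full Murmur3 token ring into n contiguous slices so each
# `nodetool repair -st <start> -et <end>` run touches a small amount of data.
MIN_TOKEN = -2**63      # Murmur3Partitioner lower bound (assumed partitioner)
MAX_TOKEN = 2**63 - 1   # Murmur3Partitioner upper bound

def subranges(n):
    """Yield (start, end) token pairs covering the whole ring in n slices."""
    width = (MAX_TOKEN - MIN_TOKEN) // n
    start = MIN_TOKEN
    for i in range(n):
        end = MAX_TOKEN if i == n - 1 else start + width
        yield start, end
        start = end

for st, et in subranges(4):
    print("nodetool repair -st %d -et %d" % (st, et))
```

Each slice starts where the previous one ended, so together the commands cover the whole ring.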

[jira] [Created] (CASSANDRA-7444) Performance drops when creating large amount of tables

2014-06-24 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-7444:


 Summary: Performance drops when creating large amount of tables 
 Key: CASSANDRA-7444
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7444
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 3.1.8 | Cassandra 1.2.15.1 | CQL spec 3.0.0 | 
Thrift protocol 19.36.2][cqlsh 4.1.1 | Cassandra 2.0.7.31 | CQL spec 3.1.1 | 
Thrift protocol 19.39.0]
Reporter: Jose Martinez Poblete


We are creating 4000 tables from a script, using cqlsh to create the tables. 
As the tables are being created, the time taken grows exponentially and 
creation becomes very slow.

We read a file, get the keyspace name, append a random number, and then create a 
keyspace with this new name, for example Airplane_12345678, Airplane_123575849..., 
which is then fed into cqlsh via a script.

Similarly, each table is created via the script: use Airplane_12345678; create 
table1...table25, then use Airplane_123575849; create table1...create table25.

It is all done in singleton fashion, doing one after the other in a loop.

We tested using the following bash script

{noformat}
#!/bin/bash
SEED=0
ITERATIONS=20
while [ ${SEED} -lt ${ITERATIONS} ]; do
   COUNT=0
   KEYSPACE=t10789_${SEED}
   echo "CREATE KEYSPACE ${KEYSPACE} WITH replication = { 'class': 'NetworkTopologyStrategy', 'Cassandra': '1' };" > ${KEYSPACE}.ddl
   echo "USE ${KEYSPACE};" >> ${KEYSPACE}.ddl
   while [ ${COUNT} -lt 25 ]; do
      echo "CREATE TABLE user_colors${COUNT} (user_id int PRIMARY KEY, colors list<ascii>);" >> ${KEYSPACE}.ddl
      ((COUNT++))
   done
   ((SEED++))
   time cat ${KEYSPACE}.ddl | cqlsh
   if [ $? -gt 0 ]; then
      echo "[ERROR] Failure at ${KEYSPACE}"
      exit 1
   else
      echo "[OK]Created ${KEYSPACE}"
   fi
   echo "==="
   sleep 3
done
#EOF
{noformat}

The timings we got on an otherwise idle system were inconsistent

{noformat}
real    0m42.649s
user    0m0.332s
sys     0m0.092s
[OK]Created t10789_0
===

real    1m22.211s
user    0m0.332s
sys     0m0.096s
[OK]Created t10789_1
===

real    2m45.907s
user    0m0.304s
sys     0m0.124s
[OK]Created t10789_2
===

real    3m24.098s
user    0m0.340s
sys     0m0.108s
[OK]Created t10789_3
===

real    2m38.930s
user    0m0.324s
sys     0m0.116s
[OK]Created t10789_4
===

real    3m4.186s
user    0m0.336s
sys     0m0.104s
[OK]Created t10789_5
===

real    2m55.391s
user    0m0.344s
sys     0m0.092s
[OK]Created t10789_6
===

real    2m14.290s
user    0m0.328s
sys     0m0.108s
[OK]Created t10789_7
===

real    2m44.880s
user    0m0.344s
sys     0m0.092s
[OK]Created t10789_8
===

real    1m52.785s
user    0m0.336s
sys     0m0.128s
[OK]Created t10789_9
===

real    1m18.404s
user    0m0.344s
sys     0m0.108s
[OK]Created t10789_10
===

real    2m20.681s
user    0m0.348s
sys     0m0.104s
[OK]Created t10789_11
===

real    1m11.860s
user    0m0.332s
sys     0m0.096s
[OK]Created t10789_12
===

real    1m37.887s
user    0m0.324s
sys     0m0.100s
[OK]Created t10789_13
===

real    1m31.616s
user    0m0.316s
sys     0m0.132s
[OK]Created t10789_14
===

real    1m12.103s
user    0m0.360s
sys     0m0.088s
[OK]Created t10789_15
===

real    0m36.378s
user    0m0.340s
sys     0m0.092s
[OK]Created t10789_16
===

real    0m40.883s
user    0m0.352s
sys     0m0.096s
[OK]Created t10789_17
===

real    0m40.661s
user    0m0.332s
sys     0m0.096s
[OK]Created t10789_18
===

real    0m44.943s
user    0m0.324s
sys     0m0.104s
[OK]Created t10789_19
===
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7407) COPY command does not work properly with collections causing failure to import data

2014-06-17 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-7407:


 Summary: COPY command does not work properly with collections 
causing failure to import data
 Key: CASSANDRA-7407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: cqlsh 4.1.1, 
Cassandra 2.0.7.31,
CQL spec 3.1.1,
Thrift protocol 19.39.0
Reporter: Jose Martinez Poblete


The COPY command does not properly format collections in the output CSV, so the 
data cannot be re-imported.

Here is how you can replicate the problem:

{noformat}
CREATE TABLE user_colors ( 
user_id int PRIMARY KEY, 
colors list<ascii> 
);

UPDATE user_colors SET colors = ['red','blue'] WHERE user_id=5; 
UPDATE user_colors SET colors = ['purple','yellow'] WHERE user_id=6; 
UPDATE user_colors SET colors = ['black''] WHERE user_id=7;

COPY user_colors (user_id, colors) TO 'output.csv';

CREATE TABLE user_colors2 ( 
user_id int PRIMARY KEY, 
colors list<ascii> 
);

COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
Bad Request: line 1:68 no viable alternative at input ']'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.007 seconds.
{noformat}

The CSV file seems to be malformed
- The single quotes within the collection are missing
- The double quotes for collection on user_id=7 are missing and causing COPY to 
fail.
{noformat}
5,[red, blue]
7,[black]
6,[purple, yellow]
{noformat}

Should be like this
{noformat}
5,['red', 'blue']
7,['black']
6,['purple', 'yellow']
{noformat}

Once the file is changed, the import works
{noformat}
COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
3 rows imported in 0.012 seconds.
SELECT * FROM user_colors2;

 user_id | colors
---------+------------------
       5 |      [red, blue]
       7 |          [black]
       6 | [purple, yellow]

(3 rows)
{noformat}
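For context on why the fixed file round-trips: a standards-compliant CSV writer double-quotes any field containing a comma, which keeps each list literal as a single field on re-import. A minimal sketch of that expected output format, using Python's csv module (an illustration, not cqlsh's actual COPY implementation):

```python
import csv
import io

# Hypothetical illustration of well-formed COPY TO output: each CQL list is
# rendered as a literal with its element quotes kept, and csv.writer adds the
# outer double quotes whenever the field contains a comma.
rows = [
    (5, ['red', 'blue']),
    (7, ['black']),
    (6, ['purple', 'yellow']),
]

buf = io.StringIO()
writer = csv.writer(buf)
for user_id, colors in rows:
    # Build the CQL list literal, preserving single quotes around each element.
    literal = "[%s]" % ", ".join("'%s'" % c for c in colors)
    writer.writerow([user_id, literal])

print(buf.getvalue())
# Round trip: csv.reader recovers each list literal as a single field.
parsed = list(csv.reader(io.StringIO(buf.getvalue())))
assert parsed[0] == ['5', "['red', 'blue']"]
```

Note that a single-element list contains no comma, so it comes out unquoted, which is still valid CSV.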




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7407) COPY command does not work properly with collections causing failure to import data

2014-06-17 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034024#comment-14034024
 ] 

Jose Martinez Poblete commented on CASSANDRA-7407:
--

Yes, the problem is that the COPY command is creating a malformed CSV file. 
That is what needs to be fixed. Sorry for the confusion!

 COPY command does not work properly with collections causing failure to 
 import data
 ---

 Key: CASSANDRA-7407
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7407
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: cqlsh 4.1.1, 
 Cassandra 2.0.7.31,
 CQL spec 3.1.1,
 Thrift protocol 19.39.0
Reporter: Jose Martinez Poblete
  Labels: patch



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7380) Native protocol needs keepalive, we should add it

2014-06-11 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-7380:


 Summary: Native protocol needs keepalive, we should add it
 Key: CASSANDRA-7380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7380
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 1.2, 2.0
Reporter: Jose Martinez Poblete


On clients connecting to C* 1.2.15 using the native protocol, we see that when 
the client is bounced, the old connection does not go away.

On Thrift there is the rpc_timeout, but there is no similar feature for the 
native protocol.
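What is being asked for amounts to TCP keepalive on the server's client sockets. A minimal socket-level sketch (plain Python for illustration, not Cassandra's actual transport code):

```python
import socket

# Sketch: enabling SO_KEEPALIVE makes the kernel periodically probe an idle
# peer, so a connection left behind by a bounced client is eventually detected
# as dead and torn down instead of lingering forever.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Verify the option actually took effect on this socket.
assert sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

The probe intervals themselves are kernel tunables (e.g. tcp_keepalive_time on Linux), which is why enabling the option server-side is enough to reap dead connections.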



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7380) Native protocol needs keepalive, we should add it

2014-06-11 Thread Jose Martinez Poblete (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Martinez Poblete updated CASSANDRA-7380:
-

Description: 
On clients connecting to C* 1.2.15 using the native protocol, we see that when 
the client is bounced, the old connection does not go away.

On Thrift there is the rpc_keepalive, but there is no similar feature for the 
native protocol.

  was:
On clients connecting to C* 1.2.15 using the native protocol, we see that when 
the client is bounced, the old connection does not go away.

On Thrift there is the rpc_timeout, but there is no similar feature for the 
native protocol.


 Native protocol needs keepalive, we should add it
 -

 Key: CASSANDRA-7380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7380
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 1.2, 2.0
Reporter: Jose Martinez Poblete

 On clients connecting to C* 1.2.15 using the native protocol, we see that when 
 the client is bounced, the old connection does not go away.
 On Thrift there is the rpc_keepalive, but there is no similar feature for the 
 native protocol.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7212) Allow to switch user within CQLSH session

2014-05-12 Thread Jose Martinez Poblete (JIRA)
Jose Martinez Poblete created CASSANDRA-7212:


 Summary: Allow to switch user within CQLSH session
 Key: CASSANDRA-7212
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7212
 Project: Cassandra
  Issue Type: Improvement
  Components: API
 Environment: [cqlsh 4.1.1 | Cassandra 2.0.7.31 | CQL spec 3.1.1 | 
Thrift protocol 19.39.0]
Reporter: Jose Martinez Poblete


Once a user is logged into CQLSH, it is not possible to switch to another user 
without exiting and relaunching.
This is a feature offered in postgres and probably other databases:

http://secure.encivasolutions.com/knowledgebase.php?action=displayarticle&id=1126
 

Perhaps this could be implemented on CQLSH as part of the USE directive:

USE Keyspace [USER] [password] 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7117) cqlsh should return a non-zero error code if a query fails

2014-04-30 Thread Jose Martinez Poblete (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985737#comment-13985737
 ] 

Jose Martinez Poblete commented on CASSANDRA-7117:
--

This is an illustration of the described behavior

{noformat}
ubuntu@ip-10-182-163-57:~$ echo "select count(*) from word limit 100;" > /tmp/bad.sql
ubuntu@ip-10-182-163-57:~$ cqlsh -k Keyspace1 -f /tmp/bad.sql
/tmp/bad.sql:2:Bad Request: unconfigured columnfamily word
ubuntu@ip-10-182-163-57:~$ echo $?
0
ubuntu@ip-10-182-163-57:~$ echo "select count(*) from words limit 100;" > /tmp/good.sql
ubuntu@ip-10-182-163-57:~$ cqlsh -k Keyspace1 -f /tmp/good.sql

 count

 650722

(1 rows)

ubuntu@ip-10-182-163-57:~$ echo $?
0
ubuntu@ip-10-182-163-57:~$ 
{noformat}
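Until cqlsh propagates failures through its exit status, a wrapper has to scan the output instead of trusting the return code. A hedged sketch of such a wrapper (the marker strings are heuristics and `echo` is a stand-in here; in practice the command would be `cqlsh -k Keyspace1 -f /tmp/bad.sql`):

```python
import subprocess

def cql_failed(output):
    """Heuristic: treat known cqlsh error markers in the output as failure."""
    markers = ("Bad Request:", "Traceback")
    return any(m in output for m in markers)

# `echo` stands in for cqlsh, replaying the error line from the session above.
result = subprocess.run(
    ["echo", "/tmp/bad.sql:2:Bad Request: unconfigured columnfamily word"],
    capture_output=True, text=True,
)
# The wrapper, not the exit code, reports the failure: prints True even though
# result.returncode is 0.
print(cql_failed(result.stdout))
```

A shell script could do the equivalent with `cqlsh ... 2>&1 | grep -q 'Bad Request:'` and branch on grep's exit status.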

 cqlsh should return a non-zero error code if a query fails
 --

 Key: CASSANDRA-7117
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7117
 Project: Cassandra
  Issue Type: Improvement
Reporter: J.B. Langston
Priority: Minor

 cqlsh should return a non-zero error code when a query in a file or piped 
 stdin fails, so that shell scripts can determine whether a CQL script 
 failed or succeeded.



--
This message was sent by Atlassian JIRA
(v6.2#6252)