Git Push Summary

2013-12-16 Thread slebresne
Updated Tags:  refs/tags/1.2.13-tentative [created] 4be9e6720


Git Push Summary

2013-12-16 Thread slebresne
Updated Tags:  refs/tags/1.2.13-tentative [deleted] 6ab82a469


[jira] [Commented] (CASSANDRA-6151) CqlPagingRecordReader Used when Partition Key Is Explicitly Stated

2013-12-16 Thread Akshay DM (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848964#comment-13848964
 ] 

Akshay DM commented on CASSANDRA-6151:
--

@Shridhar The patch seems to be working for 1.2.12 too. Thanks a lot... 

> CqlPagingRecordReader Used when Partition Key Is Explicitly Stated
> 
>
> Key: CASSANDRA-6151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Russell Alexander Spitzer
>Assignee: Alex Liu
>Priority: Minor
> Attachments: 6151-1.2-branch.txt, 6151-v2-1.2-branch.txt, 
> 6151-v3-1.2-branch.txt, 6151-v4-1.2.10-branch.txt
>
>
> From 
> http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546
> The user was attempting to load a single partition using a where clause in a 
> pig load statement. 
> CQL Table
> {code}
> CREATE table data (
>   occurday  text,
>   seqnumber int,
>   occurtimems bigint,
>   unique bigint,
>   fields map<text, text>,
>   primary key ((occurday, seqnumber), occurtimems, unique)
> )
> {code}
> Pig LOAD statement
> {code}
> data = LOAD 
> 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27'
>  USING CqlStorage();
> {code}
> This results in an exception when processed by the CqlPagingRecordReader, 
> which attempts to page this query even though it targets at most one 
> partition. This leads to an invalid CQL statement. 
> CqlPagingRecordReader Query
> {code}
> SELECT * FROM "data" WHERE token("occurday","seqnumber") > ? AND
> token("occurday","seqnumber") <= ? AND occurday='A Great Day' 
> AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
> {code}
> Exception
> {code}
>  InvalidRequestException(why:occurday cannot be restricted by more than one 
> relation if it includes an Equal)
> {code}
> I'm not sure it is worth the special case, but a modification to not use the 
> paging record reader when the entire partition key is specified would solve 
> this issue. 
> h3. Solution
>  If there are EQUAL clauses for all the partition key columns, we use the query 
> {code}
>   SELECT * FROM "data" 
>   WHERE occurday='A Great Day' 
>AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
> {code}
> instead of 
> {code}
>   SELECT * FROM "data" 
>   WHERE token("occurday","seqnumber") > ? 
>AND token("occurday","seqnumber") <= ? 
>AND occurday='A Great Day' 
>AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
> {code}
> The baseline implementation retrieves the data of all rows around the 
> ring. This new feature retrieves all the data of a single wide row, one level 
> below the baseline. It helps the use case where the user is only interested 
> in a specific wide row, so the job doesn't have to retrieve every row around 
> the ring.





git commit: Slightly improved message when parsing properties for DDL queries

2013-12-16 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.2 4be9e6720 -> 54a1955d2


Slightly improved message when parsing properties for DDL queries

patch by boneill42; reviewed by slebresne for CASSANDRA-6453


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54a1955d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54a1955d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54a1955d

Branch: refs/heads/cassandra-1.2
Commit: 54a1955d254bfc89e48389d5d0d94c79d027d470
Parents: 4be9e67
Author: Sylvain Lebresne 
Authored: Mon Dec 16 10:53:22 2013 +0100
Committer: Sylvain Lebresne 
Committed: Mon Dec 16 10:53:22 2013 +0100

--
 CHANGES.txt  |  3 +++
 src/java/org/apache/cassandra/cql3/Cql.g | 10 --
 2 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/54a1955d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index b55393b..4816d70 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,3 +1,6 @@
+1.2.14
+ * Improved error message on bad properties in DDL queries (CASSANDRA-6453)
+
 1.2.13
  * Randomize batchlog candidates selection (CASSANDRA-6481)
  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54a1955d/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g 
b/src/java/org/apache/cassandra/cql3/Cql.g
index 7101c71..ea6844f 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -93,12 +93,18 @@ options {
 
         if (!(entry.left instanceof Constants.Literal))
         {
-            addRecognitionError("Invalid property name: " + entry.left);
+            String msg = "Invalid property name: " + entry.left;
+            if (entry.left instanceof AbstractMarker.Raw)
+                msg += " (bind variables are not supported in DDL queries)";
+            addRecognitionError(msg);
             break;
         }
         if (!(entry.right instanceof Constants.Literal))
         {
-            addRecognitionError("Invalid property value: " + entry.right);
+            String msg = "Invalid property value: " + entry.right + " for property: " + entry.left;
+            if (entry.right instanceof AbstractMarker.Raw)
+                msg += " (bind variables are not supported in DDL queries)";
+            addRecognitionError(msg);
             break;
         }



[jira] [Resolved] (CASSANDRA-6453) Improve error message for invalid property values during parsing.

2013-12-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-6453.
-

Resolution: Fixed
  Reviewer: Sylvain Lebresne

I understand that Brian's underlying problem was that he wants prepared 
statements for DDL queries, which we indeed don't support.

But pragmatically, as far as this ticket's description and patch go, I don't 
see the harm in committing the error message improvement. It is nicer to 
include the name of the property for which the value is invalid, regardless of 
the bind marker problem. Besides, Brian's confusion suggests that an error 
message that explicitly indicates that bind markers are not supported would 
help too. So I committed the patch with a slight specialization for the case 
of bind markers.


> Improve error message for invalid property values during parsing.
> -
>
> Key: CASSANDRA-6453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6453
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Brian ONeill
>Priority: Trivial
> Attachments: CASSANDRA-6354-patch.txt
>
>
> Trivial change to the error message returned for invalid property values.
> Previously, it would just say "Invalid property value : ?".  If you were 
> constructing a large prepared statement, with multiple question marks, it was 
> difficult to track down which one the server was complaining about.  This 
> enhancement tells you which one. =)





[jira] [Reopened] (CASSANDRA-6453) Improve error message for invalid property values during parsing.

2013-12-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reopened CASSANDRA-6453:
-


> Improve error message for invalid property values during parsing.
> -
>
> Key: CASSANDRA-6453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6453
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Brian ONeill
>Priority: Trivial
> Attachments: CASSANDRA-6354-patch.txt
>
>
> Trivial change to the error message returned for invalid property values.
> Previously, it would just say "Invalid property value : ?".  If you were 
> constructing a large prepared statement, with multiple question marks, it was 
> difficult to track down which one the server was complaining about.  This 
> enhancement tells you which one. =)





[jira] [Updated] (CASSANDRA-6453) Improve error message for invalid property values during parsing.

2013-12-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6453:


Fix Version/s: 1.2.14

> Improve error message for invalid property values during parsing.
> -
>
> Key: CASSANDRA-6453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6453
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Brian ONeill
>Priority: Trivial
> Fix For: 1.2.14
>
> Attachments: CASSANDRA-6354-patch.txt
>
>
> Trivial change to the error message returned for invalid property values.
> Previously, it would just say "Invalid property value : ?".  If you were 
> constructing a large prepared statement, with multiple question marks, it was 
> difficult to track down which one the server was complaining about.  This 
> enhancement tells you which one. =)





[jira] [Resolved] (CASSANDRA-6490) Please delete old releases from mirroring system

2013-12-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-6490.
-

Resolution: Fixed

Done ([~urandom], can you check the debian/dists/ directory and delete the 06x 
and 07x directories? I don't seem to have the right to do so and they don't 
point at anything existing anymore).

> Please delete old releases from mirroring system
> 
>
> Key: CASSANDRA-6490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
> Project: Cassandra
>  Issue Type: Bug
> Environment: http://www.apache.org/dist/cassandra/
>Reporter: Sebb
>Assignee: Sylvain Lebresne
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> Thanks!
> [Note that older releases are always available from the ASF archive server]
> Any links to older releases on download pages should first be adjusted to 
> point to the archive server.
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Created] (CASSANDRA-6491) Timeouts can send confusing information as to what their cause is

2013-12-16 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-6491:
---

 Summary: Timeouts can send confusing information as to what their 
cause is
 Key: CASSANDRA-6491
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6491
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial


We can race between the time we "detect" a timeout and the time we build the 
actual exception, so that it's possible to have a timeout exception that 
pretends enough replicas have actually acknowledged the operation, which is 
thus slightly confusing to the user as to why they got a timeout.

That kind of race is rather unlikely in a healthy environment, but 
https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least 
possible to trigger in a test environment.

Note that it's definitely not worth synchronizing to avoid that, but it could 
be simple enough to detect the race when building the exception and "correct" 
the ack count. Attaching a simple patch to show what I have in mind.

Note that I don't entirely disagree that it's not "perfect", but as said 
above, proper synchronization is just not worth it and it seems to me that 
it's not worth confusing users over this.





[jira] [Updated] (CASSANDRA-6491) Timeouts can send confusing information as to what their cause is

2013-12-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6491:


Attachment: 6491.txt

> Timeouts can send confusing information as to what their cause is
> 
>
> Key: CASSANDRA-6491
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6491
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.2.14
>
> Attachments: 6491.txt
>
>
> We can race between the time we "detect" a timeout and the time we build the 
> actual exception, so that it's possible to have a timeout exception that 
> pretends enough replicas have actually acknowledged the operation, which is 
> thus slightly confusing to the user as to why they got a timeout.
> That kind of race is rather unlikely in a healthy environment, but 
> https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least 
> possible to trigger in a test environment.
> Note that it's definitely not worth synchronizing to avoid that, but it could 
> be simple enough to detect the race when building the exception and "correct" 
> the ack count. Attaching a simple patch to show what I have in mind.
> Note that I don't entirely disagree that it's not "perfect", but as said 
> above, proper synchronization is just not worth it and it seems to me that 
> it's not worth confusing users over this.





[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system

2013-12-16 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849187#comment-13849187
 ] 

Sebb commented on CASSANDRA-6490:
-

There is a problem with the directory protections:

drwxrwxr-x  3 eevans eevans 6 Aug 20  2012 06x
drwxrwxr-x  3 eevans eevans 6 Aug 20  2012 07x
drwxr-xr-x  3 slebresne  cassandra  6 May 27  2013 11x
drwxr-xr-x  3 slebresne  cassandra  6 Nov 25 08:11 12x
drwxr-xr-x  3 slebresne  cassandra  6 Nov 25 08:40 20x
drwxrwxr-x  3 apbackup   cassandra  6 Sep 10  2012 sid
drwxrwxr-x  3 eevans eevans 6 Aug 20  2012 unstable

Only 'sid' above is correct.

The file group should be cassandra, and files should be group-writable; 
otherwise only the owner can change things, which is awkward when that 
individual is temporarily unavailable.

However, please note that Infra is moving towards all projects using svnpubsub 
[1] for releases, which avoids all such issues. I suggest you file an Infra 
request now so you are ready for the next release.

[1] http://www.apache.org/dev/release-publishing.html#distribution_dist

> Please delete old releases from mirroring system
> 
>
> Key: CASSANDRA-6490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
> Project: Cassandra
>  Issue Type: Bug
> Environment: http://www.apache.org/dist/cassandra/
>Reporter: Sebb
>Assignee: Sylvain Lebresne
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> Thanks!
> [Note that older releases are always available from the ASF archive server]
> Any links to older releases on download pages should first be adjusted to 
> point to the archive server.
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Commented] (CASSANDRA-6491) Timeouts can send confusing information as to what their cause is

2013-12-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849193#comment-13849193
 ] 

Jonathan Ellis commented on CASSANDRA-6491:
---

+1

> Timeouts can send confusing information as to what their cause is
> 
>
> Key: CASSANDRA-6491
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6491
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.2.14
>
> Attachments: 6491.txt
>
>
> We can race between the time we "detect" a timeout and the time we build the 
> actual exception, so that it's possible to have a timeout exception that 
> pretends enough replicas have actually acknowledged the operation, which is 
> thus slightly confusing to the user as to why they got a timeout.
> That kind of race is rather unlikely in a healthy environment, but 
> https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least 
> possible to trigger in a test environment.
> Note that it's definitely not worth synchronizing to avoid that, but it could 
> be simple enough to detect the race when building the exception and "correct" 
> the ack count. Attaching a simple patch to show what I have in mind.
> Note that I don't entirely disagree that it's not "perfect", but as said 
> above, proper synchronization is just not worth it and it seems to me that 
> it's not worth confusing users over this.





[jira] [Updated] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-16 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-6487:
--

Attachment: 6487_trunk.patch

> Log WARN on large batch sizes
> -
>
> Key: CASSANDRA-6487
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Patrick McFadin
>Assignee: Lyuben Todorov
>Priority: Minor
> Attachments: 6487_trunk.patch
>
>
> Large batches on a coordinator can cause a lot of node stress. I propose 
> adding a WARN log entry if batch sizes go beyond a configurable size. This 
> will give more visibility to operators on something that can happen on the 
> developer side. 
> New yaml setting with 5k default.
> {{# Log WARN on any batch size exceeding this value. 5k by default.}}
> {{# Caution should be taken on increasing the size of this threshold as it 
> can lead to node instability.}}
> {{batch_size_warn_threshold: 5k}}
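A minimal sketch of the proposed coordinator-side check (the names are hypothetical; the real implementation is in the attached 6487_trunk.patch): sum the size of the batch's mutations and log a WARN when it exceeds the configured threshold.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class BatchSizeWarning
{
    private static final Logger logger = LoggerFactory.getLogger(BatchSizeWarning.class);

    // batch_size_warn_threshold: 5k default, per the yaml snippet above
    static final long WARN_THRESHOLD_BYTES = 5 * 1024;

    static void maybeWarn(String keyspace, long batchSizeBytes)
    {
        if (batchSizeBytes > WARN_THRESHOLD_BYTES)
            logger.warn("Batch for keyspace {} is {} bytes, exceeding the warn threshold of {} bytes",
                        keyspace, batchSizeBytes, WARN_THRESHOLD_BYTES);
    }
}
{code}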





[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system

2013-12-16 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849236#comment-13849236
 ] 

Eric Evans commented on CASSANDRA-6490:
---

bq. Done (Eric Evans, can you check the debian/dists/ directory and delete the 
06x and 07x directories? I don't seem to have the right to do so and they don't 
point at anything existing anymore).

Done.

> Please delete old releases from mirroring system
> 
>
> Key: CASSANDRA-6490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
> Project: Cassandra
>  Issue Type: Bug
> Environment: http://www.apache.org/dist/cassandra/
>Reporter: Sebb
>Assignee: Sylvain Lebresne
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> Thanks!
> [Note that older releases are always available from the ASF archive server]
> Any links to older releases on download pages should first be adjusted to 
> point to the archive server.
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Commented] (CASSANDRA-6453) Improve error message for invalid property values during parsing.

2013-12-16 Thread Brian ONeill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849243#comment-13849243
 ] 

Brian ONeill commented on CASSANDRA-6453:
-

[~slebresne] Agreed, +1. A five-minute change to the code might save people 
hours of time.
Thanks.

> Improve error message for invalid property values during parsing.
> -
>
> Key: CASSANDRA-6453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6453
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Brian ONeill
>Priority: Trivial
> Fix For: 1.2.14
>
> Attachments: CASSANDRA-6354-patch.txt
>
>
> Trivial change to the error message returned for invalid property values.
> Previously, it would just say "Invalid property value : ?".  If you were 
> constructing a large prepared statement, with multiple question marks, it was 
> difficult to track down which one the server was complaining about.  This 
> enhancement tells you which one. =)





[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system

2013-12-16 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849252#comment-13849252
 ] 

Sebb commented on CASSANDRA-6490:
-

There's still a problem with some of the protections:

drwxr-xr-x  3 slebresne  cassandra  6 May 27  2013 11x
drwxr-xr-x  3 slebresne  cassandra  6 Nov 25 08:11 12x
drwxr-xr-x  3 slebresne  cassandra  6 Nov 25 08:40 20x

These should be changed (by slebresne) to allow group write.

> Please delete old releases from mirroring system
> 
>
> Key: CASSANDRA-6490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
> Project: Cassandra
>  Issue Type: Bug
> Environment: http://www.apache.org/dist/cassandra/
>Reporter: Sebb
>Assignee: Sylvain Lebresne
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> Thanks!
> [Note that older releases are always available from the ASF archive server]
> Any links to older releases on download pages should first be adjusted to 
> point to the archive server.
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Updated] (CASSANDRA-6378) sstableloader does not support client encryption on Cassandra 2.0

2013-12-16 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-6378:
---

Attachment: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch

> sstableloader does not support client encryption on Cassandra 2.0
> -
>
> Key: CASSANDRA-6378
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6378
> Project: Cassandra
>  Issue Type: Bug
>Reporter: David Laube
>  Labels: client, encryption, ssl, sstableloader
> Fix For: 2.0.4
>
> Attachments: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch
>
>
> We have been testing backup/restore from one ring to another and we recently 
> stumbled upon an issue with sstableloader. When client_enc_enable is set to 
> true, the exception below is generated. However, when client_enc_enable is set 
> to false, sstableloader gets to the point where it discovers endpoints, 
> connects to stream data, etc.
> ==BEGIN EXCEPTION==
> sstableloader --debug -d x.x.x.248,x.x.x.108,x.x.x.113 
> /tmp/import/keyspace_name/columnfamily_name
> Exception in thread "main" java.lang.RuntimeException: Could not retrieve 
> endpoint ranges:
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:226)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:68)
> Caused by: org.apache.thrift.transport.TTransportException: Frame size 
> (352518400) larger than max length (16384000)!
> at 
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
> at 
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
> at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_describe_partitioner(Cassandra.java:1292)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.describe_partitioner(Cassandra.java:1280)
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:199)
> ... 2 more
> ==END EXCEPTION==
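A hedged note on the failure mode, with a sketch: when client_enc_enable is true the server's Thrift port speaks TLS, so a plain TFramedTransport reads handshake bytes as a frame length, which is exactly the "Frame size (352518400) larger than max length (16384000)" error above. The attached patch presumably opens the connection over SSL instead; the sketch below uses libthrift's public TSSLTransportFactory API, and the truststore path and password are placeholder assumptions, not values from the patch.

{code}
import org.apache.thrift.transport.TSSLTransportFactory;
import org.apache.thrift.transport.TSSLTransportFactory.TSSLTransportParameters;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;

final class SslThriftConnection
{
    static TTransport open(String host, int port) throws TTransportException
    {
        TSSLTransportParameters params = new TSSLTransportParameters();
        // Placeholder truststore; a real loader would take these from its options.
        params.setTrustStore("/path/to/truststore.jks", "truststore-password");
        // Returns an already-open, SSL-wrapped socket.
        return TSSLTransportFactory.getClientSocket(host, port, 10000, params);
    }
}
{code}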





[jira] [Assigned] (CASSANDRA-6378) sstableloader does not support client encryption on Cassandra 2.0

2013-12-16 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-6378:
--

Assignee: Sam Tunnicliffe

> sstableloader does not support client encryption on Cassandra 2.0
> -
>
> Key: CASSANDRA-6378
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6378
> Project: Cassandra
>  Issue Type: Bug
>Reporter: David Laube
>Assignee: Sam Tunnicliffe
>  Labels: client, encryption, ssl, sstableloader
> Fix For: 2.0.4
>
> Attachments: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch
>
>
> We have been testing backup/restore from one ring to another and we recently 
> stumbled upon an issue with sstableloader. When client_enc_enable is set to 
> true, the exception below is generated. However, when client_enc_enable is set 
> to false, sstableloader gets to the point where it discovers endpoints, 
> connects to stream data, etc.
> ==BEGIN EXCEPTION==
> sstableloader --debug -d x.x.x.248,x.x.x.108,x.x.x.113 
> /tmp/import/keyspace_name/columnfamily_name
> Exception in thread "main" java.lang.RuntimeException: Could not retrieve 
> endpoint ranges:
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:226)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:68)
> Caused by: org.apache.thrift.transport.TTransportException: Frame size 
> (352518400) larger than max length (16384000)!
> at 
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
> at 
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
> at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_describe_partitioner(Cassandra.java:1292)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.describe_partitioner(Cassandra.java:1280)
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:199)
> ... 2 more
> ==END EXCEPTION==





[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849261#comment-13849261
 ] 

Michael Shuler commented on CASSANDRA-6488:
---

This introduced a failure in BootStrapperTest:

{code}
test:
 [echo] running unit tests
[mkdir] Created dir: /home/mshuler/git/cassandra/build/test/cassandra
[mkdir] Created dir: /home/mshuler/git/cassandra/build/test/output
[junit] WARNING: multiple versions of ant detected in path for junit 
[junit]  
jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit]  and 
jar:file:/home/mshuler/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Testsuite: org.apache.cassandra.dht.BootStrapperTest
[junit] Tests run: 4, Failures: 1, Errors: 0, Time elapsed: 6.177 sec
[junit] 
[junit] - Standard Error -
[junit]  WARN 09:47:46,135 No host ID found, created 
9019bb70-4d6e-4cf6-b730-140ff5ae4be5 (Note: This should happen exactly once per 
node).
[junit]  WARN 09:47:46,262 Generated random token 
[d9180feb2e806704effa4024e8f4c631]. Random tokens will result in an unbalanced 
ring; see http://wiki.apache.org/cassandra/Operations
[junit] -  ---
[junit] Testcase: 
testSourceTargetComputation(org.apache.cassandra.dht.BootStrapperTest):   FAILED
[junit] expected:<1> but was:<0>
[junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
[junit] at 
org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:212)
[junit] at 
org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:173)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.dht.BootStrapperTest FAILED

BUILD FAILED
/home/mshuler/git/cassandra/build.xml:1113: The following error occurred while 
executing this line:
/home/mshuler/git/cassandra/build.xml:1078: Some unit test(s) failed.

Total time: 9 seconds
((4be9e67...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect bad
4be9e6720d9f94a83aa42153c3e71ae1e557d2d9 is the first bad commit
commit 4be9e6720d9f94a83aa42153c3e71ae1e557d2d9
Author: Aleksey Yeschenko 
Date:   Sun Dec 15 13:29:56 2013 +0300

Improve batchlog write performance with vnodes

patch by Jonathan Ellis and Rick Branson; reviewed by Aleksey Yeschenko
for CASSANDRA-6488

:100644 100644 e5865925f160faabc2506c3a5aac9985c17c1658 
b55393b2ed138011bab52f95f2e9b52107709938 M  CHANGES.txt
:04 04 dea10aa8044e10eb60002e75f2586a9c8e94b647 
7030c09f9713bd3e342e4e012c59b09c86b79a42 M  src
{code}

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.
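A minimal sketch of the caching workaround described above (all names are hypothetical; Rick's actual patch is attached to the ticket): clone the token-only ring view once per ring version and rebuild it only when the ring changes, instead of cloning on every batchlog write.

{code}
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class CachedRingView<V>
{
    private static final class Versioned<V>
    {
        final long ringVersion;
        final V view;
        Versioned(long ringVersion, V view) { this.ringVersion = ringVersion; this.view = view; }
    }

    private final AtomicReference<Versioned<V>> cache = new AtomicReference<>();

    /** Returns the cached view, recomputing only when the ring version moved. */
    V get(long currentRingVersion, Supplier<V> expensiveClone)
    {
        Versioned<V> cached = cache.get();
        if (cached != null && cached.ringVersion == currentRingVersion)
            return cached.view;                 // cheap common path
        V fresh = expensiveClone.get();         // e.g. cloneOnlyTokenMap()
        cache.set(new Versioned<>(currentRingVersion, fresh));
        return fresh;
    }
}
{code}

A racy refresh here only costs an extra clone; correctness hinges on the ring version changing whenever the token metadata does.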





[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849273#comment-13849273
 ] 

Michael Shuler commented on CASSANDRA-6488:
---

I should mention, since I didn't above, that I'm working on the cassandra-2.0 
branch. Around the same time, LeaveAndBootstrapTest, MoveTest, and RelocateTest 
also started failing; I'm looking at those
- http://cassci.datastax.com/job/cassandra-2.0_test/49/console


> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.





[jira] [Commented] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849275#comment-13849275
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6485:
--

Patch worked on my test. 

> NPE in calculateNaturalEndpoints
> 
>
> Key: CASSANDRA-6485
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Russell Alexander Spitzer
>Assignee: Jonathan Ellis
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6485.txt
>
>
> I was running a test where I added a new data center to an existing cluster. 
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> Although there are no issues with smaller clusters or clusters without 
> vnodes, larger setups with vnodes seem to consistently see the following 
> exception in the logs as well as a write operation failing for each 
> exception. Usually this happens between 1-8 times during an experiment. 
> The exceptions/failures are occurring when DC2 is brought online but *before* 
> any alteration of the Keyspace. All of the exceptions are happening on DC1 
> nodes. One of the exceptions occurred on a seed node though this doesn't seem 
> to be the case most of the time. 
> While the test was running, nodetool was run every second to get cluster 
> status. At no time did any nodes report themselves as down. 
> {code}
> system_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
> CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
> message.
> system_logs-107.21.186.208/system.log:java.lang.NullPointerException
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
> system_logs-107.21.186.208/system.log-at 
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> system_logs-107.21.186.208/system.log-at 
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> system_logs-107.21.186.208/system.log-at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
> system_logs-107.21.186.208/system.log-at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> system_logs-107.21.186.208/system.log-at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> system_logs-107.21.186.208/system.log-at 
> java.lang.Thread.run(Thread.java:724)
> {code}





[jira] [Reopened] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reopened CASSANDRA-6488:
--


So, the caching part. [~jbellis], can you have a look? If not, I will later, 
but it's potentially 1.2.13 vote-affecting.

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.





[jira] [Created] (CASSANDRA-6492) Have server pick query page size by default

2013-12-16 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-6492:
-

 Summary: Have server pick query page size by default
 Key: CASSANDRA-6492
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6492
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor


We're almost always going to do a better job picking a page size based on 
sstable stats than users will by guesstimating.
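A hedged sketch of the idea (all names and numbers below are assumptions; the ticket only states the goal): derive the page size from a target response size and the mean row size reported by sstable statistics, clamped to sane bounds.

{code}
final class AutoPageSize
{
    static final int MIN_PAGE_ROWS = 10;
    static final int MAX_PAGE_ROWS = 10_000;
    static final long TARGET_PAGE_BYTES = 256 * 1024;

    static int pick(long meanRowSizeBytes)
    {
        if (meanRowSizeBytes <= 0)
            return 100; // no stats to go on; fall back to a modest default
        long rows = TARGET_PAGE_BYTES / meanRowSizeBytes;
        return (int) Math.max(MIN_PAGE_ROWS, Math.min(MAX_PAGE_ROWS, rows));
    }
}
{code}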





[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849291#comment-13849291
 ] 

Michael Shuler commented on CASSANDRA-6488:
---

Commit bb09d3c fully passed all the unit tests in the cassandra-2.0 branch.
- http://cassci.datastax.com/job/cassandra-2.0_test/47/console

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.





[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system

2013-12-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849300#comment-13849300
 ] 

Sylvain Lebresne commented on CASSANDRA-6490:
-

Right, right, fixed.

> Please delete old releases from mirroring system
> 
>
> Key: CASSANDRA-6490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
> Project: Cassandra
>  Issue Type: Bug
> Environment: http://www.apache.org/dist/cassandra/
>Reporter: Sebb
>Assignee: Sylvain Lebresne
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> Thanks!
> [Note that older releases are always available from the ASF archive server]
> Any links to older releases on download pages should first be adjusted to 
> point to the archive server.
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849312#comment-13849312
 ] 

Michael Shuler commented on CASSANDRA-6488:
---

Those same tests look like new failures with this commit in the cassandra-1.2 
branch as well:
- http://cassci.datastax.com/job/cassandra-1.2_test/32/console
vs.
- http://cassci.datastax.com/job/cassandra-1.2_test/33/console

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.





[jira] [Comment Edited] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849312#comment-13849312
 ] 

Michael Shuler edited comment on CASSANDRA-6488 at 12/16/13 5:02 PM:
-

Those same tests look like new failures with this commit in the cassandra-1.2 
branch as well:
- http://cassci.datastax.com/job/cassandra-1.2_test/32/console
vs.
- http://cassci.datastax.com/job/cassandra-1.2_test/33/console

(edit for clarity) New unit test failures in c-2.0 and c-1.2 branches with this 
commit:
- BootStrapperTest
- LeaveAndBootstrapTest
- MoveTest
- RelocateTest


was (Author: mshuler):
Those same tests look like new failures with this commit in the cassandra-1.2 
branch as well:
- http://cassci.datastax.com/job/cassandra-1.2_test/32/console
vs.
- http://cassci.datastax.com/job/cassandra-1.2_test/33/console

> Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
> -
>
> Key: CASSANDRA-6488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Rick Branson
>Assignee: Rick Branson
> Fix For: 1.2.13, 2.0.4
>
> Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph 
> (21).png
>
>
> The cloneOnlyTokenMap call in StorageProxy.getBatchlogEndpoints causes 
> enormous amounts of CPU to be consumed on clusters with many vnodes. I 
> created a patch to cache this data as a workaround and deployed it to a 
> production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
> highlights the overall issues with cloneOnlyTokenMap() calls on vnode 
> clusters. I'm including the maybe-not-the-best-quality workaround patch to 
> use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
> it's called should probably be investigated.





[jira] [Created] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)
Russell Alexander Spitzer created CASSANDRA-6493:


 Summary: Exceptions when a second Datacenter is Added
 Key: CASSANDRA-6493
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu, EC2 M1.large
Reporter: Russell Alexander Spitzer


On adding a second datacenter, several exceptions were raised.

Test outline:
Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

At the point when the second datacenter is added, several warnings go off 
because nodetool status is not functioning, and a few moments later the start 
operation reports a failure because a node has not successfully started. 

The first start attempt yielded the following exception on a node in the second 
DC.

{code}
CassandraDaemon.java (line 464) Exception encountered during startup
java.lang.AssertionError: -7560216458456714666 not found in 
-9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
9218575851928340117, 9219681798686280387
at 
org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
at 
org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
{code}

The test automatically tries to restart nodes if they fail during startup. The 
second attempt for this node succeeded, but a 'nodetool status' still failed and 
a different node in the second DC logged the following and failed to start up.

{code}
ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) Exception 
encountered during startup
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
at 
org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
at 
org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
at 
org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
ERROR [StorageServiceShutdownHook] 2013-12-16 18:02:04,876 CassandraDaemon.java 
(line 191) Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:358)
at 
org.apache.cassandra.service.StorageService.shutdownCli
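The ConcurrentModificationException above comes out of StringUtils.join iterating TokenMetadata's TreeMap-backed token set while another thread updates the ring. A self-contained sketch of that failure mode (not Cassandra code):

{code}
import java.util.TreeMap;

public final class TokenJoinRace
{
    public static void main(String[] args) throws InterruptedException
    {
        TreeMap<Long, String> tokens = new TreeMap<>();
        for (long i = 0; i < 100_000; i++)
            tokens.put(i, "node1");

        // Simulates a concurrent ring update, e.g. the second DC joining.
        Thread mutator = new Thread(() -> {
            for (long i = 1; i <= 100_000; i++)
                tokens.put(-i, "node2");
        });
        mutator.start();

        // Iterating keySet() (as StringUtils.join does) races the mutator and
        // will typically throw ConcurrentModificationException.
        StringBuilder joined = new StringBuilder();
        for (Long token : tokens.keySet())
            joined.append(token).append(", ");

        mutator.join();
        System.out.println("joined " + joined.length() + " chars without a CME");
    }
}
{code}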

[jira] [Updated] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6493:
-

Description: 
On adding a second datacenter, several exceptions were raised.

Test outline:
Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

At the point when the second datacenter is added, several warnings go off 
because nodetool status is not functioning, and a few moments later the start 
operation reports a failure because a node has not successfully started. 

The first start attempt yielded the following exception on a node in the second 
DC.

{code}
CassandraDaemon.java (line 464) Exception encountered during startup
java.lang.AssertionError: -7560216458456714666 not found in 
-9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
9218575851928340117, 9219681798686280387
at 
org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
at 
org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
{code}

The test automatically tries to restart nodes if they fail during startup. The 
second attempt for this node succeeded, but a 'nodetool status' still failed and 
a different node in the second DC logged the following and failed to start up.

{code}
ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) Exception 
encountered during startup
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
at 
org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
at 
org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
at 
org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
at 
org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
ERROR [StorageServiceShutdownHook] 2013-12-16 18:02:04,876 CassandraDaemon.java 
(line 191) Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:358)
at 
org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:373)
at 
org.apache.cassandra.service.StorageService.access$000(StorageService.java:89)
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(Stor

[jira] [Updated] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6493:
-

Reproduced In: 1.2.13

> Exceptions when a second Datacenter is Added
> 
>
> Key: CASSANDRA-6493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, EC2 M1.large
>Reporter: Russell Alexander Spitzer
>
> On adding a second datacenter, several exceptions were raised.
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> At the point when the second datacenter is added, several warnings go off 
> because nodetool status is not functioning, and a few moments later the start 
> operation reports a failure because a node has not successfully started. 
> The first start attempt yielded the following exception on a node in the 
> second DC.
> {code}
> CassandraDaemon.java (line 464) Exception encountered during startup
> java.lang.AssertionError: -7560216458456714666 not found in 
> -9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
> 9218575851928340117, 9219681798686280387
> at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
> at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
> at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
> at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
> at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> {code}
> The test automatically tries to restart nodes if they fail during startup. 
> The second attempt for this node succeeded, but a 'nodetool status' still 
> failed and a different node in the second DC logged the following and failed 
> to start up.
> {code}
> ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
>   at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
>   at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
>   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
>   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
>   at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
>   at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
>   at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> ERROR [StorageServiceShutdownHook] 20

[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849448#comment-13849448
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6493:
--

https://cassci.datastax.com/job/cassandra-addremovedc/25/console

The "Node down Detected" are messages from a thread which runs nodetool status 
every ~2 seconds and counts how many nodes report themselves as up, the lack of 
a command line output shows the command failed. 
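
For reference, a minimal sketch of what that watcher does (hypothetical code; 
the real harness, its host list, and its output parsing are assumptions, not 
the actual test source):

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class NodeUpWatcher
{
    public static void main(String[] args) throws Exception
    {
        while (true)
        {
            Process p = new ProcessBuilder("nodetool", "status").redirectErrorStream(true).start();
            int up = 0, lines = 0;
            BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
            for (String line; (line = out.readLine()) != null; lines++)
                if (line.startsWith("UN"))  // Up/Normal rows in nodetool status output
                    up++;
            if (p.waitFor() != 0 || lines == 0)
                System.err.println("Node down Detected (nodetool produced no output)");
            else
                System.out.println(up + " nodes report themselves as up");
            Thread.sleep(2000);  // poll every ~2 seconds
        }
    }
}
{code}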

> Exceptions when a second Datacenter is Added
> 
>
> Key: CASSANDRA-6493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, EC2 M1.large
>Reporter: Russell Alexander Spitzer
>
> On adding a second datacenter several exceptions were raised.
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> At the point when the second datacenter is added, several warnings go off 
> because nodetool status is not functioning, and a few moments later the 
> start operation reports a failure because a node has not successfully 
> started. 
> The first start attempt yielded the following exception on a node in the 
> second DC.
> {code}
> CassandraDaemon.java (line 464) Exception encountered during startup
> java.lang.AssertionError: -7560216458456714666 not found in 
> -9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
> 9218575851928340117, 9219681798686280387
> at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
> at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
> at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
> at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
> at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> {code}
> The test automatically tries to restart nodes if they fail during startup. 
> The second attempt for this node succeeded, but a 'nodetool status' still 
> failed and a different node in the second DC logged the following and failed 
> to start up.
> {code}
> ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
>   at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
>   at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
>   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
>   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
>   at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
>   at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
>   at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:

[jira] [Updated] (CASSANDRA-4687) Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)

2013-12-16 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-4687:
---

Attachment: apache-cassandra-1.2.13-SNAPSHOT.jar
guava-backed-cache.patch

This is the initial patch, which uses a Guava Cache as the cache storage. The 
only thing that is not functional right now is setCapacity, but that isn't 
crucial for figuring out whether this approach helps to fix the situation.
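
For readers without the patch handy, the general shape of a Guava-backed cache 
is roughly the following (a minimal sketch of the approach only, not the 
attached patch; the key/value types and the weigher are assumptions):

{code}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

public class GuavaBackedCache<K, V>
{
    private final Cache<K, V> cache;

    public GuavaBackedCache(long capacityInBytes, Weigher<K, V> weigher)
    {
        // Eviction is by total weight (bytes), computed per entry by the weigher.
        this.cache = CacheBuilder.newBuilder()
                                 .maximumWeight(capacityInBytes)
                                 .weigher(weigher)
                                 .build();
    }

    public V get(K key)          { return cache.getIfPresent(key); }
    public void put(K key, V v)  { cache.put(key, v); }
    public void remove(K key)    { cache.invalidate(key); }
    public long size()           { return cache.size(); }

    // Note: maximumWeight is fixed at construction time, which is why a
    // dynamic setCapacity() is awkward to support on top of Guava's Cache.
}
{code}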

> Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)
> ---
>
> Key: CASSANDRA-4687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4687
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: CentOS 6.3 64-bit, Oracle JRE 1.6.0.33 64-bit, single 
> node cluster
>Reporter: Leonid Shalupov
>Priority: Minor
> Attachments: 4687-debugging.txt, 
> apache-cassandra-1.2.13-SNAPSHOT.jar, guava-backed-cache.patch
>
>
> Under heavy write load, Cassandra sometimes fails with an assertion error.
> git bisect leads to commit 295aedb278e7a495213241b66bc46d763fd4ce66.
> It works fine if the global key/row caches are disabled in code.
> {quote}
> java.lang.AssertionError: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk) in 
> /var/lib/cassandra/data/...-he-1-Data.db
>   at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:60)
>   at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
>   at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
>   at 
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
>   at 
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
>   at org.apache.cassandra.db.Table.getRow(Table.java:378)
>   at 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:819)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1253)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing

2013-12-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849672#comment-13849672
 ] 

Nikolai Grigoriev commented on CASSANDRA-6008:
--

I am wondering whether this problem is how I ended up with this issue: 
http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic

I am constantly hitting this "This should never happen" problem when I restart 
my 2.0.3 cluster. Out of 6 nodes, if I restart it now, at least 2 will surely 
fail to start because of this condition. To allow them to start, I wipe the 
contents of the system.compactions_in_progress table and delete all 
compactions_in_progress directories under the data directories on the 
affected node.
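
(For anyone hitting the same thing: with the node down, that boils down to 
removing the on-disk data of that system table, e.g. something like the 
following; the data directory path is an assumption for a default install.)

{noformat}
rm -rf /var/lib/cassandra/data/system/compactions_in_progress/*
{noformat}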

> Getting 'This should never happen' error at startup due to sstables missing
> ---
>
> Key: CASSANDRA-6008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6008
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: John Carrino
>Assignee: Tyler Hobbs
> Fix For: 2.0.4
>
> Attachments: 6008-2.0-part2.patch, 6008-2.0-v1.patch, 
> 6008-trunk-v1.patch
>
>
> Exception encountered during startup: "Unfinished compactions reference 
> missing sstables. This should never happen since compactions are marked 
> finished before we start removing the old sstables"
> This happens when sstables that have been compacted away are removed, but 
> they still have entries in the system.compactions_in_progress table.
> Normally this should not happen because the entries in 
> system.compactions_in_progress are deleted before the old sstables are 
> deleted.
> However, at startup recovery time, old sstables are deleted WITHOUT first 
> being removed from the compactions_in_progress table; only after that is 
> done is the table truncated via SystemKeyspace.discardCompactionsInProgress.
> We ran into a case where the disk filled up and the node died and was bounced 
> and then failed to truncate this table on startup, and then got stuck hitting 
> this exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers.
> Maybe on startup we can delete from this table incrementally as we clean 
> stuff up in the same way that compactions delete from this table before they 
> delete old sstables.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-5742) Add command "list snapshots" to nodetool

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-5742:
-

Attachment: JIRA-5742.diff

> Add command "list snapshots" to nodetool
> 
>
> Key: CASSANDRA-5742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Affects Versions: 1.2.1
>Reporter: Geert Schuring
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: lhf
> Attachments: JIRA-5742.diff
>
>
> It would be nice if the nodetool could tell me which snapshots are present on 
> the system instead of me having to browse the filesystem to fetch the names 
> of the snapshots.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing

2013-12-16 Thread John Carrino (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849697#comment-13849697
 ] 

John Carrino commented on CASSANDRA-6008:
-

I'm fine with leaving all the sstables live.  We use our own MVCC and rely on 
cassandra only for durable writes, using QUORUM to ensure we read what we 
wrote.  Is the only point of this table to ensure counters are handled 
correctly?

Another possible issue may arise when restoring from backup.  If you shut down 
while there are rows in compaction_log and then clear the current tables and 
replace them with new ones, you will get this error also.



> Getting 'This should never happen' error at startup due to sstables missing
> ---
>
> Key: CASSANDRA-6008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6008
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: John Carrino
>Assignee: Tyler Hobbs
> Fix For: 2.0.4
>
> Attachments: 6008-2.0-part2.patch, 6008-2.0-v1.patch, 
> 6008-trunk-v1.patch
>
>
> Exception encountered during startup: "Unfinished compactions reference 
> missing sstables. This should never happen since compactions are marked 
> finished before we start removing the old sstables"
> This happens when sstables that have been compacted away are removed, but 
> they still have entries in the system.compactions_in_progress table.
> Normally this should not happen because the entries in 
> system.compactions_in_progress are deleted before the old sstables are 
> deleted.
> However, at startup recovery time, old sstables are deleted WITHOUT first 
> being removed from the compactions_in_progress table; only after that is 
> done is the table truncated via SystemKeyspace.discardCompactionsInProgress.
> We ran into a case where the disk filled up and the node died and was bounced 
> and then failed to truncate this table on startup, and then got stuck hitting 
> this exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers.
> Maybe on startup we can delete from this table incrementally as we clean 
> stuff up in the same way that compactions delete from this table before they 
> delete old sstables.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-5742) Add command "list snapshots" to nodetool

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-5742:
-

Attachment: new_file.diff

> Add command "list snapshots" to nodetool
> 
>
> Key: CASSANDRA-5742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Affects Versions: 1.2.1
>Reporter: Geert Schuring
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: lhf
> Attachments: JIRA-5742.diff, new_file.diff
>
>
> It would be nice if the nodetool could tell me which snapshots are present on 
> the system instead of me having to browse the filesystem to fetch the names 
> of the snapshots.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-6216:
-

Attachment: JIRA-6216.diff

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher tokens in higher levels won't get a chance to 
> compact, as the last compacted key is reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-5742) Add command "list snapshots" to nodetool

2013-12-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5742:
--

Reviewer: Lyuben Todorov

> Add command "list snapshots" to nodetool
> 
>
> Key: CASSANDRA-5742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Affects Versions: 1.2.1
>Reporter: Geert Schuring
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: lhf
> Attachments: JIRA-5742.diff, new_file.diff
>
>
> It would be nice if the nodetool could tell me which snapshots are present on 
> the system instead of me having to browse the filesystem to fetch the names 
> of the snapshots.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-6216:
-

Attachment: (was: JIRA-6216.diff)

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher tokens in higher levels won't get a chance to 
> compact, as the last compacted key is reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-6216:
-

Attachment: JIRA-6216.diff

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher tokens in higher levels won't get a chance to 
> compact, as the last compacted key is reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints

2013-12-16 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849721#comment-13849721
 ] 

sankalp kohli commented on CASSANDRA-6158:
--

[~brandon.williams]
Should I change deleteHintsForEndpoint to be blocking as well? 

> Nodetool command to purge hints
> ---
>
> Key: CASSANDRA-6158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6158
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: trunk-6158.txt
>
>
> The only way to truncate all hints in Cassandra is to truncate the hints CF 
> in system table. 
> It would be cleaner to have a nodetool command for it. Also ability to 
> selectively remove hints by host or DC would also be nice rather than 
> removing all the hints. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-5742) Add command "list snapshots" to nodetool

2013-12-16 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849734#comment-13849734
 ] 

sankalp kohli commented on CASSANDRA-5742:
--

[~jbellis]
In this command, I am displaying each snapshot along with how much true space 
and how much total space it takes. 
If I need to add your suggestion, 
"Suggest adding a total space used as well (that doesn't double-count multiple 
snapshots of the same file) the way we did for cfstats in"

then it will be a global value across snapshots, and the user can get that 
from cfstats. Does that sound reasonable? 

> Add command "list snapshots" to nodetool
> 
>
> Key: CASSANDRA-5742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Affects Versions: 1.2.1
>Reporter: Geert Schuring
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: lhf
> Attachments: JIRA-5742.diff, new_file.diff
>
>
> It would be nice if the nodetool could tell me which snapshots are present on 
> the system instead of me having to browse the filesystem to fetch the names 
> of the snapshots.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6440) Repair should allow repairing particular endpoints to reduce WAN usage.

2013-12-16 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-6440:
-

Attachment: JIRA-6440.diff

> Repair should allow repairing particular endpoints to reduce WAN usage. 
> 
>
> Key: CASSANDRA-6440
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: sankalp kohli
>Priority: Minor
> Attachments: JIRA-6440.diff
>
>
> The way we send out data that does not match over the WAN can be improved. 
> Example: say there are four nodes (A, B, C, D) which are replicas of a range 
> we are repairing. A and B are in DC1, and C and D are in DC2. If A does not 
> have data which the other replicas have, then we will have the following 
> streams:
> 1) A to B and back
> 2) A to C and back (goes over the WAN)
> 3) A to D and back (goes over the WAN)
> One way to reduce the WAN traffic is this:
> 1) Repair A and B only with each other, and C and D with each other, 
> starting at the same time t. 
> 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
> time t. 
> 3) Now run a repair between A and C; the streams exchanged as a result of 
> the diff will also be streamed to B and D via A and C (A and C behave like 
> proxies for the streams).
> For a replication of DC1:2,DC2:2, the WAN traffic is reduced by 50%, and by 
> even more for higher replication factors.
> Another easy way to do this is to have the repair command take the nodes 
> you want to repair with. Then we can do something like this:
> 1) Run repair between (A and B) and (C and D)
> 2) Run repair between (A and C)
> 3) Run repair between (A and B) and (C and D)
> But this will increase the traffic inside the DC, as we won't be doing the 
> proxying.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-5906) Avoid allocating over-large bloom filters

2013-12-16 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-5906:
--

Attachment: 5906.txt

(also: https://github.com/yukim/cassandra/tree/5906-v3)

Attaching patch for review.

* implemented on top of CASSANDRA-6356
* updated stream-lib to v2.5.1 (latest)

HLL++ parameters are p=13, sp=25, based on my observations above.
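
For reference, a minimal sketch of what HLL++ at those parameters buys us 
(illustrative use of stream-lib only; merging one estimator per input sstable 
is an assumption about usage, not a quote of the patch):

{code}
import com.clearspring.analytics.stream.cardinality.CardinalityMergeException;
import com.clearspring.analytics.stream.cardinality.HyperLogLogPlus;

public class PartitionCountEstimate
{
    public static void main(String[] args) throws CardinalityMergeException
    {
        // One estimator per input sstable: p=13 (normal mode), sp=25 (sparse mode)
        HyperLogLogPlus a = new HyperLogLogPlus(13, 25);
        HyperLogLogPlus b = new HyperLogLogPlus(13, 25);
        for (int i = 0; i < 100000; i++)
        {
            a.offer("key-" + i);            // keys 0..99999
            b.offer("key-" + (i + 50000));  // keys 50000..149999, half overlapping
        }
        HyperLogLogPlus merged = (HyperLogLogPlus) a.merge(b);
        // Roughly 150000, far below the naive no-overlap estimate of 200000,
        // so the post-compaction bloom filter can be sized much smaller.
        System.out.println(merged.cardinality());
    }
}
{code}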


> Avoid allocating over-large bloom filters
> -
>
> Key: CASSANDRA-5906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Yuki Morishita
> Fix For: 2.1
>
> Attachments: 5906.txt
>
>
> We conservatively estimate the number of partitions post-compaction to be the 
> total number of partitions pre-compaction.  That is, we assume the worst-case 
> scenario of no partition overlap at all.
> This can result in substantial wasted memory in the sstables resulting from 
> highly overlapping compactions.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-5742) Add command "list snapshots" to nodetool

2013-12-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849749#comment-13849749
 ] 

Jonathan Ellis commented on CASSANDRA-5742:
---

What I'd like to accomplish is making it more obvious to the user that just 
because he has two snapshots that each take 1GB of space, that doesn't mean 
they take up 2GB combined.  Open to suggestions on implementation details.
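
Since snapshots are hard links into the data directory, files shared between 
snapshots (or with live sstables) occupy disk space only once. A minimal 
sketch of computing a deduplicated total by grouping files on their filesystem 
file key (the path is hypothetical, and fileKey() can be null on some 
filesystems):

{code}
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashSet;
import java.util.Set;

public class SnapshotTrueSize
{
    public static void main(String[] args) throws IOException
    {
        Path snapshots = Paths.get("/var/lib/cassandra/data/ks/cf/snapshots");
        final Set<Object> seen = new HashSet<Object>();  // one entry per underlying inode
        final long[] total = { 0 }, deduped = { 0 };
        Files.walkFileTree(snapshots, new SimpleFileVisitor<Path>()
        {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            {
                total[0] += attrs.size();
                if (seen.add(attrs.fileKey()))  // hard links share the same file key
                    deduped[0] += attrs.size();
                return FileVisitResult.CONTINUE;
            }
        });
        System.out.println("sum of file sizes: " + total[0]);
        System.out.println("actual disk usage: " + deduped[0]);
    }
}
{code}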

> Add command "list snapshots" to nodetool
> 
>
> Key: CASSANDRA-5742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Affects Versions: 1.2.1
>Reporter: Geert Schuring
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: lhf
> Attachments: JIRA-5742.diff, new_file.diff
>
>
> It would be nice if the nodetool could tell me which snapshots are present on 
> the system instead of me having to browse the filesystem to fetch the names 
> of the snapshots.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849854#comment-13849854
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6493:
--

I was able to get the same results when repeating the test.
https://cassci.datastax.com/job/cassandra-addremovedc/26/console

> Exceptions when a second Datacenter is Added
> 
>
> Key: CASSANDRA-6493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, EC2 M1.large
>Reporter: Russell Alexander Spitzer
>
> On adding a second datacenter several exceptions were raised.
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> At the point when the second datacenter is added, several warnings go off 
> because nodetool status is not functioning, and a few moments later the 
> start operation reports a failure because a node has not successfully 
> started. 
> The first start attempt yielded the following exception on a node in the 
> second DC.
> {code}
> CassandraDaemon.java (line 464) Exception encountered during startup
> java.lang.AssertionError: -7560216458456714666 not found in 
> -9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
> 9218575851928340117, 9219681798686280387
> at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
> at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
> at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
> at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
> at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> {code}
> The test automatically tries to restart nodes if they fail during startup. 
> The second attempt for this node succeeded, but a 'nodetool status' still 
> failed and a different node in the second DC logged the following and failed 
> to start up.
> {code}
> ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
>   at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
>   at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
>   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
>   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
>   at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
>   at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
>   at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.a

[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints

2013-12-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849856#comment-13849856
 ] 

Brandon Williams commented on CASSANDRA-6158:
-

Maybe I'm missing something, but I don't see such a call in this patch?

> Nodetool command to purge hints
> ---
>
> Key: CASSANDRA-6158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6158
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: trunk-6158.txt
>
>
> The only way to truncate all hints in Cassandra is to truncate the hints CF 
> in system table. 
> It would be cleaner to have a nodetool command for it. Also ability to 
> selectively remove hints by host or DC would also be nice rather than 
> removing all the hints. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints

2013-12-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849858#comment-13849858
 ] 

Brandon Williams commented on CASSANDRA-6158:
-

Either way though, yes, calls should block until completed.
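
The usual shape of that (a sketch of the general pattern only, not the actual 
HintedHandOffManager code) is to submit the work to the executor and wait on 
the returned Future before letting the JMX call return:

{code}
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingJmxOp
{
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Exposed over JMX; returns only once the hints are actually gone.
    public void deleteHintsForEndpoint(final String host)
    {
        Future<?> task = executor.submit(new Runnable()
        {
            public void run()
            {
                // ... delete the hint rows for 'host' here ...
            }
        });
        try
        {
            task.get();  // block the caller until the deletion completes
        }
        catch (InterruptedException e)
        {
            throw new RuntimeException(e);
        }
        catch (ExecutionException e)
        {
            throw new RuntimeException(e);
        }
    }
}
{code}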

> Nodetool command to purge hints
> ---
>
> Key: CASSANDRA-6158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6158
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: trunk-6158.txt
>
>
> The only way to truncate all hints in Cassandra is to truncate the hints CF 
> in system table. 
> It would be cleaner to have a nodetool command for it. Also ability to 
> selectively remove hints by host or DC would also be nice rather than 
> removing all the hints. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-4268) Expose full stop() operation through JMX

2013-12-16 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4268:
--

Attachment: 4268_cassandra-2.0.patch

> Expose full stop() operation through JMX
> 
>
> Key: CASSANDRA-4268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4268
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Lyuben Todorov
>Priority: Minor
>  Labels: jmx
> Fix For: 2.0.4
>
> Attachments: 4268_cassandra-2.0.patch
>
>
> We already expose ways to stop just the RPC server or gossip.  This would 
> fully shutdown the process.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849912#comment-13849912
 ] 

Jonathan Ellis commented on CASSANDRA-6493:
---

From chat, this does not reproduce when CASSANDRA-6488 is reverted.

> Exceptions when a second Datacenter is Added
> 
>
> Key: CASSANDRA-6493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, EC2 M1.large
>Reporter: Russell Alexander Spitzer
>
> On adding a second datacenter several exceptions were raised.
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> At the point when the second datacenter is added, several warnings go off 
> because nodetool status is not functioning, and a few moments later the 
> start operation reports a failure because a node has not successfully 
> started. 
> The first start attempt yielded the following exception on a node in the 
> second DC.
> {code}
> CassandraDaemon.java (line 464) Exception encountered during startup
> java.lang.AssertionError: -7560216458456714666 not found in 
> -9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
> 9218575851928340117, 9219681798686280387
> at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
> at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
> at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
> at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
> at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> {code}
> The test automatically tries to restart nodes if they fail during startup. 
> The second attempt for this node succeeded, but a 'nodetool status' still 
> failed and a different node in the second DC logged the following and failed 
> to start up.
> {code}
> ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
>   at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
>   at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
>   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
>   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
>   at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
>   at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
>   at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.serv

[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added

2013-12-16 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849925#comment-13849925
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6493:
--

Correct. I didn't see this over several runs over the weekend testing on the 
pre-6488 build. Head of the git log from that build:

{code}
commit c133ff88982948fdb12669bf766e9848102a3496
Author: Russell Spitzer 
Date:   Fri Dec 13 12:00:53 2013 -0800

Patch to fix NPE ( this is patch a3d91dc9d67572e16d9ad92f22b89eb969373899)

commit 11455738fa61c6eb02895a5a8d3fbbe4d8cb24b4
Author: Brandon Williams 
Date:   Fri Dec 13 12:10:47 2013 -0600

Pig: don't assume all DataBags are DefaultDataBags
Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420
{code}

> Exceptions when a second Datacenter is Added
> 
>
> Key: CASSANDRA-6493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, EC2 M1.large
>Reporter: Russell Alexander Spitzer
>
> On adding a second datacenter several exceptions were raised.
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> At the point when the second datacenter is added, several warnings go off 
> because nodetool status is not functioning, and a few moments later the 
> start operation reports a failure because a node has not successfully 
> started. 
> The first start attempt yielded the following exception on a node in the 
> second DC.
> {code}
> CassandraDaemon.java (line 464) Exception encountered during startup
> java.lang.AssertionError: -7560216458456714666 not found in 
> -9222060278673125462, -9220751250790085193, . ALL THE TOKENS ...,  
> 9218575851928340117, 9219681798686280387
> at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
> at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
> at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
> at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
> at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
> at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
> {code}
> The test automatically tries to restart nodes if they fail during startup. 
> The second attempt for this node succeeded, but a 'nodetool status' still 
> failed and a different node in the second DC logged the following and failed 
> to start up.
> {code}
> ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
>   at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
>   at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
>   at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
>   at 
> org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
>   at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
>   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
>   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
>   at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
>   at 
> org.apache.cassandra.service.StorageService.bootstrap(S

[Cassandra Wiki] Update of "ContributorsGroup" by BrandonWilliams

2013-12-16 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "ContributorsGroup" page has been changed by BrandonWilliams:
https://wiki.apache.org/cassandra/ContributorsGroup?action=diff&rev1=23&rev2=24

   * mkjellman
   * ono_matope
   * ChrisBurroughs
+  * bhamail
  


[jira] [Commented] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning

2013-12-16 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849949#comment-13849949
 ] 

Robert Coli commented on CASSANDRA-6465:


Are we sure that this mechanism of producing cache pinning is worth the 
complexity here, especially given speculative retry? 

> DES scores fluctuate too much for cache pinning
> ---
>
> Key: CASSANDRA-6465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6465
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.2.11, 2 DC cluster
>Reporter: Chris Burroughs
>Assignee: Tyler Hobbs
>Priority: Minor
>  Labels: gossip
> Fix For: 2.0.4
>
> Attachments: des-score-graph.png, des.sample.15min.csv, get-scores.py
>
>
> To quote the conf:
> {noformat}
> # if set greater than zero and read_repair_chance is < 1.0, this will allow
> # 'pinning' of replicas to hosts in order to increase cache capacity.
> # The badness threshold will control how much worse the pinned host has to be
> # before the dynamic snitch will prefer other replicas over it.  This is
> # expressed as a double which represents a percentage.  Thus, a value of
> # 0.2 means Cassandra would continue to prefer the static snitch values
> # until the pinned host was 20% worse than the fastest.
> dynamic_snitch_badness_threshold: 0.1
> {noformat}
> An assumption of this feature is that scores will vary by less than 
> dynamic_snitch_badness_threshold during normal operations.  Attached is the 
> result of polling a node for the scores of 6 different endpoints at 1 Hz for 
> 15 minutes.  The endpoints to sample were chosen with `nodetool getendpoints` 
> for a row that is known to get reads.  The node was acting as a coordinator for 
> a few hundred req/second, so it should have sufficient data to work with.  
> Other traces on a second cluster have produced similar results.
>  * The scores vary by far more than I would expect, as shown by the difficulty 
> of seeing anything useful in that graph.
>  * The difference between the best and next-best score is usually > 10% 
> (default dynamic_snitch_badness_threshold).
> Neither ClientRequest nor ColumnFamily metrics showed wild changes during the 
> data gathering period.
> Attachments:
>  * jython script cobbled together to gather the data (based on work on the 
> mailing list from Maki Watanabe a while back)
>  * csv of DES scores for 6 endpoints, polled about once a second
>  * Attempt at making a graph
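
(To make the quoted threshold semantics concrete: with the default of 0.1, a 
pinned replica is kept only while its score is within 10% of the best score. 
An illustrative sketch of that rule, paraphrasing the config comment rather 
than the actual DynamicEndpointSnitch code:)

{code}
public class BadnessThreshold
{
    static final double BADNESS_THRESHOLD = 0.1;  // dynamic_snitch_badness_threshold

    // Keep the statically preferred (pinned) replica unless its score is
    // more than BADNESS_THRESHOLD worse than the best dynamic score.
    static boolean keepPinned(double pinnedScore, double bestScore)
    {
        return pinnedScore <= bestScore * (1 + BADNESS_THRESHOLD);
    }

    public static void main(String[] args)
    {
        System.out.println(keepPinned(1.05, 1.0));  // true: only 5% worse
        System.out.println(keepPinned(1.20, 1.0));  // false: 20% worse, so re-route
    }
}
{code}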



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Comment Edited] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning

2013-12-16 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849949#comment-13849949
 ] 

Robert Coli edited comment on CASSANDRA-6465 at 12/17/13 12:52 AM:
---

Are we sure that this mechanism of producing cache pinning is worth the 
complexity here, especially given speculative execution? 


was (Author: rcoli):
Are we sure that this mechanism of producing cache pinning is worth the 
complexity here, especially given speculative retry? 

> DES scores fluctuate too much for cache pinning
> ---
>
> Key: CASSANDRA-6465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6465
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.2.11, 2 DC cluster
>Reporter: Chris Burroughs
>Assignee: Tyler Hobbs
>Priority: Minor
>  Labels: gossip
> Fix For: 2.0.4
>
> Attachments: des-score-graph.png, des.sample.15min.csv, get-scores.py
>
>
> To quote the conf:
> {noformat}
> # if set greater than zero and read_repair_chance is < 1.0, this will allow
> # 'pinning' of replicas to hosts in order to increase cache capacity.
> # The badness threshold will control how much worse the pinned host has to be
> # before the dynamic snitch will prefer other replicas over it.  This is
> # expressed as a double which represents a percentage.  Thus, a value of
> # 0.2 means Cassandra would continue to prefer the static snitch values
> # until the pinned host was 20% worse than the fastest.
> dynamic_snitch_badness_threshold: 0.1
> {noformat}
> An assumption of this feature is that scores will vary by less than 
> dynamic_snitch_badness_threshold during normal operations.  Attached is the 
> result of polling a node for the scores of 6 different endpoints at 1 Hz for 
> 15 minutes.  The endpoints to sample were chosen with `nodetool getendpoints` 
> for a row that is known to get reads.  The node was acting as a coordinator for 
> a few hundred req/second, so it should have sufficient data to work with.  
> Other traces on a second cluster have produced similar results.
>  * The scores vary by far more than I would expect, as shown by the difficulty 
> of seeing anything useful in that graph.
>  * The difference between the best and next-best score is usually > 10% 
> (default dynamic_snitch_badness_threshold).
> Neither ClientRequest nor ColumnFamily metrics showed wild changes during the 
> data gathering period.
> Attachments:
>  * jython script cobbled together to gather the data (based on work on the 
> mailing list from Maki Watanabe a while back)
>  * csv of DES scores for 6 endpoints, polled about once a second
>  * Attempt at making a graph



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (CASSANDRA-6494) Cassandra refuses to restart due to a corrupted commit log.

2013-12-16 Thread Shao-Chuan Wang (JIRA)
Shao-Chuan Wang created CASSANDRA-6494:
--

 Summary: Cassandra refuses to restart due to a corrupted commit 
log.
 Key: CASSANDRA-6494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6494
 Project: Cassandra
  Issue Type: Bug
Reporter: Shao-Chuan Wang


This is running on our production server. Please advise how to address this 
issue. Thank you!

INFO 02:46:58,879 Finished reading 
/mnt/cassandra/commitlog/CommitLog-3-1386069222785.log
ERROR 02:46:58,879 Exception encountered during startup
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400)
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273)
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96)
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
706167655f74616773 is not defined as a collection
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407)
... 8 more
Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a 
collection
at 
org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
at 
edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
at 
edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
at 
edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192)
at 
edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
at 
edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
at 
org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323)
at 
org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195)
at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196)
at org.apache.cassandra.db.Memtable.put(Memtable.java:160)
at 
org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338)
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400)
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273)
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96)
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:4
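
(One small aid in reading the error above: the identifier in the message is a 
hex-encoded name. A quick decode in plain Java, no Cassandra classes needed:)

{code}
import java.nio.charset.Charset;

public class HexName
{
    public static void main(String[] args)
    {
        String hex = "706167655f74616773";
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++)
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        // Prints "page_tags", i.e. the column/table name the replay chokes on.
        System.out.println(new String(bytes, Charset.forName("UTF-8")));
    }
}
{code}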

[jira] [Created] (CASSANDRA-6496) Endless L0 LCS compactions

2013-12-16 Thread Nikolai Grigoriev (JIRA)
Nikolai Grigoriev created CASSANDRA-6496:


 Summary: Endless L0 LCS compactions
 Key: CASSANDRA-6496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 2.0.3, Linux, 6 nodes, 5 disks per node
Reporter: Nikolai Grigoriev


I first described the problem here: 
http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic

I think I really abused my system with the traffic (a mix of reads, heavy 
updates, and some deletes). Now, after stopping the traffic, I see compactions 
that have been going on endlessly for over 4 days.

For a specific CF I have about 4700 sstable data files right now. The 
compaction estimates are logged as "[3312, 4, 0, 0, 0, 0, 0, 0, 0]", and 
sstable_size_in_mb=256. 3214 files are about 256Mb (+/- a few megs); the other 
files are smaller or much smaller than that. No sstables are larger than 256Mb. 
What I observe is that LCS picks 32 sstables from L0 and compacts them into 32 
sstables of approximately the same size. So, what my system has been doing for 
the last 4 days (no traffic at all) is compacting groups of 32 sstables into 
groups of 32 sstables without any changes. Seems like a bug to me regardless of 
what I did to get the system into this state...
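
A quick back-of-the-envelope check of the numbers above (a minimal sketch, 
assuming the compaction writer rolls to a new output sstable every 
sstable_size_in_mb, per the size-capped behavior described):

{code}
public class LcsLoopCheck
{
    public static void main(String[] args)
    {
        long inputSSTables = 32;
        long sstableSizeMb = 256;                     // sstable_size_in_mb
        long totalMb = inputSSTables * sstableSizeMb; // 8192 MB of input
        long outputs = (totalMb + sstableSizeMb - 1) / sstableSizeMb; // ceil -> 32
        System.out.println(inputSSTables + " sstables in, " + outputs + " out");
        // 32 max-size sstables in, 32 max-size sstables out: each round
        // rewrites the same data without shrinking the L0 backlog.
    }
}
{code}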




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (CASSANDRA-6495) LOCAL_SERIAL uses QUORUM consistency level to validate expected columns

2013-12-16 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-6495:


 Summary: LOCAL_SERIAL uses QUORUM consistency level to validate 
expected columns
 Key: CASSANDRA-6495
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6495
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: sankalp kohli
Priority: Minor


If CAS is done at the LOCAL_SERIAL consistency level, only the nodes from the 
local data center should be involved. 
Here, however, we are using QUORUM to validate the expected columns, which 
requires nodes from more than one DC. 
We should use LOCAL_QUORUM here when CAS is done at LOCAL_SERIAL. 

Also, if we have 2 DCs with DC1:3,DC2:3, a single DC being down will cause CAS 
to stop working even for LOCAL_SERIAL. 
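
A minimal sketch of the proposed behavior (the enum and method below are 
illustrative, not the actual Cassandra internals):

{code}
enum ConsistencyLevel { SERIAL, LOCAL_SERIAL, QUORUM, LOCAL_QUORUM }

class CasValidation
{
    // Derive the CL for the expected-columns read from the CAS serial CL,
    // so a LOCAL_SERIAL operation never has to reach outside its own DC.
    static ConsistencyLevel validationCL(ConsistencyLevel serialCL)
    {
        return serialCL == ConsistencyLevel.LOCAL_SERIAL
               ? ConsistencyLevel.LOCAL_QUORUM
               : ConsistencyLevel.QUORUM;
    }
}
{code}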





--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6496) Endless L0 LCS compactions

2013-12-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850048#comment-13850048
 ] 

Jonathan Ellis commented on CASSANDRA-6496:
---

Can you enable debug logging in o.a.c.db.compaction and post a log sample?
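
(For reference, one plausible way to do that with the stock log4j setup 
Cassandra 2.0 ships with, in conf/log4j-server.properties; the file path is an 
assumption about the default configuration:)

{code}
# Turn on DEBUG output for the compaction package only
log4j.logger.org.apache.cassandra.db.compaction=DEBUG
{code}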

> Endless L0 LCS compactions
> --
>
> Key: CASSANDRA-6496
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6496
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.0.3, Linux, 6 nodes, 5 disks per node
>Reporter: Nikolai Grigoriev
>
> I first described the problem here: 
> http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic
> I think I really abused my system with the traffic (a mix of reads, heavy 
> updates, and some deletes). Now, after stopping the traffic, I see 
> compactions that have been going on endlessly for over 4 days.
> For a specific CF I have about 4700 sstable data files right now. The 
> compaction estimates are logged as "[3312, 4, 0, 0, 0, 0, 0, 0, 0]", and 
> sstable_size_in_mb=256. 3214 files are about 256Mb (+/- a few megs); the 
> other files are smaller or much smaller than that. No sstables are larger 
> than 256Mb. What I observe is that LCS picks 32 sstables from L0 and 
> compacts them into 32 sstables of approximately the same size. So, what my 
> system has been doing for the last 4 days (no traffic at all) is compacting 
> groups of 32 sstables into groups of 32 sstables without any changes. Seems 
> like a bug to me regardless of what I did to get the system into this 
> state...



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6496) Endless L0 LCS compactions

2013-12-16 Thread Nikolai Grigoriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolai Grigoriev updated CASSANDRA-6496:
-

Attachment: system.log.gz
system.log.1.gz

Attaching the logs. I enabled compaction logging this morning to get a better 
idea of what is going on. 

> Endless L0 LCS compactions
> --
>
> Key: CASSANDRA-6496
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6496
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.0.3, Linux, 6 nodes, 5 disks per node
>Reporter: Nikolai Grigoriev
> Attachments: system.log.1.gz, system.log.gz
>
>
> I first described the problem here: 
> http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic
> I think I really abused my system with the traffic (a mix of reads, heavy 
> updates, and some deletes). Now, after stopping the traffic, I see 
> compactions that have been going on endlessly for over 4 days.
> For a specific CF I have about 4700 sstable data files right now. The 
> compaction estimates are logged as "[3312, 4, 0, 0, 0, 0, 0, 0, 0]", and 
> sstable_size_in_mb=256. 3214 files are about 256Mb (+/- a few megs); the 
> other files are smaller or much smaller than that. No sstables are larger 
> than 256Mb. What I observe is that LCS picks 32 sstables from L0 and 
> compacts them into 32 sstables of approximately the same size. So, what my 
> system has been doing for the last 4 days (no traffic at all) is compacting 
> groups of 32 sstables into groups of 32 sstables without any changes. Seems 
> like a bug to me regardless of what I did to get the system into this 
> state...



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-5201) Cassandra/Hadoop does not support current Hadoop releases

2013-12-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850067#comment-13850067
 ] 

Jonathan Ellis commented on CASSANDRA-5201:
---

[~michaelsembwever]?  [~jeromatron]?

> Cassandra/Hadoop does not support current Hadoop releases
> -
>
> Key: CASSANDRA-5201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5201
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.2.0
>Reporter: Brian Jeltema
>Assignee: Dave Brosius
> Attachments: 5201_a.txt, hadoopCompat.patch
>
>
> Using Hadoop 0.22.0 with Cassandra results in the stack trace below.
> It appears that version 0.21+ changed org.apache.hadoop.mapreduce.JobContext
> from a class to an interface.
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>   at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:103)
>   at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445)
>   at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462)
>   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:357)
>   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1045)
>   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1042)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1042)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1062)
>   at MyHadoopApp.run(MyHadoopApp.java:163)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>   at MyHadoopApp.main(MyHadoopApp.java:82)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:192)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-2915) Lucene based Secondary Indexes

2013-12-16 Thread Matt Stump (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850175#comment-13850175
 ] 

Matt Stump commented on CASSANDRA-2915:
---

Given that the read-before-write issues still stand for non-numeric fields (as 
of Lucene 4.6), are Lucene-based secondary indexes still something we want 
committed in the near term? Do we want to wait until incremental 
updates/stacked segments are available for all field types?

Additionally, even for near-real-time search, Lucene still imposes a delay 
between when a row is added and when it becomes queryable, which would differ 
from existing behavior; is that something we can live with?

> Lucene based Secondary Indexes
> --
>
> Key: CASSANDRA-2915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: T Jake Luciani
>  Labels: secondary_index
>
> Secondary indexes (of type KEYS) suffer from a number of limitations in their 
> current form:
>- Multiple IndexClauses only work when there is a subset of rows under the 
> highest clause
>- One new column family is created per index; this means 10 new CFs for 10 
> secondary indexes
> This ticket will use the Lucene library to implement secondary indexes as one 
> index per CF, and utilize the Lucene query engine to handle multiple index 
> clauses. Also, by using Lucene we get a highly optimized file format.
> There are a few parallels we can draw between Cassandra and Lucene.
> Lucene builds index segments in memory and then flushes them to disk, so we 
> can sync our memtable flushes to Lucene flushes. Lucene also has optimize(), 
> which correlates to our compaction process, so these can be sync'd as well.
> We will also need to correlate column validators to Lucene tokenizers so the 
> data can be stored properly; the big win is that once this is done, we can 
> perform complex queries within a column, like wildcard searches.
> The downside of this approach is that we will need to read before write, 
> since documents in Lucene are written as complete documents. For random 
> workloads with lots of indexed columns, this means we need to read the 
> document from the index, update it, and write it back.
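
For illustration, the read-before-write the description mentions looks roughly 
like this in Lucene (field names "key" and "col" are made up for the sketch):

{code}
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

class ReadBeforeWrite
{
    // Updating a single indexed column means fetching the whole stored
    // document, mutating it, and rewriting it in full via updateDocument().
    // Only stored fields survive the round trip, which is part of the cost.
    static void updateOneColumn(IndexSearcher searcher, IndexWriter writer,
                                String rowKey, String newValue) throws IOException
    {
        TopDocs hits = searcher.search(new TermQuery(new Term("key", rowKey)), 1);
        Document doc = searcher.doc(hits.scoreDocs[0].doc);   // the read...
        doc.removeField("col");
        doc.add(new StringField("col", newValue, Field.Store.YES));
        writer.updateDocument(new Term("key", rowKey), doc);  // ...before the write
    }
}
{code}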



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (CASSANDRA-6497) Iterable CqlPagingRecordReader

2013-12-16 Thread Luca Rosellini (JIRA)
Luca Rosellini created CASSANDRA-6497:
-

 Summary: Iterable CqlPagingRecordReader
 Key: CASSANDRA-6497
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6497
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Luca Rosellini
Priority: Minor
 Fix For: 2.1
 Attachments: iterable-CqlPagingRecordReader.diff

The current CqlPagingRecordReader implementation provides a non-standard way of 
iterating over the underlying {{rowIterator}}. It would be nice to have an 
Iterable CqlPagingRecordReader like the one proposed in the attached diff.
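
For context, a minimal sketch of what such an adapter could look like (this is 
not the attached diff, just an illustration of wrapping Hadoop's pull-style 
RecordReader protocol in java.util.Iterator):

{code}
import java.io.IOException;
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import org.apache.hadoop.mapreduce.RecordReader;

// hasNext() must advance the reader once and cache the result, because
// RecordReader only exposes nextKeyValue()/getCurrentKey()/getCurrentValue().
public class RecordReaderIterator<K, V> implements Iterator<Map.Entry<K, V>>
{
    private final RecordReader<K, V> reader;
    private Boolean advanced; // null = not probed since the last next()

    public RecordReaderIterator(RecordReader<K, V> reader)
    {
        this.reader = reader;
    }

    public boolean hasNext()
    {
        try
        {
            if (advanced == null)
                advanced = reader.nextKeyValue();
            return advanced;
        }
        catch (IOException | InterruptedException e)
        {
            throw new RuntimeException(e);
        }
    }

    public Map.Entry<K, V> next()
    {
        if (!hasNext())
            throw new NoSuchElementException();
        advanced = null; // consume the cached advance
        try
        {
            return new AbstractMap.SimpleImmutableEntry<>(reader.getCurrentKey(),
                                                          reader.getCurrentValue());
        }
        catch (IOException | InterruptedException e)
        {
            throw new RuntimeException(e);
        }
    }

    public void remove()
    {
        throw new UnsupportedOperationException();
    }
}
{code}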



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader

2013-12-16 Thread Luca Rosellini (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Rosellini updated CASSANDRA-6497:
--

Attachment: iterable-CqlPagingRecordReader.diff

> Iterable CqlPagingRecordReader
> --
>
> Key: CASSANDRA-6497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: Luca Rosellini
>Priority: Minor
> Fix For: 2.1
>
> Attachments: iterable-CqlPagingRecordReader.diff
>
>
> The current CqlPagingRecordReader implementation provides a non-standard way 
> of iterating over the underlying {{rowIterator}}. It would be nice to have an 
> Iterable CqlPagingRecordReader like the one proposed in the attached diff.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader

2013-12-16 Thread Luca Rosellini (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Rosellini updated CASSANDRA-6497:
--

Attachment: (was: iterable-CqlPagingRecordReader.diff)

> Iterable CqlPagingRecordReader
> --
>
> Key: CASSANDRA-6497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: Luca Rosellini
>Priority: Minor
> Fix For: 2.1
>
> Attachments: iterable-CqlPagingRecordReader.diff
>
>
> The current CqlPagingRecordReader implementation provides a non-standard way 
> of iterating over the underlying {{rowIterator}}. It would be nice to have an 
> Iterable CqlPagingRecordReader like the one proposed in the attached diff.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader

2013-12-16 Thread Luca Rosellini (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Rosellini updated CASSANDRA-6497:
--

Attachment: iterable-CqlPagingRecordReader.diff

> Iterable CqlPagingRecordReader
> --
>
> Key: CASSANDRA-6497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Reporter: Luca Rosellini
>Priority: Minor
> Fix For: 2.1
>
> Attachments: iterable-CqlPagingRecordReader.diff
>
>
> The current CqlPagingRecordReader implementation provides a non-standard way 
> of iterating over the underlying {{rowIterator}}. It would be nice to have an 
> Iterable CqlPagingRecordReader like the one proposed in the attached diff.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)