[jira] [Updated] (CASSANDRA-9176) drop out of column finding loop on success for altertable statement w/ drop column
[ https://issues.apache.org/jira/browse/CASSANDRA-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9176:
--------------------------------
    Reviewer: Stefania

> drop out of column finding loop on success for altertable statement w/ drop column
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9176
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9176
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Dave Brosius
>            Assignee: Dave Brosius
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: altertabledrop.txt
>
>
> The loop looks for the column to drop but doesn't stop once it is found; add a break. (trunk)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
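The pattern the ticket describes can be sketched as a stand-alone loop; this is illustrative only, not the actual AlterTableStatement code, and the names are hypothetical:

```java
import java.util.List;

// Illustrative stand-alone version of the loop the ticket describes:
// scan the column list for the name being dropped and stop as soon as
// it is found, instead of continuing through the remaining columns.
class DropColumnExample
{
    static String findColumn(List<String> columns, String target)
    {
        String found = null;
        for (String column : columns)
        {
            if (column.equals(target))
            {
                found = column;
                break; // the proposed fix: no need to inspect the rest of the list
            }
        }
        return found;
    }
}
```

The result is unchanged either way; the break only avoids useless iterations once a match is found, which is why the ticket is marked Trivial.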
[jira] [Comment Edited] (CASSANDRA-8377) Coordinated Commitlog Replay
[ https://issues.apache.org/jira/browse/CASSANDRA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500996#comment-14500996 ]

Chris Lohfink edited comment on CASSANDRA-8377 at 4/18/15 1:15 AM:
-------------------------------------------------------------------

The recovery happens before the node is up, so it cannot use the storage proxy. I created a JMX operation that provides all the different options to restore a commit log. This gives the added benefit of not requiring system properties to be set and the node restarted to do the restore, which I think is preferable.

was (Author: cnlwsu):
The recovery happens before the node is up it cant use the storage proxy so created a jmx operation that provides all the different options to restore a commit log. This gives added benefit of not requiring restarts to do a point in time restore.

> Coordinated Commitlog Replay
> ----------------------------
>
>                 Key: CASSANDRA-8377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8377
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Nick Bailey
>            Assignee: Chris Lohfink
>             Fix For: 3.0
>
>         Attachments: CASSANDRA-8377.txt
>
>
> Commit log archiving and replay can be used to support point-in-time restores on a cluster. Unfortunately, at the moment that is only true when the topology of the cluster is exactly the same as when the commitlogs were archived. This is because commitlogs need to be replayed on a node that is a replica for those writes.
> To support replaying commitlogs when the topology has changed, we should have a tool that replays the writes in a commitlog as if they were writes from a client, so they get coordinated to the correct replicas.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8377) Coordinated Commitlog Replay
[ https://issues.apache.org/jira/browse/CASSANDRA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lohfink updated CASSANDRA-8377:
-------------------------------------
    Attachment: CASSANDRA-8377.txt

The recovery happens before the node is up, so it can't use the storage proxy; I created a JMX operation that provides all the different options to restore a commit log. This gives the added benefit of not requiring restarts to do a point-in-time restore.

> Coordinated Commitlog Replay
> ----------------------------
>
>                 Key: CASSANDRA-8377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8377
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Nick Bailey
>            Assignee: Chris Lohfink
>             Fix For: 3.0
>
>         Attachments: CASSANDRA-8377.txt
>
>
> Commit log archiving and replay can be used to support point-in-time restores on a cluster. Unfortunately, at the moment that is only true when the topology of the cluster is exactly the same as when the commitlogs were archived. This is because commitlogs need to be replayed on a node that is a replica for those writes.
> To support replaying commitlogs when the topology has changed, we should have a tool that replays the writes in a commitlog as if they were writes from a client, so they get coordinated to the correct replicas.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
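The JMX-operation approach described in the comment can be sketched as follows. Everything here is an illustrative assumption (interface, class, and parameter names are invented, and the body is a placeholder), not Cassandra's actual API from the attached patch:

```java
// Hypothetical sketch of the approach the comment describes: expose commit
// log restore as an operation an operator can invoke over JMX on a live
// node, instead of setting system properties and restarting. All names and
// parameters below are illustrative assumptions, not Cassandra's API.
interface CommitLogRestoreMBean
{
    // Replay archived segments, dropping mutations stamped after the
    // requested point in time.
    String restore(String archiveDirectory, long pointInTimeMillis);
}

class CommitLogRestore implements CommitLogRestoreMBean
{
    public String restore(String archiveDirectory, long pointInTimeMillis)
    {
        // A real implementation would walk the archived segment files and
        // feed each mutation with a timestamp at or below the requested
        // point in time back through the node's write path.
        return "replayed " + archiveDirectory + " up to " + pointInTimeMillis;
    }
}
```

In a real node such an implementation would be registered with the platform MBean server (via `ManagementFactory.getPlatformMBeanServer().registerMBean(...)`), so a JMX client could invoke the restore without any restart, which is the benefit the comment calls out.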
[8/8] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f0d4705e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f0d4705e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f0d4705e

Branch: refs/heads/trunk
Commit: f0d4705e6fd90e865cd9f88d5159e1049a772220
Parents: 4040ba8 c6e4379
Author: Brandon Williams
Authored: Fri Apr 17 18:43:36 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:43:36 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
[3/8] cassandra git commit: Fix commit
Fix commit

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/738229bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/738229bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/738229bd

Branch: refs/heads/cassandra-2.1
Commit: 738229bd7900b21a83ea322b1e394b8c20c0b82f
Parents: 54140bf
Author: Brandon Williams
Authored: Fri Apr 17 18:42:38 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:42:38 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/738229bd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index 1642561..e5dbdeb 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -81,7 +81,7 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(Gossiper.instance.getEndpointStateForEndpoint(endpoint)) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
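The one-line "Fix commit" exists because onChange, unlike onJoin, receives no EndpointState argument: the first patch referenced an `epState` variable that does not exist in that scope, and the fix asks the gossiper for the endpoint's state instead. That corrected lookup can be sketched with a toy stand-in for Gossiper.instance (all names and the String endpoints below are illustrative, not Cassandra's types):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the corrected onChange condition: look the endpoint's
// gossip state up by address, then skip reconnection when that state is
// dead. The map stands in for Gossiper.instance, for illustration only.
class DeadStateCheck
{
    enum EndpointState { NORMAL, DEAD }

    static final Map<String, EndpointState> gossipState = new HashMap<>();

    static EndpointState getEndpointStateForEndpoint(String endpoint)
    {
        return gossipState.get(endpoint);
    }

    static boolean isDeadState(EndpointState epState)
    {
        return epState == EndpointState.DEAD;
    }

    // Mirrors the fixed condition: fetch the state for the endpoint rather
    // than referencing an epState parameter onChange never had.
    static boolean shouldReconnect(String endpoint, boolean preferLocal)
    {
        return preferLocal && !isDeadState(getEndpointStateForEndpoint(endpoint));
    }
}
```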
[7/8] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
	CHANGES.txt
	src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c6e43798
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c6e43798
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c6e43798

Branch: refs/heads/cassandra-2.1
Commit: c6e4379831958a48f421d394a70ddc307341b5bf
Parents: a925262 738229b
Author: Brandon Williams
Authored: Fri Apr 17 18:43:29 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:43:29 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
[1/8] cassandra git commit: Don't initiate snitch reconnection for dead states
Repository: cassandra

Updated Branches:
  refs/heads/cassandra-2.0 54140bfde -> 738229bd7
  refs/heads/cassandra-2.1 a92526239 -> c6e437983
  refs/heads/trunk 4040ba8e7 -> f0d4705e6

Don't initiate snitch reconnection for dead states

Patch by brandonwilliams, reviewed by John Alberts for CASSANDRA-7292

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54140bfd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54140bfd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54140bfd

Branch: refs/heads/cassandra-2.1
Commit: 54140bfde562361562489879f720e78e0ea0eac7
Parents: 724384a
Author: Brandon Williams
Authored: Fri Apr 17 17:36:55 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:36:55 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a492c74..6c546c4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.15:
+ * Don't initiate snitch reconnection for dead states (CASSANDRA-7292)
  * Fix ArrayIndexOutOfBoundsException in CQLSSTableWriter (CASSANDRA-8978)
  * Add shutdown gossip state to prevent timeouts during rolling restarts (CASSANDRA-8336)
  * Fix running with java.net.preferIPv6Addresses=true (CASSANDRA-9137)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index d797393..1642561 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -21,10 +21,7 @@ package org.apache.cassandra.locator;

 import java.net.InetAddress;
 import java.net.UnknownHostException;
-import org.apache.cassandra.gms.ApplicationState;
-import org.apache.cassandra.gms.EndpointState;
-import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
-import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.gms.*;
 import org.apache.cassandra.net.MessagingService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -78,13 +75,13 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
-        if (preferLocal && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
             reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[6/8] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
	CHANGES.txt
	src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c6e43798
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c6e43798
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c6e43798

Branch: refs/heads/trunk
Commit: c6e4379831958a48f421d394a70ddc307341b5bf
Parents: a925262 738229b
Author: Brandon Williams
Authored: Fri Apr 17 18:43:29 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:43:29 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
[4/8] cassandra git commit: Fix commit
Fix commit

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/738229bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/738229bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/738229bd

Branch: refs/heads/trunk
Commit: 738229bd7900b21a83ea322b1e394b8c20c0b82f
Parents: 54140bf
Author: Brandon Williams
Authored: Fri Apr 17 18:42:38 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:42:38 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/738229bd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index 1642561..e5dbdeb 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -81,7 +81,7 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(Gossiper.instance.getEndpointStateForEndpoint(endpoint)) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[2/8] cassandra git commit: Don't initiate snitch reconnection for dead states
Don't initiate snitch reconnection for dead states

Patch by brandonwilliams, reviewed by John Alberts for CASSANDRA-7292

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54140bfd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54140bfd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54140bfd

Branch: refs/heads/trunk
Commit: 54140bfde562361562489879f720e78e0ea0eac7
Parents: 724384a
Author: Brandon Williams
Authored: Fri Apr 17 17:36:55 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:36:55 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a492c74..6c546c4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.15:
+ * Don't initiate snitch reconnection for dead states (CASSANDRA-7292)
  * Fix ArrayIndexOutOfBoundsException in CQLSSTableWriter (CASSANDRA-8978)
  * Add shutdown gossip state to prevent timeouts during rolling restarts (CASSANDRA-8336)
  * Fix running with java.net.preferIPv6Addresses=true (CASSANDRA-9137)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index d797393..1642561 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -21,10 +21,7 @@ package org.apache.cassandra.locator;

 import java.net.InetAddress;
 import java.net.UnknownHostException;
-import org.apache.cassandra.gms.ApplicationState;
-import org.apache.cassandra.gms.EndpointState;
-import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
-import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.gms.*;
 import org.apache.cassandra.net.MessagingService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -78,13 +75,13 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
-        if (preferLocal && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
             reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[5/8] cassandra git commit: Fix commit
Fix commit

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/738229bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/738229bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/738229bd

Branch: refs/heads/cassandra-2.0
Commit: 738229bd7900b21a83ea322b1e394b8c20c0b82f
Parents: 54140bf
Author: Brandon Williams
Authored: Fri Apr 17 18:42:38 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 18:42:38 2015 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/ReconnectableSnitchHelper.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/738229bd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index 1642561..e5dbdeb 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -81,7 +81,7 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(Gossiper.instance.getEndpointStateForEndpoint(endpoint)) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[jira] [Comment Edited] (CASSANDRA-9181) Improve index versus secondary index selection
[ https://issues.apache.org/jira/browse/CASSANDRA-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500842#comment-14500842 ]

Jeff Jirsa edited comment on CASSANDRA-9181 at 4/17/15 11:04 PM:
-----------------------------------------------------------------

For what it's worth, I've tested this on 2.0.14 and 3.0 (trunk) using ccm (RF=1, N=3), and it appears to be working as intended: if the partition key is included, the query hits only one node; if no partition key is included, it hits all nodes in the cluster. Your reproduced-in version is 2.0.7 - have you seen this in more recent versions? Attaching logs for review.

On 2.0.14: https://gist.github.com/jeffjirsa/5c0f63395269a85cdcb2
On trunk: https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

The data generator for schema/etc is at the top of https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

was (Author: jjirsa):
For what it's worth, I've tested this on 2.0.14 and 3.0 (trunk), and it appears to be working as intended (if partition key is included, it's hitting only one node; if no partition key is included, it's hitting all nodes in the cluster). Your reproduced-in is 2.0.7 - have you seen this in more recent versions? Attaching logs for review.

On 2.0.14: https://gist.github.com/jeffjirsa/5c0f63395269a85cdcb2
On trunk: https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

The data generator for schema/etc is top of https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

> Improve index versus secondary index selection
> ----------------------------------------------
>
>                 Key: CASSANDRA-9181
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9181
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jeremy Hanna
>              Labels: 2i
>             Fix For: 3.0
>
>
> There is a special case for secondary indexes if you always supply the partition key. For example, suppose you have a family with ID "a456" that has 6 family members, and a secondary index on first name. Currently, if you run a query like "select * from families where id = 'a456' and firstname = 'alowishus';", the query trace shows that it first scans the entire cluster based on firstname, then looks for the key within that result.
> If it's not terribly invasive, I think this would be a valid use case to narrow down the results by key first.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9181) Improve index versus secondary index selection
[ https://issues.apache.org/jira/browse/CASSANDRA-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500842#comment-14500842 ]

Jeff Jirsa commented on CASSANDRA-9181:
---------------------------------------

For what it's worth, I've tested this on 2.0.14 and 3.0 (trunk), and it appears to be working as intended: if the partition key is included, the query hits only one node; if no partition key is included, it hits all nodes in the cluster. Your reproduced-in version is 2.0.7 - have you seen this in more recent versions? Attaching logs for review.

On 2.0.14: https://gist.github.com/jeffjirsa/5c0f63395269a85cdcb2
On trunk: https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

The data generator for schema/etc is at the top of https://gist.github.com/jeffjirsa/87e2b95113e3366bc00b

> Improve index versus secondary index selection
> ----------------------------------------------
>
>                 Key: CASSANDRA-9181
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9181
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jeremy Hanna
>              Labels: 2i
>             Fix For: 3.0
>
>
> There is a special case for secondary indexes if you always supply the partition key. For example, suppose you have a family with ID "a456" that has 6 family members, and a secondary index on first name. Currently, if you run a query like "select * from families where id = 'a456' and firstname = 'alowishus';", the query trace shows that it first scans the entire cluster based on firstname, then looks for the key within that result.
> If it's not terribly invasive, I think this would be a valid use case to narrow down the results by key first.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
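The behavior Jeff's test observed can be sketched conceptually: when the query restricts the partition key, the coordinator can send the secondary-index lookup only to that key's replicas; without the key, it must fan the index scan out to every node. The token/replica model below is a toy for illustration, not Cassandra's partitioner:

```java
import java.util.List;

// Conceptual sketch of secondary-index query routing: a restricted
// partition key narrows the query to one replica set, while an
// unrestricted index scan touches the whole cluster. Toy model only.
class IndexQueryRouting
{
    static final List<String> nodes = List.of("node1", "node2", "node3");

    // Toy partitioner: hash the partition key onto one of the nodes.
    static String replicaFor(String partitionKey)
    {
        return nodes.get(Math.abs(partitionKey.hashCode()) % nodes.size());
    }

    // With a partition key, only its replica is contacted; with null
    // (no key restriction), every node must run the index scan.
    static List<String> targets(String partitionKeyOrNull)
    {
        return partitionKeyOrNull == null
             ? nodes
             : List.of(replicaFor(partitionKeyOrNull));
    }
}
```

This is why the trace in the original report (a full-cluster scan despite `id = 'a456'` being supplied) would be a bug, and why the comment above checked whether the key-restricted form really contacts only one node.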
[3/4] cassandra git commit: Don't initiate snitch reconnection for dead states
Don't initiate snitch reconnection for dead states

Patch by brandonwilliams, reviewed by John Alberts for CASSANDRA-7292

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9252623
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9252623
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9252623

Branch: refs/heads/trunk
Commit: a925262395f1e735ada0ba35c8e41042be1807fb
Parents: b4fae85
Author: Brandon Williams
Authored: Fri Apr 17 17:36:55 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:38:44 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9252623/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 80ab11c..2777d79 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -81,6 +81,7 @@
  * Use stdout for progress and stats in sstableloader (CASSANDRA-8982)
  * Correctly identify 2i datadir from older versions (CASSANDRA-9116)
 Merged from 2.0:
+ * Don't initiate snitch reconnection for dead states (CASSANDRA-7292)
  * Fix ArrayIndexOutOfBoundsException in CQLSSTableWriter (CASSANDRA-8978)
  * Add shutdown gossip state to prevent timeouts during rolling restarts (CASSANDRA-8336)
  * Fix running with java.net.preferIPv6Addresses=true (CASSANDRA-9137)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9252623/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index d797393..1642561 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -21,10 +21,7 @@ package org.apache.cassandra.locator;

 import java.net.InetAddress;
 import java.net.UnknownHostException;
-import org.apache.cassandra.gms.ApplicationState;
-import org.apache.cassandra.gms.EndpointState;
-import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
-import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.gms.*;
 import org.apache.cassandra.net.MessagingService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -78,13 +75,13 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
-        if (preferLocal && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
             reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[1/4] cassandra git commit: Don't initiate snitch reconnection for dead states
Repository: cassandra

Updated Branches:
  refs/heads/cassandra-2.0 724384ab0 -> 54140bfde
  refs/heads/cassandra-2.1 b4fae8557 -> a92526239
  refs/heads/trunk 11dfc0253 -> 4040ba8e7

Don't initiate snitch reconnection for dead states

Patch by brandonwilliams, reviewed by John Alberts for CASSANDRA-7292

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54140bfd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54140bfd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54140bfd

Branch: refs/heads/cassandra-2.0
Commit: 54140bfde562361562489879f720e78e0ea0eac7
Parents: 724384a
Author: Brandon Williams
Authored: Fri Apr 17 17:36:55 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:36:55 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a492c74..6c546c4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.15:
+ * Don't initiate snitch reconnection for dead states (CASSANDRA-7292)
  * Fix ArrayIndexOutOfBoundsException in CQLSSTableWriter (CASSANDRA-8978)
  * Add shutdown gossip state to prevent timeouts during rolling restarts (CASSANDRA-8336)
  * Fix running with java.net.preferIPv6Addresses=true (CASSANDRA-9137)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54140bfd/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index d797393..1642561 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -21,10 +21,7 @@ package org.apache.cassandra.locator;

 import java.net.InetAddress;
 import java.net.UnknownHostException;
-import org.apache.cassandra.gms.ApplicationState;
-import org.apache.cassandra.gms.EndpointState;
-import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
-import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.gms.*;
 import org.apache.cassandra.net.MessagingService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -78,13 +75,13 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
-        if (preferLocal && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
             reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[4/4] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4040ba8e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4040ba8e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4040ba8e

Branch: refs/heads/trunk
Commit: 4040ba8e7bd290881f16d1f7b4161b744998364b
Parents: 11dfc02 a925262
Author: Brandon Williams
Authored: Fri Apr 17 17:38:56 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:38:56 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4040ba8e/CHANGES.txt
----------------------------------------------------------------------
[2/4] cassandra git commit: Don't initiate snitch reconnection for dead states
Don't initiate snitch reconnection for dead states

Patch by brandonwilliams, reviewed by John Alberts for CASSANDRA-7292

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9252623
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9252623
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9252623

Branch: refs/heads/cassandra-2.1
Commit: a925262395f1e735ada0ba35c8e41042be1807fb
Parents: b4fae85
Author: Brandon Williams
Authored: Fri Apr 17 17:36:55 2015 -0500
Committer: Brandon Williams
Committed: Fri Apr 17 17:38:44 2015 -0500

----------------------------------------------------------------------
 CHANGES.txt                                                 | 1 +
 .../apache/cassandra/locator/ReconnectableSnitchHelper.java | 9 +++--
 2 files changed, 4 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9252623/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 80ab11c..2777d79 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -81,6 +81,7 @@
  * Use stdout for progress and stats in sstableloader (CASSANDRA-8982)
  * Correctly identify 2i datadir from older versions (CASSANDRA-9116)
 Merged from 2.0:
+ * Don't initiate snitch reconnection for dead states (CASSANDRA-7292)
  * Fix ArrayIndexOutOfBoundsException in CQLSSTableWriter (CASSANDRA-8978)
  * Add shutdown gossip state to prevent timeouts during rolling restarts (CASSANDRA-8336)
  * Fix running with java.net.preferIPv6Addresses=true (CASSANDRA-9137)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9252623/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
index d797393..1642561 100644
--- a/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
+++ b/src/java/org/apache/cassandra/locator/ReconnectableSnitchHelper.java
@@ -21,10 +21,7 @@ package org.apache.cassandra.locator;

 import java.net.InetAddress;
 import java.net.UnknownHostException;
-import org.apache.cassandra.gms.ApplicationState;
-import org.apache.cassandra.gms.EndpointState;
-import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
-import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.gms.*;
 import org.apache.cassandra.net.MessagingService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -78,13 +75,13 @@ public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber

     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
-        if (preferLocal && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
             reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }

     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
-        if (preferLocal && state == ApplicationState.INTERNAL_IP)
+        if (preferLocal && !Gossiper.instance.isDeadState(epState) && state == ApplicationState.INTERNAL_IP)
             reconnect(endpoint, value);
     }
[jira] [Commented] (CASSANDRA-7292) Can't seed new node into ring with (public) ip of an old node
[ https://issues.apache.org/jira/browse/CASSANDRA-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500811#comment-14500811 ] John Alberts commented on CASSANDRA-7292: - I was able to confirm that this patch does indeed fix the problem I was having. The patch was built against tag 'cassandra-2.0.11' running on amazon linux 2014.03. [~brandon.williams] Thank you for all of your help with providing a fix for this issue. Can't wait until it's in an official release package. > Can't seed new node into ring with (public) ip of an old node > - > > Key: CASSANDRA-7292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7292 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 2.0.7, Ec2MultiRegionSnitch >Reporter: Juho Mäkinen >Assignee: Brandon Williams > Labels: bootstrap, gossip > Fix For: 2.0.15, 2.1.5 > > Attachments: 7292.txt, cassandra-replace-address.log > > > This bug prevents node to return with bootstrap into the cluster with its old > ip. > Scenario: five node ec2 cluster spread into three AZ, all in one region. I'm > using Ec2MultiRegionSnitch. Nodes are reported with their public ips (as > Ec2MultiRegionSnitch requires) > I simulated a loss of one node by terminating one instance. nodetool status > reported correctly that node was down. 
Then I launched new instance with the > old public ip (i'm using elastic ips) with > "Dcassandra.replace_address=IP_ADDRESS" but the new node can't join the > cluster: > INFO 07:20:43,424 Gathering node replacement information for /54.86.191.30 > INFO 07:20:43,428 Starting Messaging Service on port 9043 > INFO 07:20:43,489 Handshaking version with /54.86.171.10 > INFO 07:20:43,491 Handshaking version with /54.86.187.245 > (some delay) > ERROR 07:21:14,445 Exception encountered during startup > java.lang.RuntimeException: Unable to gossip with any seeds > at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193) > at > org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:419) > at > org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:650) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:505) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569) > It does not help if I remove the "Dcassandra.replace_address=IP_ADDRESS" > system property. > Also it does not help to remove the node with "nodetool removenode" with or > without the cassandra.replace_address property. 
> I think this is because the node information is preserved in the gossip info > as seen this output of "nodetool gossipinfo" > /54.86.191.30 > INTERNAL_IP:172.16.1.231 > DC:us-east > REMOVAL_COORDINATOR:REMOVER,d581309a-8610-40d4-ba30-cb250eda22a8 > STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664 > HOST_ID:19311925-46b5-4fe4-928a-321e8adb731d > RPC_ADDRESS:0.0.0.0 > NET_VERSION:7 > SCHEMA:226f9315-b4b2-32c1-bfe1-f4bb49fccfd5 > RACK:1b > LOAD:7.075290515E9 > SEVERITY:0.0 > RELEASE_VERSION:2.0.7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9107) More accurate row count estimates
[ https://issues.apache.org/jira/browse/CASSANDRA-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500779#comment-14500779 ] Sam Tunnicliffe commented on CASSANDRA-9107: Fair enough, +1 from me then. > More accurate row count estimates > - > > Key: CASSANDRA-9107 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9107 > Project: Cassandra > Issue Type: Improvement >Reporter: Chris Lohfink >Assignee: Chris Lohfink > Attachments: 9107-cassandra2-1.patch > > > Currently the estimated row count from cfstats is the sum of the number of > rows in all the sstables. This becomes very inaccurate with wide rows or > heavily updated datasets since the same partition would exist in many > sstables. In example: > {code} > create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > create TABLE wide (key text PRIMARY KEY , value text) WITH compaction = > {'class': 'SizeTieredCompactionStrategy', 'min_threshold': 30, > 'max_threshold': 100} ; > --- > insert INTO wide (key, value) VALUES ('key', 'value'); > // flush > // cfstats output: Number of keys (estimate): 1 (128 in older version from > index) > insert INTO wide (key, value) VALUES ('key', 'value'); > // flush > // cfstats output: Number of keys (estimate): 2 (256 in older version from > index) > ... etc > {code} > previously it used the index but it still did it per sstable and summed them > up which became inaccurate as there are more sstables (just by much worse). > With new versions of sstables we can merge the cardinalities to resolve this > with a slight hit to accuracy in the case of every sstable having completely > unique partitions. > Furthermore I think it would be pretty minimal effort to include the number > of rows in the memtables to this count. We wont have the cardinality merging > between memtables and sstables but I would consider that a relatively minor > negative. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9107) More accurate row count estimates
[ https://issues.apache.org/jira/browse/CASSANDRA-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500746#comment-14500746 ] Chris Lohfink commented on CASSANDRA-9107: -- I like having the MT (memtable) count included; when people run simple small tests, the inserted rows will show up right away. I think it can confuse people if they insert some data and the value doesn't go up. > More accurate row count estimates > - > > Key: CASSANDRA-9107 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9107 > Project: Cassandra > Issue Type: Improvement >Reporter: Chris Lohfink >Assignee: Chris Lohfink > Attachments: 9107-cassandra2-1.patch > > > Currently the estimated row count from cfstats is the sum of the number of > rows in all the sstables. This becomes very inaccurate with wide rows or > heavily updated datasets since the same partition would exist in many > sstables. In example: > {code} > create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > create TABLE wide (key text PRIMARY KEY , value text) WITH compaction = > {'class': 'SizeTieredCompactionStrategy', 'min_threshold': 30, > 'max_threshold': 100} ; > --- > insert INTO wide (key, value) VALUES ('key', 'value'); > // flush > // cfstats output: Number of keys (estimate): 1 (128 in older version from > index) > insert INTO wide (key, value) VALUES ('key', 'value'); > // flush > // cfstats output: Number of keys (estimate): 2 (256 in older version from > index) > ... etc > {code} > previously it used the index but it still did it per sstable and summed them > up which became inaccurate as there are more sstables (just by much worse). > With new versions of sstables we can merge the cardinalities to resolve this > with a slight hit to accuracy in the case of every sstable having completely > unique partitions. > Furthermore I think it would be pretty minimal effort to include the number > of rows in the memtables to this count. 
We wont have the cardinality merging > between memtables and sstables but I would consider that a relatively minor > negative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
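The difference between summing per-sstable row counts and merging cardinalities, as discussed in this ticket, can be illustrated with a toy sketch. Exact `HashSet`s stand in for the per-sstable cardinality estimators (Cassandra actually stores probabilistic HyperLogLog-style sketches, which merge the same way but with bounded error); this is an assumed simplification, not the real code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Why merging per-sstable cardinalities beats summing row counts
// (CASSANDRA-9107): the same partition flushed into several sstables
// is counted once instead of once per sstable.
public class RowCountEstimateSketch {
    // The old estimate: sum the per-sstable counts, double-counting
    // any partition present in more than one sstable.
    static long sumOfCounts(List<Set<String>> sstablePartitions) {
        return sstablePartitions.stream().mapToLong(s -> s.size()).sum();
    }

    // The merged estimate: union the per-sstable sets, so shared
    // partitions are deduplicated across sstables.
    static long mergedEstimate(List<Set<String>> sstablePartitions) {
        Set<String> merged = new HashSet<>();
        sstablePartitions.forEach(merged::addAll);
        return merged.size();
    }

    public static void main(String[] args) {
        // The same partition "key" flushed into three sstables.
        List<Set<String>> sstables = List.of(
            Set.of("key"), Set.of("key"), Set.of("key", "other"));
        System.out.println(sumOfCounts(sstables));    // 4: overcounts the shared partition
        System.out.println(mergedEstimate(sstables)); // 2: deduplicated across sstables
    }
}
```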
[jira] [Updated] (CASSANDRA-9196) Do not rebuild indexes if no columns are actually indexed
[ https://issues.apache.org/jira/browse/CASSANDRA-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-9196: --- Attachment: 2.1-CASSANDRA-9196.txt The patch looks good for 2.0, but it won't work for 2.1/trunk. The reason is that {{indexes()}} now takes a {{CellName}} rather than a {{ByteBuffer}} containing the column name and we can't construct one in {{maybeRebuildIndex}}. I've attached an alternative patch for 2.1+ that adds an {{indexes(ColumnDefinition)}} overload, with the default implementation on {{SecondaryIndex}} simply checking if the supplied {{ColumnDefinition}} is present in the index's {{columnDefs}}. > Do not rebuild indexes if no columns are actually indexed > - > > Key: CASSANDRA-9196 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9196 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Sergio Bossa >Assignee: Sergio Bossa > Fix For: 2.0.15 > > Attachments: 2.0-CASSANDRA-9196.txt, 2.1-CASSANDRA-9196.txt > > > When rebuilding secondary indexes, the index task is executed regardless if > the actual {{SecondaryIndex#indexes(ByteBuffer )}} implementation of any > index returns true for any column, meaning that the expensive task of going > through all sstables and related rows will be executed even if in the end no > column/row will be actually indexed. > This is a huge performance hit when i.e. bootstrapping with large datasets on > tables having custom secondary index implementations whose {{indexes()}} > implementation might return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
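The shape of the overload described in the comment above can be sketched as follows. The types here are hypothetical simplified stand-ins for Cassandra's `ColumnDefinition` and `SecondaryIndex`, showing only the default-implementation idea (membership in `columnDefs`), not the real class hierarchy:

```java
import java.util.Set;

// Sketch of the 2.1 approach: an indexes(ColumnDefinition) overload whose
// default implementation just checks membership in the index's columnDefs.
public class IndexesOverloadSketch {
    // Stand-in for Cassandra's ColumnDefinition (records compare by value,
    // so Set.contains works as expected).
    record ColumnDefinition(String name) {}

    static abstract class SecondaryIndex {
        final Set<ColumnDefinition> columnDefs;
        SecondaryIndex(Set<ColumnDefinition> columnDefs) { this.columnDefs = columnDefs; }

        // Default implementation: the index covers a column iff the supplied
        // definition is among its configured columns. Custom index
        // implementations can override this.
        boolean indexes(ColumnDefinition cd) {
            return columnDefs.contains(cd);
        }
    }

    static class DummyIndex extends SecondaryIndex {
        DummyIndex(Set<ColumnDefinition> defs) { super(defs); }
    }

    public static void main(String[] args) {
        SecondaryIndex idx = new DummyIndex(Set.of(new ColumnDefinition("value")));
        // A rebuild can be skipped entirely when indexes(...) is false for
        // every column, avoiding a pointless full sstable scan.
        System.out.println(idx.indexes(new ColumnDefinition("value"))); // true
        System.out.println(idx.indexes(new ColumnDefinition("other"))); // false
    }
}
```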
[Cassandra Wiki] Update of "ClientOptions" by KarlLehenbauer
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "ClientOptions" page has been changed by KarlLehenbauer: https://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=192&rev2=193 Comment: add link to Tcl client * [[https://github.com/jbochi/lua-resty-cassandra|lua-resty-cassandra]] * Dart * [[https://github.com/achilleasa/dart_cassandra_cql|dart_cassandra_cql]] + * Tcl + * [[https://github.com/flightaware/casstcl|casstcl]] = Thrift = For older Thrift clients, see ClientOptionsThrift.
[jira] [Comment Edited] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500581#comment-14500581 ] Jason Brown edited comment on CASSANDRA-9206 at 4/17/15 8:15 PM: - It's when things back up or slow down, and the queue gets deep, that I have a small degree of concern. Remember that the Gossip stage is single-threaded, so it's not hard to see that queue backing up. was (Author: jasobrown): It's when things back up or slow down, and the queue gets deep, that I have a small degree of concern. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500581#comment-14500581 ] Jason Brown commented on CASSANDRA-9206: It's when things back up or slow down, and the queue gets deep, that I have a small degree of concern. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500576#comment-14500576 ] Jonathan Ellis commented on CASSANDRA-9206: --- Is processing 500 gossip messages per second per seed really that big a deal? > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
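The ticket's {noformat} snippet and the proposed change can be put side by side in a small runnable sketch. The names (`sendGossip`, `liveEndpoints`, etc.) are simplified stand-ins for the real Gossiper internals, and the boolean-returning helpers are an assumed restructuring for testability:

```java
import java.util.Random;

// The current probabilistic seed-gossip decision from CASSANDRA-9206,
// next to the proposed replacement: always gossip with a seed.
public class SeedGossipSketch {
    static final Random random = new Random(42);

    // Today: gossip with a seed only with probability
    // seeds / (live + unreachable).
    static boolean gossipWithSeedCurrent(int seeds, int live, int unreachable) {
        double probability = seeds / (double) (live + unreachable);
        double randDbl = random.nextDouble();
        return randDbl <= probability;
    }

    // Proposed: unconditionally contact a seed every gossip round.
    static boolean gossipWithSeedProposed() {
        return true;
    }

    public static void main(String[] args) {
        // In a 1000-node cluster with one seed, the current code contacts
        // the seed in roughly 1 round out of 1000.
        int hits = 0;
        for (int round = 0; round < 100_000; round++)
            if (gossipWithSeedCurrent(1, 990, 10)) hits++;
        System.out.println(hits); // on the order of 100 of 100,000 rounds (p = 0.001)
        System.out.println(gossipWithSeedProposed()); // always true
    }
}
```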
[jira] [Comment Edited] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500571#comment-14500571 ] Jason Brown edited comment on CASSANDRA-9206 at 4/17/15 8:08 PM: - Actually, if all we really care about doing is increasing the fanout from a fixed 1 to 2 nodes per gossip round (to aid in convergence), we could just select two peers at random, rather than always selecting one random peer plus one seed (as per the change in this ticket). This way seeds do not get the extra load, and we still achieve the increased gossip sessions via a larger fanout. Granted, this would increase the number of gossip sessions per round from a current max of 3 to a max of 4, but then gossiping to a seed and a down node are probabilistic anyway. was (Author: jasobrown): Actually, if all we really care about doing is increasing the fanout from a fixed 1 to 2 nodes per gossip round (to aid in convergence), we could just select two peers at random, rather than always selecting one random peer plus one seed (as per the change in this ticket). This way seeds do not get the extra load, and we still achieve the increased gossip sessions via a larger fanout. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. 
This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500571#comment-14500571 ] Jason Brown commented on CASSANDRA-9206: Actually, if all we really care about doing is increasing the fanout from a fixed 1 to 2 nodes per gossip round (to aid in convergence), we could just select two peers at random, rather than always selecting one random peer plus one seed (as per the change in this ticket). This way seeds do not get the extra load, and we still achieve the increased gossip sessions via a larger fanout. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500565#comment-14500565 ] Jason Brown commented on CASSANDRA-9206: TBH, I'm kinda +0 on this ticket. While I agree the original motivation behind the probabilistic desire to contact seeds is a bit spurious/funky/undocumented, I'm not completely convinced adding more traffic will help much in cluster convergence. For small clusters (less than 20 nodes), there will be near zero impact, so I don't have much problem in that case - but then, they probably don't suffer from the problems we're trying to address here. However, for larger clusters (greater than 500 nodes), I think the extra messaging might be an issue. The problem I see is that when things slow down, and you have a very low number of seed nodes (i.e. less than 5), the gossip messages will back up on those nodes and we'll spend a lot of cycles just trying to broadcast the same redundant data over and over again. What's worse is that the operator won't really have any great insight to discover that gossip (our membership dissemination protocol) is contributing to things going weird; and, thus, the advice to "add more seeds" is neither obvious nor simple, in some cases. (I'm thinking of Netflix's Priam programmed to use up to two nodes per availability zone as seeds. It would require a non-trivial effort to change that core assumption, fwiw.) Further, in 3.0, we've now split the OTCP by message size, not function. Thus, all the excess gossip messages on the seeds could start interfering with the normal read/write traffic. Also, we will not create a spanning tree by increasing the number of nodes contacted during a gossip round. What that does is increase the fanout (the number of nodes contacted) from a fixed size of 1 to 2. We still have randomly selected peers at every step, and not a static or dynamic tree that covers all nodes from a given sender. 
Lastly, there is a minor error in the number of messages to be generated: in a cluster of 1000 nodes, we will start 1000 more gossip sessions to the seeds, and each gossip session comprises 3 messages. Thus, the message count is 3000. If you are actually running a cluster that large, and the network can't sustain that extra load, you're probably screwed anyway. While this might help in convergence (primarily for heartbeat dissemination), the trade-off is more (non-directed) traffic. All in all (and thinking while I'm typing), this patch is probably fine for the vast majority of use cases, and if anything, the clarity in the code that will come from it should be worthwhile. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
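The message-count correction in that comment is simple arithmetic, sketched below. The three-message breakdown (SYN, ACK, ACK2) follows the comment's "each gossip session is comprised of 3 messages"; the assumption of one gossip round per second per node comes from the ticket discussion:

```java
// Back-of-envelope from the comment above: 1000 extra gossip sessions per
// second toward the seeds, at 3 messages per session (GossipDigestSyn,
// GossipDigestAck, GossipDigestAck2), means ~3000 extra messages/second.
public class GossipLoadSketch {
    static final int MESSAGES_PER_SESSION = 3;

    // One extra seed-directed session per node per round, one round/second.
    static int extraMessagesPerSecond(int nodes) {
        return nodes * MESSAGES_PER_SESSION;
    }

    public static void main(String[] args) {
        System.out.println(extraMessagesPerSecond(1000)); // 3000
    }
}
```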
[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds
[ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500554#comment-14500554 ] Brandon Williams commented on CASSANDRA-8072: - The reason this only manifests with shadow gossip is that shadow is the only time we send precisely one round; under normal gossip conditions we fire once per second, but probably lose the first message as well. > Exception during startup: Unable to gossip with any seeds > - > > Key: CASSANDRA-8072 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8072 > Project: Cassandra > Issue Type: Bug >Reporter: Ryan Springer >Assignee: Brandon Williams > Fix For: 2.0.15, 2.1.5 > > Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, > cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, > cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, > casandra-system-log-with-assert-patch.log, trace_logs.tar.bz2 > > > When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster > in either ec2 or locally, an error occurs sometimes with one of the nodes > refusing to start C*. 
The error in the /var/log/cassandra/system.log is: > ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) > Exception encountered during startup > java.lang.RuntimeException: Unable to gossip with any seeds > at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200) > at > org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444) > at > org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:609) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:502) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585) > INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java > (line 1279) Announcing shutdown > INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 > MessagingService.java (line 701) Waiting for messaging service to quiesce > INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 > MessagingService.java (line 941) MessagingService has terminated the accept() > thread > This errors does not always occur when provisioning a 2-node cluster, but > probably around half of the time on only one of the nodes. I haven't been > able to reproduce this error with DSC 2.0.9, and there have been no code or > definition file changes in Opscenter. > I can reproduce locally with the above steps. I'm happy to test any proposed > fixes since I'm the only person able to reproduce reliably so far. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7292) Can't seed new node into ring with (public) ip of an old node
[ https://issues.apache.org/jira/browse/CASSANDRA-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500557#comment-14500557 ] John Alberts commented on CASSANDRA-7292: - [~brandon.williams] I was able to get this patch to fix my problem last night but tried again today and couldn't reproduce. I think the db had issues from multiple restarts, version switches, etc. I'm going to start from scratch with a new cluster, re-test, and I'll get back to you. > Can't seed new node into ring with (public) ip of an old node > - > > Key: CASSANDRA-7292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7292 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 2.0.7, Ec2MultiRegionSnitch >Reporter: Juho Mäkinen >Assignee: Brandon Williams > Labels: bootstrap, gossip > Fix For: 2.0.15, 2.1.5 > > Attachments: 7292.txt, cassandra-replace-address.log > > > This bug prevents node to return with bootstrap into the cluster with its old > ip. > Scenario: five node ec2 cluster spread into three AZ, all in one region. I'm > using Ec2MultiRegionSnitch. Nodes are reported with their public ips (as > Ec2MultiRegionSnitch requires) > I simulated a loss of one node by terminating one instance. nodetool status > reported correctly that node was down. 
Then I launched new instance with the > old public ip (i'm using elastic ips) with > "Dcassandra.replace_address=IP_ADDRESS" but the new node can't join the > cluster: > INFO 07:20:43,424 Gathering node replacement information for /54.86.191.30 > INFO 07:20:43,428 Starting Messaging Service on port 9043 > INFO 07:20:43,489 Handshaking version with /54.86.171.10 > INFO 07:20:43,491 Handshaking version with /54.86.187.245 > (some delay) > ERROR 07:21:14,445 Exception encountered during startup > java.lang.RuntimeException: Unable to gossip with any seeds > at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193) > at > org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:419) > at > org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:650) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:505) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569) > It does not help if I remove the "Dcassandra.replace_address=IP_ADDRESS" > system property. > Also it does not help to remove the node with "nodetool removenode" with or > without the cassandra.replace_address property. 
> I think this is because the node information is preserved in the gossip info > as seen this output of "nodetool gossipinfo" > /54.86.191.30 > INTERNAL_IP:172.16.1.231 > DC:us-east > REMOVAL_COORDINATOR:REMOVER,d581309a-8610-40d4-ba30-cb250eda22a8 > STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664 > HOST_ID:19311925-46b5-4fe4-928a-321e8adb731d > RPC_ADDRESS:0.0.0.0 > NET_VERSION:7 > SCHEMA:226f9315-b4b2-32c1-bfe1-f4bb49fccfd5 > RACK:1b > LOAD:7.075290515E9 > SEVERITY:0.0 > RELEASE_VERSION:2.0.7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds
[ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500545#comment-14500545 ] Brandon Williams commented on CASSANDRA-8072: - After deep packet inspection, I believe I've found the root non-reconnectable snitch part of this issue. When you decom a node, it never correctly tears down its ITC pools, which leaves the other side with a dead OTC pool: {noformat} tcp1 0 10.208.8.123:33441 10.208.8.63:7000CLOSE_WAIT 18401/java {noformat} Now when you try to bootstrap with the same IP, the shadow syn is correctly sent and the ack reply is built and queued, but MS tries to use the now default OTC pool and the message never makes it back to the node, since it just sends RSTs which finally kills the connection. But since the syn is only sent once, the seed has nothing else to send the node and never reestablishes the connection, leaving the bootstrapping node thinking it never talked to a seed and throwing this error. > Exception during startup: Unable to gossip with any seeds > - > > Key: CASSANDRA-8072 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8072 > Project: Cassandra > Issue Type: Bug >Reporter: Ryan Springer >Assignee: Brandon Williams > Fix For: 2.0.15, 2.1.5 > > Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, > cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, > cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, > casandra-system-log-with-assert-patch.log, trace_logs.tar.bz2 > > > When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster > in either ec2 or locally, an error occurs sometimes with one of the nodes > refusing to start C*. 
The error in the /var/log/cassandra/system.log is: > ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) > Exception encountered during startup > java.lang.RuntimeException: Unable to gossip with any seeds > at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200) > at > org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444) > at > org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:609) > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:502) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585) > INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java > (line 1279) Announcing shutdown > INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 > MessagingService.java (line 701) Waiting for messaging service to quiesce > INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 > MessagingService.java (line 941) MessagingService has terminated the accept() > thread > This error does not always occur when provisioning a 2-node cluster, but > probably around half of the time on only one of the nodes. I haven't been > able to reproduce this error with DSC 2.0.9, and there have been no code or > definition file changes in Opscenter. > I can reproduce locally with the above steps. I'm happy to test any proposed > fixes since I'm the only person able to reproduce reliably so far. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500512#comment-14500512 ] Marcus Eriksson commented on CASSANDRA-6696: bq. Specifically, what vnodes are assigned to what disk? What vnode is an sstable responsible for? It should be possible to get that information without running sstablemetadata against every sstable file. Yes, in theory we could create subdirectories per vnode, for example; then we would get the sstables very easily. But again, we can do this after we commit this; please create a new ticket that depends on this. > Partition sstables by token range > - > > Key: CASSANDRA-6696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Labels: compaction, correctness, dense-storage, performance > Fix For: 3.0 > > > In JBOD, when someone gets a bad drive, the bad drive is replaced with a new > empty one and repair is run. > This can cause deleted data to come back in some cases. This is also true for > corrupt sstables, in which case we delete the corrupt sstable and run repair. > Here is an example: > Say we have 3 nodes A, B and C, RF=3 and GC grace=10 days. > row=sankalp col=sankalp was written 20 days back and successfully went to all > three nodes. > Then a delete/tombstone was written successfully for the same row column 15 > days back. > Since this tombstone is more than gc grace old, it got compacted away in nodes A and B > when it got compacted with the actual data. So there is no trace of this row > column in nodes A and B. > Now in node C, say the original data is in drive1 and the tombstone is in drive2. > Compaction has not yet reclaimed the data and tombstone. > Drive2 becomes corrupt and is replaced with a new empty drive. > Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp > has come back to life. 
> Now after replacing the drive we run repair. This data will be propagated to > all nodes. > Note: This is still a problem even if we run repair every gc grace. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500508#comment-14500508 ] Marcus Eriksson commented on CASSANDRA-6696: bq. It might be nice to have this be configurable. This one flush per drive still results in L0 having sstables that overlap with almost all of L1 on a per drive basis. If you flush to X ranges per drive, then you can get some of the benefits of more parallel L0->L1 promotion even if you only have one drive. It might, or you just set up multiple data directories for the same drive. We can improve this later, please create a ticket that depends on this. > Partition sstables by token range > - > > Key: CASSANDRA-6696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Labels: compaction, correctness, dense-storage, performance > Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500496#comment-14500496 ] Jeremiah Jordan commented on CASSANDRA-6696: bq. multi threaded flushing - one thread per disk, splits the owned token range evenly over the drives It might be nice to have this be configurable. This one flush per drive still results in L0 having sstables that overlap with almost all of L1 on a per drive basis. If you flush to X ranges per drive, then you can get some of the benefits of more parallel L0->L1 promotion even if you only have one drive. > Partition sstables by token range > - > > Key: CASSANDRA-6696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Labels: compaction, correctness, dense-storage, performance > Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
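The "splits the owned token range evenly over the drives" step being discussed can be sketched with simple arithmetic. The following standalone illustration is not Cassandra's actual code, and all names are invented:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of dividing an owned token span into one contiguous,
// roughly equal-width flush range per data directory (drive).
public class RangeSplitter {

    // Returns `drives` [lo, hi) pairs covering [start, end).
    static List<long[]> split(long start, long end, int drives) {
        List<long[]> ranges = new ArrayList<>();
        long width = (end - start) / drives;
        for (int i = 0; i < drives; i++) {
            long lo = start + i * width;
            long hi = (i == drives - 1) ? end : lo + width; // last range absorbs any remainder
            ranges.add(new long[]{lo, hi});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(0, 100, 4))
            System.out.println(r[0] + " -> " + r[1]);
    }
}
```

Flushing to X ranges per drive, as suggested in the comment, would amount to calling this with `drives * X` subranges and assigning X of them to each directory.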
[jira] [Created] (CASSANDRA-9210) CQL Tests Should Operate Over All Types (etc)
Benedict created CASSANDRA-9210: --- Summary: CQL Tests Should Operate Over All Types (etc) Key: CASSANDRA-9210 URL: https://issues.apache.org/jira/browse/CASSANDRA-9210 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict With some refactoring our CQL tests could cover all possible types for a given operation, and potentially different positions in the clustering component of a primary key, as well as differing adjacent items of data in the table being queried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500443#comment-14500443 ] Andy Tolbert commented on CASSANDRA-9131: - [~benedict], thanks. I agree that a client timestamp implementation in the drivers that is monotonically increasing, while avoiding the possibility of operations sharing the same timestamp, would be ideal. I agree with [~slebresne] that a reference algorithm would be very helpful so all drivers implement this in a consistent way. I'll bring this topic up with the team. I've seen it raised in a number of issues with differing opinions; is the current perspective that, going forward, client-provided timestamps will be preferred over server timestamps? Depending on the answer, I can see it being more or less important for clients to implement this properly. The current behavior in all the drivers, to my knowledge, is that using client timestamps has to be explicitly enabled, and only the Python and Java drivers have a way to enable it for all queries. > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. 
> From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. 
These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
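One possible shape for the monotonically increasing client-timestamp generator discussed in this thread (strictly increasing microsecond timestamps from a single client, even across a repeated leap second) is sketched below. This is illustrative only, not any driver's actual implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative monotonic client-timestamp generator: microsecond timestamps
// that never repeat or move backwards for one client process, even if the
// wall clock steps back (e.g. across an inserted leap second).
// Not taken from any driver's source.
public class MonotonicTimestamps {
    private final AtomicLong last = new AtomicLong();

    public long next() {
        while (true) {
            long now = System.currentTimeMillis() * 1000; // wall clock, in micros
            long prev = last.get();
            // If the clock stalled or stepped back, bump by 1us instead.
            long candidate = now > prev ? now : prev + 1;
            if (last.compareAndSet(prev, candidate))
                return candidate;
        }
    }

    public static void main(String[] args) {
        MonotonicTimestamps gen = new MonotonicTimestamps();
        long prev = gen.next();
        for (int i = 0; i < 10_000; i++) {
            long t = gen.next();
            if (t <= prev) throw new AssertionError("not monotonic");
            prev = t;
        }
    }
}
```

The price of monotonicity is that under sustained bursts of more than one operation per microsecond, or after a backwards clock step, generated timestamps temporarily run ahead of the wall clock.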
[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500329#comment-14500329 ] Nick Bailey commented on CASSANDRA-6696: I'd also like to mention that we should consider what the best way to expose this new information to operators is. Specifically, what vnodes are assigned to what disk? What vnode is an sstable responsible for? It should be possible to get that information without running sstablemetadata against every sstable file. > Partition sstables by token range > - > > Key: CASSANDRA-6696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Labels: compaction, correctness, dense-storage, performance > Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: rename
Repository: cassandra Updated Branches: refs/heads/trunk e983956c2 -> 11dfc0253 rename Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11dfc025 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11dfc025 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11dfc025 Branch: refs/heads/trunk Commit: 11dfc025305113f5cfeac4151fb72cee2e6f83f9 Parents: e983956 Author: Jonathan Ellis Authored: Fri Apr 17 12:48:28 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 12:48:28 2015 -0500 -- .../apache/cassandra/config/CFMetaDataTest.java | 150 - .../config/LegacySchemaTablesTest.java | 150 + .../org/apache/cassandra/schema/DefsTest.java | 568 +++ .../schema/LegacySchemaTablesTest.java | 568 --- 4 files changed, 718 insertions(+), 718 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/11dfc025/test/unit/org/apache/cassandra/config/CFMetaDataTest.java -- diff --git a/test/unit/org/apache/cassandra/config/CFMetaDataTest.java b/test/unit/org/apache/cassandra/config/CFMetaDataTest.java deleted file mode 100644 index 5fed5be..000 --- a/test/unit/org/apache/cassandra/config/CFMetaDataTest.java +++ /dev/null @@ -1,150 +0,0 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. 
See the License for the - * specific language governing permissions and limitations - * under the License. - */ -package org.apache.cassandra.config; - -import java.util.ArrayList; -import java.util.List; -import java.util.HashMap; -import java.util.HashSet; - -import org.apache.cassandra.SchemaLoader; -import org.apache.cassandra.db.*; -import org.apache.cassandra.db.marshal.AsciiType; -import org.apache.cassandra.db.marshal.UTF8Type; -import org.apache.cassandra.exceptions.ConfigurationException; -import org.apache.cassandra.io.compress.*; -import org.apache.cassandra.locator.SimpleStrategy; -import org.apache.cassandra.schema.LegacySchemaTables; -import org.apache.cassandra.service.StorageService; -import org.apache.cassandra.thrift.CfDef; -import org.apache.cassandra.thrift.ColumnDef; -import org.apache.cassandra.thrift.IndexType; -import org.apache.cassandra.thrift.ThriftConversion; -import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; - -import org.junit.BeforeClass; -import org.junit.Test; - -import static org.junit.Assert.assertEquals; - -public class CFMetaDataTest -{ -private static final String KEYSPACE1 = "CFMetaDataTest1"; -private static final String CF_STANDARD1 = "Standard1"; - -private static List columnDefs = new ArrayList(); - -static -{ -columnDefs.add(new ColumnDef(ByteBufferUtil.bytes("col1"), AsciiType.class.getCanonicalName()) -.setIndex_name("col1Index") -.setIndex_type(IndexType.KEYS)); - -columnDefs.add(new ColumnDef(ByteBufferUtil.bytes("col2"), UTF8Type.class.getCanonicalName()) -.setIndex_name("col2Index") -.setIndex_type(IndexType.KEYS)); -} - -@BeforeClass -public static void defineSchema() throws ConfigurationException -{ -SchemaLoader.prepareServer(); -SchemaLoader.createKeyspace(KEYSPACE1, -SimpleStrategy.class, -KSMetaData.optsWithRF(1), -SchemaLoader.standardCFMD(KEYSPACE1, CF_STANDARD1)); -} - -@Test -public void testThriftConversion() throws Exception -{ -CfDef cfDef = new 
CfDef().setDefault_validation_class(AsciiType.class.getCanonicalName()) - .setComment("Test comment") - .setColumn_metadata(columnDefs) - .setKeyspace(KEYSPACE1) - .setName(CF_STANDARD1); - -// convert Thrift to CFMetaData -CFMetaData cfMetaData = ThriftConversion.from
[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500283#comment-14500283 ] Nick Bailey commented on CASSANDRA-6696: So I just want to mention on here that the current approach here isn't going to help us much with CASSANDRA-4756. If you don't update your compaction strategy, sstables will contain data from many vnodes so things aren't much different than now. If you do use the new compaction strategy, things are slightly better in that levels 1 or higher are split per vnode and you could deduplicate that data, but level 0 won't be so you'll still be forced to overstream anything in level 0. We may want to revisit a new approach to CASSANDRA-4756, specifically one that isn't compaction strategy specific. > Partition sstables by token range > - > > Key: CASSANDRA-6696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Labels: compaction, correctness, dense-storage, performance > Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9209) Add static analysis to report any AutoCloseable objects that are not encapsulated in a try/finally block
Benedict created CASSANDRA-9209: --- Summary: Add static analysis to report any AutoCloseable objects that are not encapsulated in a try/finally block Key: CASSANDRA-9209 URL: https://issues.apache.org/jira/browse/CASSANDRA-9209 Project: Cassandra Issue Type: Improvement Reporter: Benedict Fix For: 3.0 Shouldn't be too tricky, and would help us potentially avoid a number of bugs. A follow up would be to enable optional ref counting (or at least leak detection) at run time for AutoCloseable objects, possibly only for those we care about, but also possible via bytecode weaving so that we could capture all of them without question. (/cc [~tjake]) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
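The discipline the proposed static analysis would enforce is that every AutoCloseable is closed in a finally block; Java's try-with-resources statement compiles to exactly that pattern. A small illustrative sketch (class and method names are invented):

```java
import java.io.*;

// The shape CASSANDRA-9209's proposed check would accept: every
// AutoCloseable closed in a finally block, here via try-with-resources.
public class CloseableDiscipline {

    // Passes the check: reader.close() runs even if readLine() throws.
    static String firstLine(File f) {
        try (BufferedReader reader = new BufferedReader(new FileReader(f))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: write a one-line temp file (also try-with-resources).
    static File writeTemp(String content) {
        try {
            File f = File.createTempFile("demo", ".txt");
            f.deleteOnExit();
            try (PrintWriter w = new PrintWriter(f)) {
                w.println(content);
            }
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (!"hello".equals(firstLine(writeTemp("hello"))))
            throw new AssertionError();
    }
}
```

A checker (or the run-time leak detection the ticket floats) would flag any `new FileReader(...)` whose close is not guaranteed on every exit path.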
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500274#comment-14500274 ] Jonathan Ellis commented on CASSANDRA-9206: --- LGTM > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
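The quoted snippet's probability can be worked through directly to see the scale the ticket describes; a standalone sketch (not Cassandra source):

```java
// Sketch of the seed-gossip probability quoted above (standalone, not Cassandra code).
public class SeedGossipProbability {

    // probability = seeds / (live + unreachable), as in the quoted snippet
    static double seedProbability(int seeds, int live, int unreachable) {
        return seeds / (double) (live + unreachable);
    }

    public static void main(String[] args) {
        // 1000-node cluster, single seed: each gossip round a node has only a
        // 1-in-1000 chance of talking to the seed, slowing convergence.
        double p = seedProbability(1, 1000, 0);
        // Always gossiping instead costs the seed roughly one message per node
        // per second (~1000 msg/s here), which the ticket argues is negligible.
        if (p != 0.001) throw new AssertionError();
    }
}
```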
[jira] [Updated] (CASSANDRA-8801) Decommissioned nodes are willing to rejoin the cluster if restarted
[ https://issues.apache.org/jira/browse/CASSANDRA-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8801: -- Reviewer: Tyler Hobbs [~thobbs] to review > Decommissioned nodes are willing to rejoin the cluster if restarted > --- > > Key: CASSANDRA-8801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8801 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Stevens >Assignee: Brandon Williams > Fix For: 3.0 > > Attachments: 8801.txt > > > This issue comes from the Cassandra user group. > If a node which was successfully decommissioned gets restarted with its data > directory intact, it will rejoin the cluster immediately, going to {{UN}} and > beginning to serve client requests. > This is wrong - the node has consistency issues, having missed any writes > while it was offline because no hinted handoffs were being kept. And in the > best case scenario (it's spotted and remediated immediately), near-100% > overstreaming will still occur. > Also, whatever reasons the operator had for decommissioning the node would > presumably still be valid, so this action may threaten cluster stability if > the node is underpowered or suffering hardware issues. > But what elevates this to critical is that if the node had been offline > longer than gc_grace_seconds, it may cause permanent and unrecoverable > consistency issues due to data resurrection. > h3. Recommendation: > A node should remember that it was decommissioned and refuse to rejoin a > cluster without at least a -D flag forcing it to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
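The recommendation above could take the shape of a persisted marker checked at startup. The sketch below is illustrative only; the system property and file handling are hypothetical, not what any patch actually does:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of the ticket's recommendation: persist a marker when a node is
// decommissioned and refuse to rejoin unless an explicit -D flag overrides it.
// The property name and file handling here are hypothetical.
public class DecommissionGuard {

    // Start is allowed only if no marker exists, or the operator forced it.
    static boolean mayStart(File marker, boolean override) {
        return !marker.exists() || override;
    }

    // Demo helper: create a marker file as decommission would.
    static File touchMarker() {
        try {
            File f = File.createTempFile("decommissioned", ".marker");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File marker = touchMarker();
        boolean override = Boolean.getBoolean("cassandra.override_decommission"); // hypothetical flag
        if (mayStart(marker, override) && !override)
            throw new AssertionError("must refuse to start without the override flag");
    }
}
```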
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500259#comment-14500259 ] Sylvain Lebresne commented on CASSANDRA-9131: - bq. if we should offer some sample code Yes, I suspect it would be appreciated if we were to provide some kind of example/reference algorithm for this (maybe just in form of some pseudo-code), probably in the protocol spec documentation. > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500227#comment-14500227 ] Brandon Williams commented on CASSANDRA-9206: - I did some digging as to why the probability code exists, and wasn't able to find much. It came over the wall with facebook, and there's no mention of it in the scuttlebutt paper (nor seeds at all, nor probabilistically gossiping with unreachable members) so I'm not sure what the original reasoning was for it. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8437) Track digest mismatch ratio
[ https://issues.apache.org/jira/browse/CASSANDRA-8437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-8437: -- Attachment: CASSANDRA-8437-V3.txt Changed the patch as suggested and modified NodeTool for backward compatibility. > Track digest mismatch ratio > --- > > Key: CASSANDRA-8437 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8437 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Benjamin Lerer >Priority: Minor > Fix For: 2.1.5 > > Attachments: CASSANDRA-8437-V2.txt, CASSANDRA-8437-V3.txt, > CASSANDRA-8437.txt > > > I don't believe we track how often a read results in a digest mismatch, but we > should, since that could directly impact read performance in practice. > Once we have that data, it might be that some workloads (write heavy most > likely) end up with enough mismatches that going to the data read is more > efficient in practice. What we do about it is step 2, however; getting the > data is easy enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8043) Native Protocol V4
[ https://issues.apache.org/jira/browse/CASSANDRA-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500222#comment-14500222 ] Michael Penick edited comment on CASSANDRA-8043 at 4/17/15 5:21 PM: [~slebresne] There's no type on the field for the new error bodies, "Read_failure" or "Write_failure". It doesn't seem to be there for the "Read_timeout" or "Write_timeout" either. I like the " is a\(n\) \[type\]" format of the other fields; this field is an outlier in that regard. was (Author: mpenick): [~slebresne] There's no type on the field for the new error bodies, "Read_failure" or "Write_failure". It doesn't seem to be there for the "Read_timeout" or "Write_timeout" either. I like the " is a(n) [type]" format of the other fields and is an outlier in that regard. > Native Protocol V4 > -- > > Key: CASSANDRA-8043 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8043 > Project: Cassandra > Issue Type: Task >Reporter: Sylvain Lebresne > Labels: client-impacting, protocolv4 > Fix For: 3.0 > > > We have a bunch of issues that will require a protocol v4; this ticket is > just a meta ticket to group them all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8043) Native Protocol V4
[ https://issues.apache.org/jira/browse/CASSANDRA-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500222#comment-14500222 ] Michael Penick commented on CASSANDRA-8043: --- [~slebresne] There's no type on the field for the new error bodies, "Read_failure" or "Write_failure". It doesn't seem to be there for the "Read_timeout" or "Write_timeout" either. I like the " is a(n) [type]" format of the other fields; this field is an outlier in that regard. > Native Protocol V4 > -- > > Key: CASSANDRA-8043 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8043 > Project: Cassandra > Issue Type: Task >Reporter: Sylvain Lebresne > Labels: client-impacting, protocolv4 > Fix For: 3.0 > > > We have a bunch of issues that will require a protocol v4; this ticket is > just a meta ticket to group them all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Git Push Summary
Repository: cassandra Updated Tags: refs/tags/cassandra-2.1.5-tentative [created] b4fae8557
Git Push Summary
Repository: cassandra Updated Tags: refs/tags/cassandra-2.1.5-tentative [deleted] 3c17ac6e1
[jira] [Commented] (CASSANDRA-8609) Remove dependency of hadoop on internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500199#comment-14500199 ] Sylvain Lebresne commented on CASSANDRA-8609: - [~philipthompson] So is this just a duplicate of CASSANDRA-8358 or will we need something more for this? > Remove dependency of hadoop on internals (Cell/CellName) > - > > Key: CASSANDRA-8609 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8609 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne >Assignee: Philip Thompson > Fix For: 3.0 > > Attachments: CASSANDRA-8609-3.0-branch.txt > > > For some reason most of the Hadoop code (ColumnFamilyRecordReader, > CqlStorage, ...) uses the {{Cell}} and {{CellName}} classes. That dependency > is entirely artificial: all this code is really client code that communicates > with Cassandra over thrift/native protocol and there is thus no reason for it > to use internal classes. And in fact, those classes are used in a very crude > way, as a {{Pair}} really. > But this dependency is really painful when we make changes to the internals. > Further, every time we do so, I believe we break some of those APIs due > to the change. This has been painful for CASSANDRA-5417 and is now > painful for CASSANDRA-8099. While I somewhat hacked around it in > CASSANDRA-5417, that was a mistake and we should have removed the dependency > back then. So let's do that now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9208) Setting rpc_interface in cassandra.yaml causes NPE during startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams resolved CASSANDRA-9208. - Resolution: Duplicate > Setting rpc_interface in cassandra.yaml causes NPE during startup > - > > Key: CASSANDRA-9208 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9208 > Project: Cassandra > Issue Type: Bug > Components: Config > Environment: Windows and RHEL >Reporter: Sandeep More > Attachments: SuggestedDataBaseDescriptor.diff > > > In the cassandra.yaml file, when the "rpc_interface" option is set it causes an NPE > (stack trace at the end). > Upon further investigation it turns out that there is a serious problem in > the way this logic is handled in DatabaseDescriptor.java (#374). > Following is the code snippet > else if (conf.rpc_interface != null) > { > listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, > "rpc_interface"); > } > else > { > rpcAddress = FBUtilities.getLocalAddress(); > } > If you notice, > 1) The code above sets the "listenAddress" instead of "rpcAddress". > 2) The function getNetworkInterfaceAddress() blindly assumes that it is > called to set the "listenAddress" (see line 171). The "configName" variable > passed to the function is royally ignored and only used for printing out the > exception (which again is misleading). > I am also attaching a suggested patch (NOTE: the patch tries to address this > issue; the function getNetworkInterfaceAddress() needs revision). > INFO 15:36:56 Windows environment detected. 
DiskAccessMode set to standard, > indexAccessMode standard > INFO 15:36:56 Global memtable on-heap threshold is enabled at 503MB > INFO 15:36:56 Global memtable off-heap threshold is enabled at 503MB > ERROR 15:37:50 Fatal error during configuration loading > java.lang.NullPointerException: null > at > org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:411) > ~[apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:133) > ~[apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:164) > [apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) > [apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622) > [apache-cassandra-2.1.4.jar:2.1.4] > null > Fatal error during configuration loading; unable to start. See log for > stacktrace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
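The shape of the fix the reporter suggests can be sketched in isolation (a toy model with Strings standing in for InetAddress; the helper below is a hypothetical stand-in for the real getNetworkInterfaceAddress(), which resolves an address from a named network interface, and this is not the attached patch):

```java
public class RpcInterfaceSketch
{
    // Hypothetical stand-in for getNetworkInterfaceAddress(). Per the
    // report, the real method should use configName in its error
    // messages instead of assuming it was called for listen_address.
    static String getNetworkInterfaceAddress(String interfaceName, String configName)
    {
        return "addr(" + interfaceName + ")";
    }

    // Corrected branch: when rpc_interface is set, assign the RPC
    // address, not the listen address as the buggy code did.
    static String resolveRpcAddress(String rpcInterface)
    {
        if (rpcInterface != null)
            return getNetworkInterfaceAddress(rpcInterface, "rpc_interface");
        return "localhost"; // stands in for FBUtilities.getLocalAddress()
    }
}
```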
[jira] [Updated] (CASSANDRA-9208) Setting rpc_interface in cassandra.yaml causes NPE during startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9208: Labels: (was: easyfix patch) > Setting rpc_interface in cassandra.yaml causes NPE during startup > - > > Key: CASSANDRA-9208 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9208 > Project: Cassandra > Issue Type: Bug > Components: Config > Environment: Windows and RHEL >Reporter: Sandeep More > Attachments: SuggestedDataBaseDescriptor.diff > > > In the cassandra.yaml file, when the "rpc_interface" option is set it causes an NPE > (stack trace at the end). > Upon further investigation it turns out that there is a serious problem in > the way this logic is handled in DatabaseDescriptor.java (#374). > Following is the code snippet > else if (conf.rpc_interface != null) > { > listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, > "rpc_interface"); > } > else > { > rpcAddress = FBUtilities.getLocalAddress(); > } > If you notice, > 1) The code above sets the "listenAddress" instead of "rpcAddress". > 2) The function getNetworkInterfaceAddress() blindly assumes that it is > called to set the "listenAddress" (see line 171). The "configName" variable > passed to the function is royally ignored and only used for printing out the > exception (which again is misleading). > I am also attaching a suggested patch (NOTE: the patch tries to address this > issue; the function getNetworkInterfaceAddress() needs revision). > INFO 15:36:56 Windows environment detected. 
DiskAccessMode set to standard, > indexAccessMode standard > INFO 15:36:56 Global memtable on-heap threshold is enabled at 503MB > INFO 15:36:56 Global memtable off-heap threshold is enabled at 503MB > ERROR 15:37:50 Fatal error during configuration loading > java.lang.NullPointerException: null > at > org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:411) > ~[apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:133) > ~[apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:164) > [apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) > [apache-cassandra-2.1.4.jar:2.1.4] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622) > [apache-cassandra-2.1.4.jar:2.1.4] > null > Fatal error during configuration loading; unable to start. See log for > stacktrace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500186#comment-14500186 ] Benedict commented on CASSANDRA-9131: - [~andrew.tolbert]: if you feel like taking a look at CASSANDRA-6106, this could be a useful approach for the java-driver (and be adapted to other drivers). Whether or not it uses the microsecond time is kind of irrelevant to the point at hand (although potentially also helpful in itself), but the approach to staggering the time corrections is very applicable. This would prevent the "only have fewer than 1000 inserts in a leap second" problem, because the 1s shift backwards in time would be spread over the following minute, with each second taking around 20ms longer to elapse than it otherwise would. Either my or Sylvain's method of updating the clock time would suffice, and be a tremendous improvement to behaviour here in the drivers. > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. 
> From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. 
These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
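The interleaving the ticket describes is reproducible from the reported writetimes alone: sorting the six (value, writetime) pairs by writetime alternates between the two passes through 23:59:59. A standalone reconstruction from the ticket's data (not Cassandra code):

```java
import java.util.Arrays;
import java.util.Comparator;

public class LeapSecondInterleaving
{
    // Returns the values ordered by their writetime, mirroring the
    // WRITETIME(bar)-sorted output shown in the ticket.
    static long[] valuesByWritetime(long[][] valueAndWritetime)
    {
        long[][] sorted = valueAndWritetime.clone();
        Arrays.sort(sorted, Comparator.comparingLong((long[] p) -> p[1]));
        long[] values = new long[sorted.length];
        for (int i = 0; i < sorted.length; i++)
            values[i] = sorted[i][0];
        return values;
    }

    public static void main(String[] args)
    {
        // Writetimes from the ticket: 579-581 were inserted during the
        // first 23:59:59, 582-584 during the leap second (also 23:59:59).
        long[][] writes = {
            { 579, 1435708799339000L }, { 580, 1435708799643000L }, { 581, 1435708799958000L },
            { 582, 1435708799285000L }, { 583, 1435708799593000L }, { 584, 1435708799897000L },
        };
        // Prints [582, 579, 583, 580, 584, 581]: the two seconds interleave.
        System.out.println(Arrays.toString(valuesByWritetime(writes)));
    }
}
```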
cassandra git commit: Fix sigar message about swap
Repository: cassandra Updated Branches: refs/heads/trunk ae3edb2ab -> e983956c2 Fix sigar message about swap Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e983956c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e983956c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e983956c Branch: refs/heads/trunk Commit: e983956c2883c8b7c85f7f6700c19ffd9a4a7e54 Parents: ae3edb2 Author: Brandon Williams Authored: Fri Apr 17 11:55:38 2015 -0500 Committer: Brandon Williams Committed: Fri Apr 17 11:55:46 2015 -0500 -- src/java/org/apache/cassandra/utils/SigarLibrary.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e983956c/src/java/org/apache/cassandra/utils/SigarLibrary.java -- diff --git a/src/java/org/apache/cassandra/utils/SigarLibrary.java b/src/java/org/apache/cassandra/utils/SigarLibrary.java index be85977..7cf4d71 100644 --- a/src/java/org/apache/cassandra/utils/SigarLibrary.java +++ b/src/java/org/apache/cassandra/utils/SigarLibrary.java @@ -140,11 +140,11 @@ public class SigarLibrary long swapSize = swap.getTotal(); if (swapSize > 0) { -return false; +return true; } else { -return true; +return false; } } catch (SigarException sigarException)
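The diff swaps the two return values so the method reports true exactly when the total swap size is positive. The corrected branch collapses to a one-liner; this is a sketch of that logic only, not the committed method (which obtains the size via Sigar and also handles SigarException):

```java
public class SwapCheckSketch
{
    // True when the OS reports a positive total swap size, matching the
    // corrected behaviour of the patched method in SigarLibrary.
    static boolean hasSwap(long totalSwapBytes)
    {
        return totalSwapBytes > 0;
    }
}
```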
[jira] [Resolved] (CASSANDRA-8766) SSTableRewriter opens all sstables as EARLY before completing the compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-8766. - Resolution: Duplicate > SSTableRewriter opens all sstables as EARLY before completing the compaction > > > Key: CASSANDRA-8766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8766 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.1.5 > > > In CASSANDRA-8320, we made the rewriter call switchWriter() inside of > finish(); in CASSANDRA-8124, switchWriter() was made to open its data as EARLY. > This combination means we no longer honour disabling of early opening, which > is potentially a problem on Windows for the deletion of the contents (which > is why we disable early opening on Windows). > I've commented on CASSANDRA-8124, as I suspect I'm missing something about > this. Although I have no doubt the old behaviour of opening as a TMP file > reduced the window for problems, and opening as TMPLINK now does the same, > it's not entirely clear to me it's the right fix (though it may be), since we > shouldn't be susceptible to this window anyway. Either way, we perhaps need > to come up with something else, because this could potentially break Windows > support. Perhaps if we simply did not swap in the TMPLINK file, so that it > never actually gets mapped, it would be enough. [~JoshuaMcKenzie], > WDYT? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9206: Attachment: (was: 9206.txt) > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9206: Attachment: 9206.txt > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500127#comment-14500127 ] Andy Tolbert edited comment on CASSANDRA-9131 at 4/17/15 4:36 PM: -- {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now, only the python-driver and java-driver have a mechanism to automatically set client timestamps. The other drivers that support protocol v3 have a mechanism for the client to specify the timestamp (they could also use 'USING TIMESTAMP', I suppose), so it will be up to the user / time implementation. Most of the drivers should have an active means of setting the client timestamp through the API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but clients can provide their own. If > 999 entries for the same millisecond, it will reuse the same timestamp, so some timestamps may not be distinct. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) was (Author: andrew.tolbert): {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now only the python-driver and java-driver have a mechanism to enable automatically set client timestamps. The other drivers that support protocol v3 have a mechanism on the client specifying the timestamp (they could also use 'USING TIMESTAMP' i suppose), so it will be up to user / time implementation. Most of the drivers should have an active means of setting client timestamp through API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but client can provide their own. If > 999 entries for same millisecond, will reloop over the same millisecond, so its not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the b
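One simple way to get the monotonic client-side timestamps discussed here is a compare-and-set loop over the last issued value. This is a sketch of the general strategy only, not the java-driver's actual AbstractTimestampGenerator, and it differs slightly from the behaviour described above (it keeps timestamps strictly increasing rather than reusing a timestamp after 999 entries in one millisecond):

```java
import java.util.concurrent.atomic.AtomicLong;

public class MonotonicTimestamps
{
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    // Next microsecond-precision timestamp; never goes backwards even if
    // the wall clock does (e.g. when a leap second repeats 23:59:59).
    // In real use, wallClockMillis would come from System.currentTimeMillis().
    long next(long wallClockMillis)
    {
        while (true)
        {
            long prev = last.get();
            long candidate = Math.max(wallClockMillis * 1000, prev + 1);
            if (last.compareAndSet(prev, candidate))
                return candidate;
        }
    }
}
```

The CAS loop makes the generator safe to share across threads: a racing thread simply retries against the newer "last" value.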
[jira] [Comment Edited] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500127#comment-14500127 ] Andy Tolbert edited comment on CASSANDRA-9131 at 4/17/15 4:28 PM: -- {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now only the python-driver and java-driver have a mechanism to enable automatically set client timestamps. The other drivers that support protocol v3 have a mechanism on the client specifying the timestamp (they could also use 'USING TIMESTAMP' i suppose), so it will be up to user / time implementation. Most of the drivers should have an active means of setting client timestamp through API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but client can provide their own. If > 999 entries for same millisecond, will reloop over the same millisecond, so its not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) was (Author: andrew.tolbert): {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now only the python-driver and java-driver have a mechanism to enable automatically set client timestamps. The other drivers that support protocol v3 have a mechanism on the client specifying the timestamp (they could also use 'USING TIMESTAMP' i suppose), so it will be up to user / time implementation. Most of the drivers will have an active means of setting client timestamp through API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but client can provide their own. If > 999 entries for same millisecond, will reloop over the same millisecond, so its not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the be
[jira] [Comment Edited] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500127#comment-14500127 ] Andy Tolbert edited comment on CASSANDRA-9131 at 4/17/15 4:27 PM: -- {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now only the python-driver and java-driver have a mechanism to enable automatically set client timestamps. The other drivers that support protocol v3 have a mechanism on the client specifying the timestamp (they could also use 'USING TIMESTAMP' i suppose), so it will be up to user / time implementation. Most of the drivers will have an active means of setting client timestamp through API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but client can provide their own. If > 999 entries for same millisecond, will reloop over the same millisecond, so its not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) was (Author: andrew.tolbert): {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now only the python-driver and java-driver have a mechanism to enable automatically set client timestamps. The other drivers that support 2.1 have a mechanism of the client specifying the timestamp (they could also use 'USING TIMESTAMP' i suppose), so it will be up to user / time implementation. Most of the drivers will have an active means of setting client timestamp through API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but client can provide their own. If > 999 entries for same millisecond, will reloop over the same millisecond, so its not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of
[6/6] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ae3edb2a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ae3edb2a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ae3edb2a Branch: refs/heads/trunk Commit: ae3edb2abee8c847b4b76349ba5dece54450ebac Parents: 0f72f79 b4fae85 Author: Jonathan Ellis Authored: Fri Apr 17 11:24:27 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:27 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ae3edb2a/src/java/org/apache/cassandra/utils/MurmurHash.java --
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500127#comment-14500127 ] Andy Tolbert commented on CASSANDRA-9131: - {quote} if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it {quote} This is something I'm actively looking at on the drivers side for each driver. As of now, only the python-driver and java-driver have a mechanism to automatically set client timestamps. The other drivers that support 2.1 have a mechanism for the client to specify the timestamp (they could also use 'USING TIMESTAMP', I suppose), so it will be up to the user's time implementation. Most of the drivers will have an active means of setting client timestamps through their API by June 30th 2015. * python-driver: [Session use_client_timestamp|http://datastax.github.io/python-driver/api/cassandra/cluster.html?highlight=timestamp#cassandra.cluster.Session.use_client_timestamp] not monotonic, uses time.time(). * java-driver: [TimestampGenerator|http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/TimestampGenerator.html]. All client implementations are monotonic, but clients can provide their own. If there are > 999 entries for the same millisecond, it will reloop over the same millisecond, so it's not completely monotonic. 
Possible problem if > 1000 entries during leap second ([code|https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/AbstractTimestampGenerator.java]) > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. 
During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581
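The driver-side mitigation discussed above (remember the last timestamp handed out and never go backwards, counting upward within the repeated interval) can be sketched in Python. This is an illustrative sketch, not the actual driver API; unlike the real java-driver generator it simply counts past the current microsecond rather than relooping within a millisecond.

```python
import threading
import time

class MonotonicTimestampGenerator:
    """Hand out strictly increasing microsecond timestamps, even if the
    wall clock repeats itself (e.g. across an inserted leap second)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def next_timestamp(self):
        with self._lock:
            now = int(time.time() * 1_000_000)  # wall clock, microseconds
            # If the clock repeated or stepped backwards, keep counting
            # upward from the last value issued instead of going back.
            self._last = max(now, self._last + 1)
            return self._last
```

A generator like this trades accuracy for ordering: during the repeated second it issues timestamps slightly ahead of the wall clock, but writes can no longer appear reordered.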
[4/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b4fae855 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b4fae855 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b4fae855 Branch: refs/heads/trunk Commit: b4fae85578b1bd31d162be9cb58b03c0be9f853f Parents: 5d88ff4 724384a Author: Jonathan Ellis Authored: Fri Apr 17 11:24:19 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:19 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b4fae855/src/java/org/apache/cassandra/utils/MurmurHash.java --
[3/6] cassandra git commit: comment Murmur incompatibility
comment Murmur incompatibility Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/724384ab Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/724384ab Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/724384ab Branch: refs/heads/trunk Commit: 724384ab05e4a6bf3cacb1732641968d37e8c391 Parents: 9bbcbf5 Author: Jonathan Ellis Authored: Fri Apr 17 11:23:46 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:07 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/724384ab/src/java/org/apache/cassandra/utils/MurmurHash.java -- diff --git a/src/java/org/apache/cassandra/utils/MurmurHash.java b/src/java/org/apache/cassandra/utils/MurmurHash.java index 9dcde6d..c02fdcc 100644 --- a/src/java/org/apache/cassandra/utils/MurmurHash.java +++ b/src/java/org/apache/cassandra/utils/MurmurHash.java @@ -24,8 +24,10 @@ import java.nio.ByteBuffer; * lookup. See http://murmurhash.googlepages.com/ for more details. * * hash32() and hash64() are MurmurHash 2.0. - * hash3_x64_128() is MurmurHash 3.0. * + * hash3_x64_128() is *almost* MurmurHash 3.0. It was supposed to match, but we didn't catch a sign bug with + * the result that it doesn't. Unfortunately, we can't change it now without breaking Murmur3Partitioner. * + * * * The C version of MurmurHash 2.0 found at that site was ported to Java by * Andrzej Bialecki (ab at getopt org).
[2/6] cassandra git commit: comment Murmur incompatibility
comment Murmur incompatibility Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/724384ab Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/724384ab Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/724384ab Branch: refs/heads/cassandra-2.1 Commit: 724384ab05e4a6bf3cacb1732641968d37e8c391 Parents: 9bbcbf5 Author: Jonathan Ellis Authored: Fri Apr 17 11:23:46 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:07 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/724384ab/src/java/org/apache/cassandra/utils/MurmurHash.java -- diff --git a/src/java/org/apache/cassandra/utils/MurmurHash.java b/src/java/org/apache/cassandra/utils/MurmurHash.java index 9dcde6d..c02fdcc 100644 --- a/src/java/org/apache/cassandra/utils/MurmurHash.java +++ b/src/java/org/apache/cassandra/utils/MurmurHash.java @@ -24,8 +24,10 @@ import java.nio.ByteBuffer; * lookup. See http://murmurhash.googlepages.com/ for more details. * * hash32() and hash64() are MurmurHash 2.0. - * hash3_x64_128() is MurmurHash 3.0. * + * hash3_x64_128() is *almost* MurmurHash 3.0. It was supposed to match, but we didn't catch a sign bug with + * the result that it doesn't. Unfortunately, we can't change it now without breaking Murmur3Partitioner. * + * * * The C version of MurmurHash 2.0 found at that site was ported to Java by * Andrzej Bialecki (ab at getopt org).
[5/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b4fae855 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b4fae855 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b4fae855 Branch: refs/heads/cassandra-2.1 Commit: b4fae85578b1bd31d162be9cb58b03c0be9f853f Parents: 5d88ff4 724384a Author: Jonathan Ellis Authored: Fri Apr 17 11:24:19 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:19 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b4fae855/src/java/org/apache/cassandra/utils/MurmurHash.java --
[1/6] cassandra git commit: comment Murmur incompatibility
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 9bbcbf505 -> 724384ab0 refs/heads/cassandra-2.1 5d88ff4e4 -> b4fae8557 refs/heads/trunk 0f72f79d5 -> ae3edb2ab comment Murmur incompatibility Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/724384ab Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/724384ab Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/724384ab Branch: refs/heads/cassandra-2.0 Commit: 724384ab05e4a6bf3cacb1732641968d37e8c391 Parents: 9bbcbf5 Author: Jonathan Ellis Authored: Fri Apr 17 11:23:46 2015 -0500 Committer: Jonathan Ellis Committed: Fri Apr 17 11:24:07 2015 -0500 -- src/java/org/apache/cassandra/utils/MurmurHash.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/724384ab/src/java/org/apache/cassandra/utils/MurmurHash.java -- diff --git a/src/java/org/apache/cassandra/utils/MurmurHash.java b/src/java/org/apache/cassandra/utils/MurmurHash.java index 9dcde6d..c02fdcc 100644 --- a/src/java/org/apache/cassandra/utils/MurmurHash.java +++ b/src/java/org/apache/cassandra/utils/MurmurHash.java @@ -24,8 +24,10 @@ import java.nio.ByteBuffer; * lookup. See http://murmurhash.googlepages.com/ for more details. * * hash32() and hash64() are MurmurHash 2.0. - * hash3_x64_128() is MurmurHash 3.0. * + * hash3_x64_128() is *almost* MurmurHash 3.0. It was supposed to match, but we didn't catch a sign bug with + * the result that it doesn't. Unfortunately, we can't change it now without breaking Murmur3Partitioner. * + * * * The C version of MurmurHash 2.0 found at that site was ported to Java by * Andrzej Bialecki (ab at getopt org).
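The kind of sign bug mentioned in the commit comment — a Java port accidentally sign-extending bytes while packing them into a 64-bit block — can be demonstrated generically. This is not the actual Cassandra MurmurHash code, just a minimal illustration of why signed versus unsigned byte packing diverges once a byte's high bit is set:

```python
def to_signed_byte(b):
    # Java bytes are signed: values >= 0x80 read back as negatives.
    return b - 256 if b >= 0x80 else b

def block_unsigned(buf):
    # Correct packing: mask each byte to 8 unsigned bits first.
    v = 0
    for i, b in enumerate(buf):
        v |= (b & 0xFF) << (8 * i)
    return v

def block_signed(buf):
    # Buggy packing: a sign-extended byte ORs 1-bits into every higher
    # position, silently corrupting the assembled 64-bit block.
    v = 0
    for i, b in enumerate(buf):
        v |= to_signed_byte(b) << (8 * i)
    return v & 0xFFFFFFFFFFFFFFFF
```

The two variants agree whenever no byte has its high bit set, which is how such a bug can survive testing; and once the divergent hashes have been used to place data (as with Murmur3Partitioner tokens), the buggy behavior must be kept for compatibility, exactly as the comment above explains.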
[jira] [Commented] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500122#comment-14500122 ] Benedict commented on CASSANDRA-8718: - This is most likely a resource cleanup issue, with accessing offheap memory that has been freed as you suggest (the error printed in the stdout is more helpful, since it's clearly in the middle of the offheap binarySearch). Which means most likely either a double decrement of refcounts, or not taking a reference somewhere. Since this is being thrown in Compaction, the latter is actually always true (ie we never take a separate reference), so I would suspect what is happening is cleanup releases references even if a compaction is operating on the sstables, or perhaps doesn't properly mark the sstable compacting first, or something along those lines. > nodetool cleanup causes segfault > > > Key: CASSANDRA-8718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 > Project: Cassandra > Issue Type: Bug >Reporter: Maxim Ivanov >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.0.15 > > Attachments: java_hs_err.log > > > When doing cleanup on C* 2.0.12 following error crashes the java process: > {code} > INFO 17:59:02,800 Cleaning up > SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 > # > # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build > 1.7.0_71-b14) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # J 2655 C2 > org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I > (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /var/lib/cassandra_prod/hs_err_pid28039.log > Compiled method (c2) 913167265 4849 > org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) > total in heap [0x7f7508572450,0x7f7508573318] = 3784 > relocation [0x7f7508572570,0x7f7508572618] = 168 > main code [0x7f7508572620,0x7f7508572cc0] = 1696 > stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 > oops [0x7f7508572cf8,0x7f7508572d90] = 152 > scopes data[0x7f7508572d90,0x7f7508573118] = 904 > scopes pcs [0x7f7508573118,0x7f7508573268] = 336 > dependencies [0x7f7508573268,0x7f7508573280] = 24 > handler table [0x7f7508573280,0x7f75085732e0] = 96 > nul chk table [0x7f75085732e0,0x7f7508573318] = 56 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
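The failure mode hypothesized above — a reader touching off-heap memory after its last reference has been released, whether by a double decrement or a missing acquire — can be sketched with a toy reference count. All names here are illustrative; real off-heap management would back this with native memory rather than a bytearray:

```python
class OffHeapRegion:
    """Toy refcounted resource: memory is 'freed' when refs hit zero."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._refs = 1  # creator holds the initial reference

    def ref(self):
        # Taking a new reference on freed memory is itself a bug.
        assert self._data is not None, "resurrecting a freed region"
        self._refs += 1

    def unref(self):
        assert self._refs > 0, "double unref"
        self._refs -= 1
        if self._refs == 0:
            self._data = None  # free the backing memory

    def read(self, i):
        # Reading after free is the SIGSEGV scenario from the report;
        # here it fails loudly instead of crashing the process.
        if self._data is None:
            raise RuntimeError("use after free")
        return self._data[i]
```

In the scenario Benedict describes, cleanup would play the role of the extra `unref()`: if it releases references while a compaction is still reading the sstable, the compaction's next read lands on freed memory.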
[jira] [Commented] (CASSANDRA-8766) SSTableRewriter opens all sstables as early before completing the compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500096#comment-14500096 ] Joshua McKenzie commented on CASSANDRA-8766: I believe we resolved this as part of CASSANDRA-8535: {code:title=switchWriter} // If early re-open is disabled, simply finalize the writer and store it if (preemptiveOpenInterval == Long.MAX_VALUE) { SSTableReader reader = writer.finish(SSTableWriter.FinishType.NORMAL, maxAge, -1); finishedReaders.add(reader); } {code} > SSTableRewriter opens all sstables as early before completing the compaction > > > Key: CASSANDRA-8766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8766 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.1.5 > > > In CASSANDRA-8320, we made the rewriter call switchWriter() inside of > finish(); in CASSANDRA-8124 was made switchWriter() open its data as EARLY. > This combination means we no longer honour disabling of early opening, which > is potentially a problem on windows for the deletion of the contents (which > is why we disable early opening on Windows). > I've commented on CASSANDRA-8124, as I suspect I'm missing something about > this. Although I have no doubt the old behaviour of opening as TMP file > reduced the window for problems, and opening as TMPLINK now does the same, > it's not entirely clear to me its the right fix (though it may be) since we > shouldn't be susceptible to this window anyway? Either way, we perhaps need > to come up with something else, because this could potentially break windows > support. Perhaps if we simply did not swap in the TMPLINK file so that it > never actually get mapped, it would perhaps be enough. [~JoshuaMcKenzie], > WDYT? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
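The quoted switchWriter snippet boils down to one decision: when early re-open is disabled (the interval left at Long.MAX_VALUE), finalize the writer normally instead of opening the in-progress sstable EARLY. A minimal sketch of that decision, with a stand-in writer (names are illustrative, not Cassandra's API):

```python
LONG_MAX = 2**63 - 1  # Long.MAX_VALUE: sentinel for "early open disabled"

def switch_writer(writer, preemptive_open_interval):
    # Mirrors the quoted switchWriter() logic: with early re-open
    # disabled the writer is finished normally; otherwise the partially
    # written sstable is opened EARLY so readers can migrate onto it.
    if preemptive_open_interval == LONG_MAX:
        return writer.finish()
    return writer.open_early()

class FakeWriter:
    """Stand-in for SSTableWriter, returning which path was taken."""
    def finish(self):
        return "NORMAL"
    def open_early(self):
        return "EARLY"
```

This is why the CASSANDRA-8535 change resolves the issue: platforms that must disable early opening (e.g. Windows, because of file deletion semantics) never reach the EARLY path.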
[jira] [Assigned] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie reassigned CASSANDRA-8718: -- Assignee: Joshua McKenzie > nodetool cleanup causes segfault > > > Key: CASSANDRA-8718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 > Project: Cassandra > Issue Type: Bug >Reporter: Maxim Ivanov >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.0.15 > > Attachments: java_hs_err.log > > > When doing cleanup on C* 2.0.12 following error crashes the java process: > {code} > INFO 17:59:02,800 Cleaning up > SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 > # > # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build > 1.7.0_71-b14) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # J 2655 C2 > org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I > (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /var/lib/cassandra_prod/hs_err_pid28039.log > Compiled method (c2) 913167265 4849 > org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) > total in heap [0x7f7508572450,0x7f7508573318] = 3784 > relocation [0x7f7508572570,0x7f7508572618] = 168 > main code [0x7f7508572620,0x7f7508572cc0] = 1696 > stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 > oops [0x7f7508572cf8,0x7f7508572d90] = 152 > scopes data[0x7f7508572d90,0x7f7508573118] = 904 > scopes pcs [0x7f7508573118,0x7f7508573268] = 336 > dependencies [0x7f7508573268,0x7f7508573280] = 24 > handler table [0x7f7508573280,0x7f75085732e0] = 96 > nul chk table [0x7f75085732e0,0x7f7508573318] = 56 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500089#comment-14500089 ] Joshua McKenzie commented on CASSANDRA-8718: [~philipthompson] Looks like the crash occurs while getting information for our index scan position, likely during access of off-heap memory since the rest of getIndexScanPosition is pretty innocuous. I believe a full memory dump would be necessary to get more visibility into what's gone wrong, though JDK crash dumps aren't my forte ([~benedict] - care to sanity check?) [~srspnda] [~rossmohax]: Were either of you able to get more information about this error? Updates to JDK / C* have any impact on this error presenting? > nodetool cleanup causes segfault > > > Key: CASSANDRA-8718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 > Project: Cassandra > Issue Type: Bug >Reporter: Maxim Ivanov >Priority: Minor > Fix For: 2.0.15 > > Attachments: java_hs_err.log > > > When doing cleanup on C* 2.0.12 following error crashes the java process: > {code} > INFO 17:59:02,800 Cleaning up > SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 > # > # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build > 1.7.0_71-b14) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # J 2655 C2 > org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I > (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /var/lib/cassandra_prod/hs_err_pid28039.log > Compiled method (c2) 913167265 4849 > org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) > total in heap [0x7f7508572450,0x7f7508573318] = 3784 > relocation [0x7f7508572570,0x7f7508572618] = 168 > main code [0x7f7508572620,0x7f7508572cc0] = 1696 > stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 > oops [0x7f7508572cf8,0x7f7508572d90] = 152 > scopes data[0x7f7508572d90,0x7f7508573118] = 904 > scopes pcs [0x7f7508573118,0x7f7508573268] = 336 > dependencies [0x7f7508573268,0x7f7508573280] = 24 > handler table [0x7f7508573280,0x7f75085732e0] = 96 > nul chk table [0x7f75085732e0,0x7f7508573318] = 56 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9208) Setting rpc_interface in cassandra.yaml causes NPE during startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandeep More updated CASSANDRA-9208: Description: In the cassandra.yaml file when "rpc_interface" option is set it causes a NPE (stack trace at the end). Upon further investigation it turns out that there is a serious problem is in the way this logic is handled in the code DatabaseDescriptor.java (#374). Following is the code snippet else if (conf.rpc_interface != null) { listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, "rpc_interface"); } else { rpcAddress = FBUtilities.getLocalAddress(); } If you notice, 1) The code above sets the "listenAddress" instead of "rpcAddress". 2) The function getNetworkInterfaceAddress() blindly assumes that this is called to set the "listenAddress" (see line 171). The "configName" variable passed to the function is royally ignored and only used for printing out exception (which again is misleading) I am also attaching a suggested patch (NOTE: the patch tries to address this issue, the function getNetworkInterfaceAddress() needs revision ). INFO 15:36:56 Windows environment detected. 
DiskAccessMode set to standard, indexAccessMode standard INFO 15:36:56 Global memtable on-heap threshold is enabled at 503MB INFO 15:36:56 Global memtable off-heap threshold is enabled at 503MB ERROR 15:37:50 Fatal error during configuration loading java.lang.NullPointerException: null at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:411) ~[apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:133) ~[apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:164) [apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622) [apache-cassandra-2.1.4.jar:2.1.4] null Fatal error during configuration loading; unable to start. See log for stacktrace. was: In the cassandra.yaml file when "rpc_interface" option is set it causes a NPE. Upon further investigation it turns out that there is a serious problem is in the way this logic is handled in the code DatabaseDescriptor.java (#374). Following is the code snippet else if (conf.rpc_interface != null) { listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, "rpc_interface"); } else { rpcAddress = FBUtilities.getLocalAddress(); } If you notice, 1) The code above sets the "listenAddress" instead of "rpcAddress". 2) The function getNetworkInterfaceAddress() blindly assumes that this is called to set the "listenAddress" (see line 171). The "configName" variable passed to the function is royally ignored and only used for printing out exception (which again is misleading) I am also attaching a suggested patch (NOTE: the patch tries to address this issue, the function getNetworkInterfaceAddress() needs revision ). 
> Setting rpc_interface in cassandra.yaml causes NPE during startup > - > > Key: CASSANDRA-9208 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9208 > Project: Cassandra > Issue Type: Bug > Components: Config > Environment: Windows and RHEL >Reporter: Sandeep More > Labels: easyfix, patch > Attachments: SuggestedDataBaseDescriptor.diff > > > In the cassandra.yaml file when "rpc_interface" option is set it causes a NPE > (stack trace at the end). > Upon further investigation it turns out that there is a serious problem is in > the way this logic is handled in the code DatabaseDescriptor.java (#374). > Following is the code snippet > else if (conf.rpc_interface != null) > { > listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, > "rpc_interface"); > } > else > { > rpcAddress = FBUtilities.getLocalAddress(); > } > If you notice, > 1) The code above sets the "listenAddress" instead of "rpcAddress". > 2) The function getNetworkInterfaceAddress() blindly assumes that this is > called to set the "listenAddress" (see line 171). The "configName" variable > passed to the function is royally ignored and only used for printing out > exception (which again is misleading) > I am also attaching a suggested patch (NOTE: the patch tries to address this > issue, the function getNetworkInterfaceAddress() needs revision ). > INFO 15:36:56 Windows environment detected. DiskAccessMode set to standard,
[jira] [Created] (CASSANDRA-9208) Setting rpc_interface in cassandra.yaml causes NPE during startup
Sandeep More created CASSANDRA-9208: --- Summary: Setting rpc_interface in cassandra.yaml causes NPE during startup Key: CASSANDRA-9208 URL: https://issues.apache.org/jira/browse/CASSANDRA-9208 Project: Cassandra Issue Type: Bug Components: Config Environment: Windows and RHEL Reporter: Sandeep More Attachments: SuggestedDataBaseDescriptor.diff In the cassandra.yaml file, when the "rpc_interface" option is set it causes an NPE. Upon further investigation it turns out that there is a serious problem in the way this logic is handled in DatabaseDescriptor.java (#374). Following is the code snippet: else if (conf.rpc_interface != null) { listenAddress = getNetworkInterfaceAddress(conf.rpc_interface, "rpc_interface"); } else { rpcAddress = FBUtilities.getLocalAddress(); } If you notice: 1) The code above sets "listenAddress" instead of "rpcAddress". 2) The function getNetworkInterfaceAddress() blindly assumes that it is called to set the "listenAddress" (see line 171). The "configName" variable passed to the function is ignored and only used when printing out the exception (which, again, is misleading). I am also attaching a suggested patch (NOTE: the patch addresses this issue, but the function getNetworkInterfaceAddress() needs revision). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
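The fix the reporter suggests amounts to routing the interface lookup to the RPC address rather than the listen address. A minimal sketch of the corrected branch, with all names paraphrased (this is not the actual Cassandra code; the lookup functions are injected here for illustration):

```python
def apply_rpc_address(conf, get_interface_address, get_local_address):
    """Corrected branch from the report: the rpc_interface lookup feeds
    the RPC address, not listenAddress as in the buggy snippet."""
    if conf.get("rpc_interface") is not None:
        # Pass the config key name so error messages point at the
        # setting that was actually being resolved.
        return get_interface_address(conf["rpc_interface"], "rpc_interface")
    return get_local_address()
```

With the buggy version, rpcAddress stays null on the rpc_interface path, which is consistent with the NPE later in applyConfig shown in the stack trace.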
[jira] [Assigned] (CASSANDRA-9204) AssertionError in CompactionExecutor thread
[ https://issues.apache.org/jira/browse/CASSANDRA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-9204: --- Assignee: Benedict (was: Marcus Eriksson) > AssertionError in CompactionExecutor thread > --- > > Key: CASSANDRA-9204 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9204 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Philip Thompson >Assignee: Benedict > Fix For: 3.0 > > Attachments: node1.log > > > While running the dtest > {{upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.upgrade_test}}, > the test is failing due to a large number of exceptions in the logs related > to compaction. Here is a snippet of one. The full log is attached. These > exceptions occurred after upgrading to trunk. The cluster had already > upgraded 1.2 -> 2.0 -> 2.1 successfully. > {code} > ERROR [CompactionExecutor:2] 2015-04-16 12:05:11,747 CassandraDaemon.java: Exception in thread Thread[CompactionExecutor:2,1,main] > java.lang.AssertionError: null > at org.apache.cassandra.io.sstable.format.SSTableReader.setReplacedBy(SSTableReader.java:905) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finishAndMaybeThrow(SSTableRewriter.java:461) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:418) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:398) ~[main/:na] > at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.finish(DefaultCompactionWriter.java:77) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:202) ~[main/:na] > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[main/:na] > at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:58) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:371) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:280) ~[main/:na] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9204) AssertionError in CompactionExecutor thread
[ https://issues.apache.org/jira/browse/CASSANDRA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500072#comment-14500072 ] Benedict commented on CASSANDRA-9204: - I'll take the ticket, but defer it until we can get CASSANDRA-8948 and CASSANDRA-8568 in since, as you say, they change behaviour pretty substantially and very likely fix it. > AssertionError in CompactionExecutor thread > --- > > Key: CASSANDRA-9204 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9204 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Philip Thompson >Assignee: Benedict > Fix For: 3.0 > > Attachments: node1.log > > > While running the dtest > {{upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.upgrade_test}}, > the test is failing due to a large number of exceptions in the logs related > to compaction. Here is a snippet of one. The full log is attached. These > exceptions occurred after upgrading to trunk. The cluster had already > upgraded 1.2 -> 2.0 -> 2.1 successfully. 
> {code} > ERROR [CompactionExecutor:2] 2015-04-16 12:05:11,747 CassandraDaemon.java: Exception in thread Thread[CompactionExecutor:2,1,main] > java.lang.AssertionError: null > at org.apache.cassandra.io.sstable.format.SSTableReader.setReplacedBy(SSTableReader.java:905) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finishAndMaybeThrow(SSTableRewriter.java:461) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:418) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:398) ~[main/:na] > at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.finish(DefaultCompactionWriter.java:77) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:202) ~[main/:na] > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[main/:na] > at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:58) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:371) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:280) ~[main/:na] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9204) AssertionError in CompactionExecutor thread
[ https://issues.apache.org/jira/browse/CASSANDRA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500072#comment-14500072 ] Benedict edited comment on CASSANDRA-9204 at 4/17/15 3:46 PM: -- I'll take the ticket, but defer it until we can get CASSANDRA-8984 and CASSANDRA-8568 in since, as you say, they change behaviour pretty substantially and very likely fix it. was (Author: benedict): I'll take the ticket, but defer it until we can get CASSANDRA-8948 and CASSANDRA-8568 in since, as you say, they change behaviour pretty substantially and very likely fix it. > AssertionError in CompactionExecutor thread > --- > > Key: CASSANDRA-9204 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9204 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Philip Thompson >Assignee: Benedict > Fix For: 3.0 > > Attachments: node1.log > > > While running the dtest > {{upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.upgrade_test}}, > the test is failing due to a large number of exceptions in the logs related > to compaction. Here is a snippet of one. The full log is attached. These > exceptions occurred after upgrading to trunk. The cluster had already > upgraded 1.2 -> 2.0 -> 2.1 successfully. 
> {code} > ERROR [CompactionExecutor:2] 2015-04-16 12:05:11,747 CassandraDaemon.java: Exception in thread Thread[CompactionExecutor:2,1,main] > java.lang.AssertionError: null > at org.apache.cassandra.io.sstable.format.SSTableReader.setReplacedBy(SSTableReader.java:905) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finishAndMaybeThrow(SSTableRewriter.java:461) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:418) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:398) ~[main/:na] > at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.finish(DefaultCompactionWriter.java:77) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:202) ~[main/:na] > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[main/:na] > at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:58) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:371) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:280) ~[main/:na] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500067#comment-14500067 ] Brandon Williams commented on CASSANDRA-9206: - For testing convergence time, I recommend starting a large-ish cluster except for one node, then starting that node with join_ring=false. Once everything is settled, call nodetool join on the node and then examine the deltas on the logs between when the node joined and all the nodes saw it. > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9206) Remove seed gossip probability
[ https://issues.apache.org/jira/browse/CASSANDRA-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9206: Tester: Philip Thompson Fix Version/s: 2.1.5 > Remove seed gossip probability > -- > > Key: CASSANDRA-9206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9206 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 2.1.5 > > Attachments: 9206.txt > > > Currently, we use probability to determine whether a node will gossip with a > seed: > {noformat} > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > {noformat} > I propose that we remove this probability, and instead *always* gossip with a > seed. This of course means increased traffic and processing on the seed(s), > but even a 1000 node cluster with a single seed will only put ~1000 messages > per second on the seed, which is virtually nothing. Should it become a > problem, the solution is simple: add more seeds. Since seeds will also > always gossip with each other, this effectively gives us a poor man's > spanning tree, with the only cost being removing a few lines of code, and > should greatly improve our gossip convergence time, especially in large > clusters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
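A back-of-the-envelope check of the scale claim in CASSANDRA-9206 can be done in plain Python. This is an illustrative simulation, not Cassandra code; the round count and RNG seed are arbitrary choices. With one seed among 1000 live nodes, the quoted rule gives each node only about a 1-in-1000 chance per gossip round of contacting the seed, which is why convergence through seeds is slow:

```python
import random

def rounds_gossiping_with_seed(seeds, live, unreachable, rounds, rng):
    # Mirrors the quoted rule: in each gossip round, contact a seed with
    # probability seeds / (live + unreachable).
    probability = seeds / float(live + unreachable)
    return sum(1 for _ in range(rounds) if rng.random() <= probability)

# One seed, 1000 live nodes, 100,000 simulated rounds: expect roughly
# 100 seed contacts, i.e. only ~0.1% of rounds reach the seed.
hits = rounds_gossiping_with_seed(1, 1000, 0, 100_000, random.Random(42))
print(hits)
```

Removing the probability (always gossiping with a seed) turns that ~0.1% into 100%, i.e. about one message per node per gossip round arriving at the seed, which is the ~1000 messages per second figure cited in the ticket.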
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500058#comment-14500058 ] Robert Stupp commented on CASSANDRA-9131: - bq. problem with server-side timestamps is unsafe retry by clients in the event of a failure. got it > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. 
I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
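The interleaving reported above can be reproduced without a cluster. In this small Python sketch the sub-second offsets are invented to match the shape of the report, not the measured values: because the leap second replays 23:59:59, server-side timestamps from the two passes through that second sort into each other.

```python
# (value, offset-within-23:59:59) pairs; offsets are illustrative.
first_pass = [(579, 0.30), (580, 0.60), (581, 0.95)]  # first 23:59:59
leap_pass  = [(582, 0.28), (583, 0.59), (584, 0.89)]  # repeated 23:59:59
# Server-side timestamps cannot distinguish the two passes through the
# repeated second, so sorting by writetime interleaves the two batches:
by_writetime = [v for v, t in sorted(first_pass + leap_pass, key=lambda vw: vw[1])]
print(by_writetime)  # -> [582, 579, 583, 580, 584, 581], though insertion order was 579..584
```

This matches the {{value, writetime}} ordering shown in the ticket: the leap-second batch appears interleaved with, and partly ahead of, the batch written a full second earlier.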
[jira] [Commented] (CASSANDRA-7409) Allow multiple overlapping sstables in L1
[ https://issues.apache.org/jira/browse/CASSANDRA-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500053#comment-14500053 ] Alan Boudreault commented on CASSANDRA-7409: All scenarios with basic patterns have been run. Same URL as above. > Allow multiple overlapping sstables in L1 > - > > Key: CASSANDRA-7409 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7409 > Project: Cassandra > Issue Type: Improvement >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian > Labels: compaction > Fix For: 3.0 > > > Currently, when a normal L0 compaction takes place (not STCS), we take up to > MAX_COMPACTING_L0 L0 sstables and all of the overlapping L1 sstables and > compact them together. If we didn't have to deal with the overlapping L1 > tables, we could compact a higher number of L0 sstables together into a set > of non-overlapping L1 sstables. > This could be done by delaying the invariant that L1 has no overlapping > sstables. Going from L1 to L2, we would be compacting fewer sstables together > which overlap. > When reading, we will not have the same one sstable per level (except L0) > guarantee, but this can be bounded (once we have too many sets of sstables, > either compact them back into the same level, or compact them up to the next > level). > This could be generalized to allow any level to be the maximum for this > overlapping strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500011#comment-14500011 ] Benedict commented on CASSANDRA-9131: - The problem with server-side timestamps is unsafe retry by clients in the event of a failure. CASSANDRA-6106 is a reference point, in that this provided both a wrapper around microsecond resolution as well as a staggered application of shifts in system clock time (also ensuring it never went backwards). > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. 
> From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. 
These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8967) Allow RolesCache to be invalidated
[ https://issues.apache.org/jira/browse/CASSANDRA-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8967: Attachment: 8967.txt Patch to add JMX methods for this to RolesCache. > Allow RolesCache to be invalidated > -- > > Key: CASSANDRA-8967 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8967 > Project: Cassandra > Issue Type: New Feature >Reporter: Brandon Williams >Assignee: Brandon Williams > Fix For: 3.0 > > Attachments: 8967.txt > > > Much like CASSANDRA-8722, we should add this to RolesCache as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1453#comment-1453 ] Robert Stupp commented on CASSANDRA-9131: - Hm. I'm not convinced by client-side timestamps. IMO maintaining timestamps on the clients (whether these are "fat clients" or other servers) just moves the problem to an environment that might not be careful about a correct system wall clock (e.g. via NTP). I've seen operations teams handle Windows and Linux environments completely separately - with both worlds showing a constant time drift of several minutes (not funny). I'm not completely against client-provided timestamps - but I would prefer to make that an optional feature (i.e. move {{TIMESTAMP xx}} to the protocol). TL;DR, just want to throw in an idea: we could encapsulate {{System.currentTimeMillis()}} - if we detect that the clock went backwards, we slow down "our system clock", and vice versa if the clock moves forward. > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. 
> From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. 
These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
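The "clock that never goes backwards" idea floated in the comments can be sketched with a wrapper that clamps the wall clock. This is a hypothetical illustration - the MonotonicMicros name and API are invented here, it is not Cassandra's implementation, and it only guarantees monotonicity within one process, not across replicas:

```python
import time

class MonotonicMicros:
    """Issue strictly increasing microsecond timestamps even if the wall
    clock repeats a second (leap second) or steps backwards (NTP)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable for testing
        self._last = 0

    def next_timestamp(self):
        now = int(self._clock() * 1_000_000)
        if now <= self._last:
            # Clock repeated or went backwards: advance by one microsecond
            # instead of reusing or regressing a timestamp.
            now = self._last + 1
        self._last = now
        return now

# Simulate a leap second: the wall clock replays part of 23:59:59.
fake_ticks = iter([59.2, 59.6, 59.9, 59.1, 59.5, 60.4])
src = MonotonicMicros(clock=lambda: next(fake_ticks))
stamps = [src.next_timestamp() for _ in range(6)]
print(stamps)
```

Bumping by one microsecond on regression keeps timestamps unique and ordered on a single server, but as the discussion notes, it cannot repair ordering for a sequence of inserts spread across multiple servers whose clocks each replay the leap second independently.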
[jira] [Commented] (CASSANDRA-8766) SSTableRewriter opens all sstables as early before completing the compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1454#comment-1454 ] Benedict commented on CASSANDRA-8766: - I think this might be superseded by CASSANDRA-7066. So we should perhaps defer looking at this until that's in. > SSTableRewriter opens all sstables as early before completing the compaction > > > Key: CASSANDRA-8766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8766 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.1.5 > > > In CASSANDRA-8320, we made the rewriter call switchWriter() inside of > finish(); CASSANDRA-8124 made switchWriter() open its data as EARLY. > This combination means we no longer honour disabling of early opening, which > is potentially a problem on windows for the deletion of the contents (which > is why we disable early opening on Windows). > I've commented on CASSANDRA-8124, as I suspect I'm missing something about > this. Although I have no doubt the old behaviour of opening as TMP file > reduced the window for problems, and opening as TMPLINK now does the same, > it's not entirely clear to me it's the right fix (though it may be) since we > shouldn't be susceptible to this window anyway? Either way, we perhaps need > to come up with something else, because this could potentially break windows > support. Perhaps if we simply did not swap in the TMPLINK file so that it > never actually get mapped, it would perhaps be enough. [~JoshuaMcKenzie], > WDYT? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8766) SSTableRewriter opens all sstables as early before completing the compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8766: --- Assignee: Joshua McKenzie > SSTableRewriter opens all sstables as early before completing the compaction > > > Key: CASSANDRA-8766 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8766 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.1.5 > > > In CASSANDRA-8320, we made the rewriter call switchWriter() inside of > finish(); in CASSANDRA-8124 was made switchWriter() open its data as EARLY. > This combination means we no longer honour disabling of early opening, which > is potentially a problem on windows for the deletion of the contents (which > is why we disable early opening on Windows). > I've commented on CASSANDRA-8124, as I suspect I'm missing something about > this. Although I have no doubt the old behaviour of opening as TMP file > reduced the window for problems, and opening as TMPLINK now does the same, > it's not entirely clear to me its the right fix (though it may be) since we > shouldn't be susceptible to this window anyway? Either way, we perhaps need > to come up with something else, because this could potentially break windows > support. Perhaps if we simply did not swap in the TMPLINK file so that it > never actually get mapped, it would perhaps be enough. [~JoshuaMcKenzie], > WDYT? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9204) AssertionError in CompactionExecutor thread
[ https://issues.apache.org/jira/browse/CASSANDRA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499968#comment-14499968 ] Marcus Eriksson commented on CASSANDRA-9204: [~benedict] could you have a look? Guessing this will be fixed by CASSANDRA-8568 > AssertionError in CompactionExecutor thread > --- > > Key: CASSANDRA-9204 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9204 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Philip Thompson >Assignee: Marcus Eriksson > Fix For: 3.0 > > Attachments: node1.log > > > While running the dtest > {{upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.upgrade_test}}, > the test is failing due to a large number of exceptions in the logs related > to compaction. Here is a snippet of one. The full log is attached. These > exceptions occurred after upgrading to trunk. The cluster had already > upgraded 1.2 -> 2.0 -> 2.1 successfully. > {code} > ERROR [CompactionExecutor:2] 2015-04-16 12:05:11,747 CassandraDaemon.java: Exception in thread Thread[CompactionExecutor:2,1,main] > java.lang.AssertionError: null > at org.apache.cassandra.io.sstable.format.SSTableReader.setReplacedBy(SSTableReader.java:905) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finishAndMaybeThrow(SSTableRewriter.java:461) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:418) ~[main/:na] > at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:398) ~[main/:na] > at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.finish(DefaultCompactionWriter.java:77) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:202) ~[main/:na] > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[main/:na] > at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:58) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:371) ~[main/:na] > at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:280) ~[main/:na] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499963#comment-14499963 ] Benedict commented on CASSANDRA-9131: - I can't recall entirely, but I don't think that was quite the way the conversation resolved. It's not exactly "expected" behaviour on either side, but obviously there's nothing we can do about clients that are subjected to this bug, and we intend to deprecate server-side timestamps (if perhaps never eliminate them entirely). So the question is # if we should fix server side timestamps by making the clock universally monotonically increasing (as opposed to only per-client connection) which would at least somewhat mitigate this problem (but not eliminate it entirely, for any sequence of inserts hitting multiple servers) # if we should update the client protocol spec/docs to make clear that this problem exists, and that clients are expected to work around it ## if we should offer some sample code, and work with the Java Driver team to ensure this problem doesn't affect it > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. 
> From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. 
These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
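The server-side mitigation raised as option 1 in the comment above, a clock that never moves backwards process-wide, can be sketched roughly as follows. This is a minimal illustration with invented names, not Cassandra's actual implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch (invented names, not Cassandra code) of a process-wide
// monotonic timestamp source. If the wall clock steps backwards -- e.g.
// across an inserted leap second -- the generator returns last + 1
// instead of the raw, lower reading, preserving insertion order.
final class MonotonicClock
{
    private static final AtomicLong lastMicros = new AtomicLong();

    static long nextTimestampMicros()
    {
        while (true)
        {
            long now = System.currentTimeMillis() * 1000;
            long last = lastMicros.get();
            long next = now > last ? now : last + 1;
            if (lastMicros.compareAndSet(last, next))
                return next;
        }
    }
}
```

The trade-off: during the repeated second, timestamps advance by one microsecond per call rather than tracking real time, which restores ordering at the cost of accuracy. This also only helps per server, not across a sequence of inserts hitting multiple servers.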
[jira] [Commented] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499960#comment-14499960 ] Ariel Weisberg commented on CASSANDRA-8723: --- [~jeffl] If this is something that you are reproducing regularly in a test environment it would help to get a heap dump some time before the process dies. Maybe run a script in the background that checks whether CassandraDaemon is running and dumps the heap every few minutes. We can look at what native allocations exist via the heap dump since we wrap them all with POJOs. > Cassandra 2.1.2 Memory issue - java process memory usage continuously > increases until process is killed by OOM killer > - > > Key: CASSANDRA-8723 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 > Project: Cassandra > Issue Type: Bug >Reporter: Jeff Liu > Fix For: 2.1.5 > > Attachments: cassandra.yaml > > > Issue: > We have an on-going issue with cassandra nodes running with continuously > increasing memory until killed by OOM. > {noformat} > Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill > process 13919 (java) score 911 or sacrifice child > Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 > (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB > {noformat} > System Profile: > cassandra version 2.1.2 > system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
> cassandra jvm: > -Xms1792M -Xmx1792M -Xmn400M -Xss256k > {noformat} > java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M > -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled > -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 > -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly > -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark > -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime > -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M > -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.authenticate=false > -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 > -Dlogback.configurationFile=logback.xml > -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= > -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp > 
/etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: > -XX:HeapDumpPath=/var/lib/cassandra/java_1421511
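The periodic heap-dump idea suggested above can also be driven through the HotSpot diagnostic MBean rather than an external tool. A hedged sketch (the class name and the idea of invoking it on a timer are illustrative, not Cassandra code; the API is HotSpot-only):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import com.sun.management.HotSpotDiagnosticMXBean;

// Hedged sketch: trigger a heap dump of the running JVM via the HotSpot
// diagnostic MBean. A watcher would call this (or run the equivalent
// `jmap -dump:live,file=...` against the Cassandra pid) every few minutes
// while the daemon is still alive.
final class HeapDumper
{
    static void dumpHeap(String path) throws Exception
    {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                server, "com.sun.management:type=HotSpotDiagnostic", HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, true); // true = dump only live objects
    }
}
```

Note that recent JDKs require the output path to end in .hprof, and the target file must not already exist.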
cassandra git commit: Re-add cold_reads_to_omit param for backwards compatibility
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 df014036b -> 5d88ff4e4 Re-add cold_reads_to_omit param for backwards compatibility Patch by Tommy Stendahl; reviewed by marcuse for CASSANDRA-9203 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5d88ff4e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5d88ff4e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5d88ff4e Branch: refs/heads/cassandra-2.1 Commit: 5d88ff4e41210f95d0e3e53ded779765b0136c2a Parents: df01403 Author: Tommy Stendahl Authored: Fri Apr 17 16:47:33 2015 +0200 Committer: Marcus Eriksson Committed: Fri Apr 17 16:47:33 2015 +0200 -- CHANGES.txt | 1 + .../db/compaction/SizeTieredCompactionStrategyOptions.java| 3 +++ 2 files changed, 4 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d88ff4e/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 74ec921..80ab11c 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.5 + * Re-add deprecated cold_reads_to_omit param for backwards compat (CASSANDRA-9203) * Make anticompaction visible in compactionstats (CASSANDRA-9098) * Improve nodetool getendpoints documentation about the partition key parameter (CASSANDRA-6458) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d88ff4e/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java index 911bb9f..9a840e1 100644 --- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java +++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java @@ -29,6 +29,8 @@ public final class SizeTieredCompactionStrategyOptions protected static final String MIN_SSTABLE_SIZE_KEY = "min_sstable_size"; protected 
static final String BUCKET_LOW_KEY = "bucket_low"; protected static final String BUCKET_HIGH_KEY = "bucket_high"; +@Deprecated +protected static final String COLD_READS_TO_OMIT_KEY = "cold_reads_to_omit"; protected long minSSTableSize; protected double bucketLow; @@ -91,6 +93,7 @@ public final class SizeTieredCompactionStrategyOptions uncheckedOptions.remove(MIN_SSTABLE_SIZE_KEY); uncheckedOptions.remove(BUCKET_LOW_KEY); uncheckedOptions.remove(BUCKET_HIGH_KEY); +uncheckedOptions.remove(COLD_READS_TO_OMIT_KEY); return uncheckedOptions; }
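For context on why re-adding the key matters: the validation pattern the patch touches strips every recognized option key and treats anything left over as unknown, so a deprecated-but-present cold_reads_to_omit must still be stripped for old schemas to validate. A rough illustration of that pattern (not the actual SizeTieredCompactionStrategyOptions code):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the option-validation pattern: every recognized
// key is removed from a working copy, and whatever remains is reported as
// an unknown/invalid option. Stripping the deprecated cold_reads_to_omit
// key is what keeps pre-existing schemas that still set it from failing.
final class OptionsCheck
{
    static Map<String, String> unknownOptions(Map<String, String> options)
    {
        Map<String, String> unchecked = new HashMap<>(options);
        unchecked.remove("min_sstable_size");
        unchecked.remove("bucket_low");
        unchecked.remove("bucket_high");
        unchecked.remove("cold_reads_to_omit"); // deprecated: ignored but tolerated
        return unchecked; // anything left here would be rejected
    }
}
```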
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0f72f79d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0f72f79d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0f72f79d Branch: refs/heads/trunk Commit: 0f72f79d5f9ed54cb9b9e33d371f4f13eae21dca Parents: 4adf29d 5d88ff4 Author: Marcus Eriksson Authored: Fri Apr 17 16:49:07 2015 +0200 Committer: Marcus Eriksson Committed: Fri Apr 17 16:49:07 2015 +0200 -- --
[1/2] cassandra git commit: Re-add cold_reads_to_omit param for backwards compatibility
Repository: cassandra Updated Branches: refs/heads/trunk 4adf29d4e -> 0f72f79d5 Re-add cold_reads_to_omit param for backwards compatibility Patch by Tommy Stendahl; reviewed by marcuse for CASSANDRA-9203 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5d88ff4e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5d88ff4e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5d88ff4e Branch: refs/heads/trunk Commit: 5d88ff4e41210f95d0e3e53ded779765b0136c2a Parents: df01403 Author: Tommy Stendahl Authored: Fri Apr 17 16:47:33 2015 +0200 Committer: Marcus Eriksson Committed: Fri Apr 17 16:47:33 2015 +0200 -- CHANGES.txt | 1 + .../db/compaction/SizeTieredCompactionStrategyOptions.java| 3 +++ 2 files changed, 4 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d88ff4e/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 74ec921..80ab11c 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.5 + * Re-add deprecated cold_reads_to_omit param for backwards compat (CASSANDRA-9203) * Make anticompaction visible in compactionstats (CASSANDRA-9098) * Improve nodetool getendpoints documentation about the partition key parameter (CASSANDRA-6458) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d88ff4e/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java index 911bb9f..9a840e1 100644 --- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java +++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategyOptions.java @@ -29,6 +29,8 @@ public final class SizeTieredCompactionStrategyOptions protected static final String MIN_SSTABLE_SIZE_KEY = "min_sstable_size"; protected static final 
String BUCKET_LOW_KEY = "bucket_low"; protected static final String BUCKET_HIGH_KEY = "bucket_high"; +@Deprecated +protected static final String COLD_READS_TO_OMIT_KEY = "cold_reads_to_omit"; protected long minSSTableSize; protected double bucketLow; @@ -91,6 +93,7 @@ public final class SizeTieredCompactionStrategyOptions uncheckedOptions.remove(MIN_SSTABLE_SIZE_KEY); uncheckedOptions.remove(BUCKET_LOW_KEY); uncheckedOptions.remove(BUCKET_HIGH_KEY); +uncheckedOptions.remove(COLD_READS_TO_OMIT_KEY); return uncheckedOptions; }
[jira] [Commented] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499932#comment-14499932 ] Philip Thompson commented on CASSANDRA-8718: We've tested cleanup on both of those JDKs, and have not encountered a similar issue. [~JoshuaMcKenzie], does the attached log give you any info that may help us reproduce the issue? > nodetool cleanup causes segfault > > > Key: CASSANDRA-8718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 > Project: Cassandra > Issue Type: Bug >Reporter: Maxim Ivanov >Priority: Minor > Fix For: 2.0.15 > > Attachments: java_hs_err.log > > > When doing cleanup on C* 2.0.12 following error crashes the java process: > {code} > INFO 17:59:02,800 Cleaning up > SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 > # > # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build > 1.7.0_71-b14) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # J 2655 C2 > org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I > (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /var/lib/cassandra_prod/hs_err_pid28039.log > Compiled method (c2) 913167265 4849 > org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) > total in heap [0x7f7508572450,0x7f7508573318] = 3784 > relocation [0x7f7508572570,0x7f7508572618] = 168 > main code [0x7f7508572620,0x7f7508572cc0] = 1696 > stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 > oops [0x7f7508572cf8,0x7f7508572d90] = 152 > scopes data[0x7f7508572d90,0x7f7508573118] = 904 > scopes pcs [0x7f7508573118,0x7f7508573268] = 336 > dependencies [0x7f7508573268,0x7f7508573280] = 24 > handler table [0x7f7508573280,0x7f75085732e0] = 96 > nul chk table [0x7f75085732e0,0x7f7508573318] = 56 > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8795) Cassandra (possibly under load) occasionally throws an exception during CQL create table
[ https://issues.apache.org/jira/browse/CASSANDRA-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499928#comment-14499928 ] Benedict commented on CASSANDRA-8795: - [~philipthompson] good question. [~iamaleksey]? You're the closest to a "domain expert" on schema stuff that I can think of. Do these things fall under the purview of "CQL" [~slebresne]? [~driftx] maybe? This stuff all predates me, so I'm just throwing up the bat signal to the old timers really. > Cassandra (possibly under load) occasionally throws an exception during CQL > create table > > > Key: CASSANDRA-8795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8795 > Project: Cassandra > Issue Type: Bug >Reporter: Darren Warner > Fix For: 2.1.5 > > > CQLSH will return the following: > {code} > { name: 'ResponseError', > message: 'java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.NullPointerException', > info: 'Represents an error message from the server', > code: 0, > query: 'CREATE TABLE IF NOT EXISTS roles_by_users( userid TIMEUUID, role > INT, entityid TIMEUUID, entity_type TEXT, enabled BOOLEAN, PRIMARY KEY > (userid, role, entityid, entity_type) );' } > {code} > Cassandra system.log shows: > {code} > ERROR [MigrationStage:1] 2015-02-11 14:38:48,610 CassandraDaemon.java:153 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.db.DefsTables.addColumnFamily(DefsTables.java:371) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:293) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:194) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:166) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > 
org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:393) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_31] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_31] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_31] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_31] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31] > ERROR [SharedPool-Worker-2] 2015-02-11 14:38:48,620 QueryMessage.java:132 - > Unexpected error during query > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.NullPointerException > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:398) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:374) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.service.MigrationManager.announceNewColumnFamily(MigrationManager.java:249) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.statements.CreateTableStatement.announceMigration(CreateTableStatement.java:113) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:80) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) > [apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) > [apache-cassandra-2.1.2.jar:2.1.2] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) > [
[jira] [Commented] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499929#comment-14499929 ] Philip Thompson commented on CASSANDRA-8723: [~jeffl], have you had the opportunity to try 2.1.3 yet? What was 2.1.4 will now be 2.1.5, which has a tentative release tag and should be out in approximately a week. If upgrading fixes your issue, please let us know. > Cassandra 2.1.2 Memory issue - java process memory usage continuously > increases until process is killed by OOM killer > - > > Key: CASSANDRA-8723 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 > Project: Cassandra > Issue Type: Bug >Reporter: Jeff Liu > Fix For: 2.1.5 > > Attachments: cassandra.yaml > > > Issue: > We have an on-going issue with cassandra nodes running with continuously > increasing memory until killed by OOM. > {noformat} > Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill > process 13919 (java) score 911 or sacrifice child > Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 > (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB > {noformat} > System Profile: > cassandra version 2.1.2 > system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
> cassandra jvm: > -Xms1792M -Xmx1792M -Xmn400M -Xss256k > {noformat} > java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M > -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled > -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 > -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly > -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark > -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime > -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M > -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.authenticate=false > -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 > -Dlogback.configurationFile=logback.xml > -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= > -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp > 
/etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: > -XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof > 
-XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log > org.apache.cassandra.service.CassandraDaemon > {noformat} -- This message was sent
[jira] [Updated] (CASSANDRA-8741) Running a drain before a decommission apparently the wrong thing to do
[ https://issues.apache.org/jira/browse/CASSANDRA-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8741: --- Fix Version/s: 2.1.5 2.0.15 > Running a drain before a decommission apparently the wrong thing to do > -- > > Key: CASSANDRA-8741 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8741 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Ubuntu 14.04; Cassandra 2.0.11.82 (Datastax Enterprise > 4.5.3) >Reporter: Casey Marshall >Priority: Trivial > Labels: lhf > Fix For: 2.0.15, 2.1.5 > > > This might simply be a documentation issue. It appears that running "nodetool > drain" is a very wrong thing to do before running a "nodetool decommission". > The idea was that I was going to safely shut off writes and flush everything > to disk before beginning the decommission. What happens is the "decommission" > call appears to fail very early on after starting, and afterwards, the node > in question is stuck in state LEAVING, but all other nodes in the ring see > that node as NORMAL, but down. No streams are ever sent from the node being > decommissioned to other nodes. > The drain command does indeed shut down the "BatchlogTasks" executor > (org/apache/cassandra/service/StorageService.java, line 3445 in git tag > "cassandra-2.0.11") but the decommission process tries using that executor > when calling the "startBatchlogReplay" function > (org/apache/cassandra/db/BatchlogManager.java, line 123) called through > org.apache.cassandra.service.StorageService.unbootstrap (see the stack trace > pasted below). > This also failed in a similar way on Cassandra 1.2.13-ish (DSE 3.2.4). > So, either something is wrong with the drain/decommission commands, or it's > very wrong to run a drain before a decommission. What's worse, there seems to > be no way to recover this node once it is in this state; you need to shut it > down and run "removenode". 
> My terminal output: > {code} > ubuntu@x:~$ nodetool drain > ubuntu@x:~$ tail /var/log/^C > ubuntu@x:~$ nodetool decommission > Exception in thread "main" java.util.concurrent.RejectedExecutionException: > Task > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@3008fa33 > rejected from > org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1d6242e8[Terminated, > pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 52] > at > java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) > at > java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) > at > java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325) > at > java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530) > at > java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:629) > at > org.apache.cassandra.db.BatchlogManager.startBatchlogReplay(BatchlogManager.java:123) > at > org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2966) > at > org.apache.cassandra.service.StorageService.decommission(StorageService.java:2934) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) > at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) > at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) > at > javax.management.remote.rmi.RMIConnectionImpl.doO
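The failure mode in the stack trace above is easy to reproduce outside Cassandra: once a ScheduledThreadPoolExecutor has been shut down, any further submission is rejected, which is exactly what decommission hits after drain has terminated the batchlog executor. A minimal standalone illustration (not Cassandra code):

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal illustration: scheduling a task on an executor that has already
// been shut down throws RejectedExecutionException (the default
// AbortPolicy), the same exception decommission raises after drain has
// shut down the BatchlogTasks executor.
final class DrainDemo
{
    static boolean submitAfterShutdownRejected()
    {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.shutdown();
        try
        {
            executor.schedule(() -> {}, 0, TimeUnit.MILLISECONDS);
            return false; // would mean the submission was accepted
        }
        catch (RejectedExecutionException expected)
        {
            return true;
        }
    }
}
```

A fix would presumably either avoid terminating that executor during drain or guard the batchlog replay call against a drained state; either way, the sketch shows why the drain-then-decommission sequence fails as reported.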
[jira] [Updated] (CASSANDRA-8741) Running a drain before a decommission apparently the wrong thing to do
[ https://issues.apache.org/jira/browse/CASSANDRA-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8741: --- Description: This might simply be a documentation issue. It appears that running "nodetool drain" is a very wrong thing to do before running a "nodetool decommission". The idea was that I was going to safely shut off writes and flush everything to disk before beginning the decommission. What happens is the "decommission" call appears to fail very early on after starting, and afterwards, the node in question is stuck in state LEAVING, but all other nodes in the ring see that node as NORMAL, but down. No streams are ever sent from the node being decommissioned to other nodes. The drain command does indeed shut down the "BatchlogTasks" executor (org/apache/cassandra/service/StorageService.java, line 3445 in git tag "cassandra-2.0.11") but the decommission process tries using that executor when calling the "startBatchlogReplay" function (org/apache/cassandra/db/BatchlogManager.java, line 123) called through org.apache.cassandra.service.StorageService.unbootstrap (see the stack trace pasted below). This also failed in a similar way on Cassandra 1.2.13-ish (DSE 3.2.4). So, either something is wrong with the drain/decommission commands, or it's very wrong to run a drain before a decommission. What's worse, there seems to be no way to recover this node once it is in this state; you need to shut it down and run "removenode". 
My terminal output: {code} ubuntu@x:~$ nodetool drain ubuntu@x:~$ tail /var/log/^C ubuntu@x:~$ nodetool decommission Exception in thread "main" java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@3008fa33 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1d6242e8[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 52] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325) at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530) at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:629) at org.apache.cassandra.db.BatchlogManager.startBatchlogReplay(BatchlogManager.java:123) at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2966) at org.apache.cassandra.service.StorageService.decommission(StorageService.java:2934) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java
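The failure mode in the trace above, a task submitted to an executor that drain already terminated, can be reproduced in miniature with any executor framework. A hedged Python analogue (the Java original throws RejectedExecutionException; Python's ThreadPoolExecutor raises RuntimeError instead):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the BatchlogTasks executor that "nodetool drain" shuts down.
pool = ThreadPoolExecutor(max_workers=1)
pool.shutdown()  # analogous to drain terminating the executor

# Stand-in for startBatchlogReplay submitting a replay task afterwards.
try:
    pool.submit(lambda: None)
    rejected = False
except RuntimeError:  # Java raises RejectedExecutionException here
    rejected = True

print(rejected)  # True: a terminated pool rejects all new tasks
```

This is why the order matters: once the pool is terminated there is no way to re-enable it short of restarting the process, which matches the "shut it down and run removenode" recovery described above.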
[jira] [Commented] (CASSANDRA-8795) Cassandra (possibly under load) occasionally throws an exception during CQL create table
[ https://issues.apache.org/jira/browse/CASSANDRA-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499923#comment-14499923 ] Philip Thompson commented on CASSANDRA-8795: [~benedict], who should handle fixing the problem in MigrationManager? > Cassandra (possibly under load) occasionally throws an exception during CQL > create table > > > Key: CASSANDRA-8795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8795 > Project: Cassandra > Issue Type: Bug >Reporter: Darren Warner > Fix For: 2.1.5 > > > CQLSH will return the following: > {code} > { name: 'ResponseError', > message: 'java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.NullPointerException', > info: 'Represents an error message from the server', > code: 0, > query: 'CREATE TABLE IF NOT EXISTS roles_by_users( userid TIMEUUID, role > INT, entityid TIMEUUID, entity_type TEXT, enabled BOOLEAN, PRIMARY KEY > (userid, role, entityid, entity_type) );' } > {code} > Cassandra system.log shows: > {code} > ERROR [MigrationStage:1] 2015-02-11 14:38:48,610 CassandraDaemon.java:153 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.db.DefsTables.addColumnFamily(DefsTables.java:371) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:293) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:194) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:166) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:393) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_31] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_31] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_31] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_31] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31] > ERROR [SharedPool-Worker-2] 2015-02-11 14:38:48,620 QueryMessage.java:132 - > Unexpected error during query > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.NullPointerException > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:398) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:374) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.service.MigrationManager.announceNewColumnFamily(MigrationManager.java:249) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.statements.CreateTableStatement.announceMigration(CreateTableStatement.java:113) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:80) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:248) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) > [apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) > [apache-cassandra-2.1.2.jar:2.1.2] > at > 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > java
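The client-visible "RuntimeException: ExecutionException: NullPointerException" in the trace above is the shape a failed background task takes when it is awaited: the NPE thrown on the MigrationStage is wrapped on its way to the caller in FBUtilities.waitOnFuture. A minimal Python analogue of awaiting a failed schema-migration task (note that Python's Future.result() re-raises the task's original exception directly, whereas Java's Future.get() wraps it in ExecutionException):

```python
from concurrent.futures import ThreadPoolExecutor

def migration_task():
    # Stand-in for the NullPointerException thrown in DefsTables.addColumnFamily.
    raise ValueError("null CFMetaData")

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(migration_task)
    try:
        # Analogous to FBUtilities.waitOnFuture blocking on the migration.
        fut.result()
        failed = None
    except ValueError as e:
        failed = str(e)

print(failed)  # "null CFMetaData": the task's failure surfaces at the await point
```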
[jira] [Commented] (CASSANDRA-8798) don't throw TombstoneOverwhelmingException during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499921#comment-14499921 ] Philip Thompson commented on CASSANDRA-8798: [~aweisberg], do you want to look over Jeff's proposed patch? > don't throw TombstoneOverwhelmingException during bootstrap > --- > > Key: CASSANDRA-8798 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8798 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: mck >Assignee: Jeff Jirsa > Fix For: 2.0.15 > > Attachments: 8798.txt > > > During bootstrap, honouring tombstone_failure_threshold seems > counter-productive: the node is not serving requests, so the threshold is not > protecting anything. > Instead, what happens is that bootstrap fails, and a cluster that obviously > needs an extra node isn't getting it... > **History** > When adding a new node, the bootstrap process looks complete in that streaming > is finished, compactions are finished, and all disk and CPU activity is calm. > But the node is still stuck in "joining" status. > The last stage in the bootstrapping process is the rebuilding of secondary > indexes; grepping the logs confirmed it failed during this stage. 
> {code}grep SecondaryIndexManager cassandra/logs/*{code} > To see what secondary index rebuilding was initiated > {code} > grep "index build of " cassandra/logs/* | awk -F" for data in " '{print $1}' > INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex > INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX > INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, > events.real_tbIndex] > {code} > To get an idea of successful secondary index rebuilding > {code}grep "Index build of "cassandra/logs/* > INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete > INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete > {code} > Looking closer at {{[events.collected_tbIndex, events.real_tbIndex]}} showed > the following stacktrace > {code} > ERROR [StreamReceiveTask:121] 2015-02-12 05:54:47,768 CassandraDaemon.java > (line 199) Exception in thread Thread[StreamReceiveTask:121,5,main] > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: > org.apache.cassandra.db.filter.TombstoneOverwhelmingException > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413) > at > org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:142) > at > org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:130) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: > org.apache.cassandra.db.filter.TombstoneOverwhelmingException > at 
java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:188) > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409) > ... 7 more > Caused by: java.lang.RuntimeException: > org.apache.cassandra.db.filter.TombstoneOverwhelmingException > at > org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:160) > at > org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:143) > at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:406) > at > org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62) > at > org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:834) > ... 5 more > Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException > at > org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202) > at > org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122) > at > org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80) > at > org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72) > at > org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297) > at > org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) > at > org.
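The behavior the ticket asks for can be sketched as a guard that is relaxed while the node is still bootstrapping, since it is not serving reads and the limit protects nothing at that point. This is a hypothetical illustration, not the attached patch; the names and threshold value are illustrative, not Cassandra's actual code:

```python
# Hypothetical sketch of the requested behavior, not Cassandra's code.
TOMBSTONE_FAILURE_THRESHOLD = 100_000

class TombstoneOverwhelmingError(Exception):
    pass

def check_tombstones(scanned: int, bootstrapping: bool) -> None:
    """Fail reads past the threshold only when serving live traffic."""
    if scanned <= TOMBSTONE_FAILURE_THRESHOLD:
        return
    if bootstrapping:
        # During bootstrap (e.g. the secondary index rebuild above),
        # continue; a warning would be logged instead of aborting the join.
        return
    raise TombstoneOverwhelmingError(f"scanned {scanned} tombstones")
```

With a guard like this, the index rebuild in the stack trace above would log and proceed instead of leaving the node stuck in "joining".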
[jira] [Updated] (CASSANDRA-8798) don't throw TombstoneOverwhelmingException during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8798: --- Assignee: Jeff Jirsa > don't throw TombstoneOverwhelmingException during bootstrap > --- > > Key: CASSANDRA-8798 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8798 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: mck >Assignee: Jeff Jirsa > Fix For: 2.0.15 > > Attachments: 8798.txt
[jira] [Updated] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9131: --- Assignee: Jim Witschey [~mambocab], should this be closed as "Not a Problem", based on irc discussion? > Defining correct behavior during leap second insertion > -- > > Key: CASSANDRA-9131 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 > Project: Cassandra > Issue Type: Bug > Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue > Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux >Reporter: Jim Witschey >Assignee: Jim Witschey > > On Linux platforms, the insertion of a leap second breaks the monotonicity of > timestamps. This can make values appear to have been inserted into Cassandra > in a different order than they were. I want to know what behavior is expected > and desirable for inserts over this discontinuity. > From a timestamp perspective, an inserted leap second looks like a repeat of > the previous second: > {code} > $ while true ; do echo "`date +%s%N` `date -u`" ; sleep .5 ; done > 1435708798171327029 Tue Jun 30 23:59:58 UTC 2015 > 1435708798679392477 Tue Jun 30 23:59:58 UTC 2015 > 1435708799187550335 Tue Jun 30 23:59:59 UTC 2015 > 1435708799695670453 Tue Jun 30 23:59:59 UTC 2015 > 1435708799203902068 Tue Jun 30 23:59:59 UTC 2015 > 1435708799712168566 Tue Jun 30 23:59:59 UTC 2015 > 1435708800220473932 Wed Jul 1 00:00:00 UTC 2015 > 1435708800728908190 Wed Jul 1 00:00:00 UTC 2015 > 1435708801237611983 Wed Jul 1 00:00:01 UTC 2015 > 1435708801746251996 Wed Jul 1 00:00:01 UTC 2015 > {code} > Note that 23:59:59 repeats itself, and that the timestamps increase during > the first time through, then step back down to the beginning of the second > and increase again. > As a result, the timestamps on values inserted during these seconds will be > out of order. 
I set up a 4-node cluster running under Ubuntu 12.04.3 and > synced them to shortly before the leap second would be inserted. During the > insertion of the leap second, I ran a test with logic something like: > {code} > simple_insert = session.prepare( > 'INSERT INTO test (foo, bar) VALUES (?, ?);') > for i in itertools.count(): > # stop after midnight > now = datetime.utcnow() > last_midnight = now.replace(hour=0, minute=0, > second=0, microsecond=0) > seconds_since_midnight = (now - last_midnight).total_seconds() > if 5 <= seconds_since_midnight <= 15: > break > session.execute(simple_insert, [i, i]) > result = session.execute("SELECT bar, WRITETIME(bar) FROM test;") > {code} > EDIT: This behavior occurs with server-generated timestamps; in this > particular test, I set {{use_client_timestamp}} to {{False}}. > Under normal circumstances, the values and writetimes would increase > together, but when inserted over the leap second, they don't. These {{value, > writetime}} pairs are sorted by writetime: > {code} > (582, 1435708799285000) > (579, 1435708799339000) > (583, 1435708799593000) > (580, 1435708799643000) > (584, 1435708799897000) > (581, 1435708799958000) > {code} > The values were inserted in increasing order, but their writetimes are in a > different order because of the repeated second. During the first instance of > 23:59:59, the values 579, 580, and 581 were inserted at the beginning, > middle, and end of the second. During the leap second, which is also > 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and > end of the second. However, since the two seconds are the same second, they > appear interleaved with respect to timestamps, as shown above. > So, should I consider this behavior correct? If not, how should Cassandra > correctly handle the discontinuity introduced by the insertion of a leap > second? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
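One way a server could defend against the repeated second is to never issue a timestamp less than or equal to the last one handed out. The following is a hedged sketch of that idea, not Cassandra's actual implementation; the class name and injectable clock are inventions for illustration:

```python
import time

class MonotonicMicros:
    """Issue strictly increasing microsecond timestamps even when the
    wall clock steps backwards (e.g. across a repeated leap second)."""

    def __init__(self, now_us=lambda: int(time.time() * 1_000_000)):
        self._now_us = now_us  # injectable clock, for testing
        self._last = 0

    def next(self) -> int:
        now = self._now_us()
        # If the clock repeated a second, advance one microsecond past
        # the last issued timestamp instead of going backwards.
        self._last = max(now, self._last + 1)
        return self._last

# Simulate the discontinuity: the clock jumps back from 200 to 150.
fake = iter([100, 200, 150, 300])
clock = MonotonicMicros(now_us=lambda: next(fake))
print([clock.next() for _ in range(4)])  # [100, 200, 201, 300]
```

Under this scheme the writetimes of values 579-584 in the table above would stay in insertion order, at the cost of timestamps that can run microseconds ahead of the wall clock for the duration of the repeated second.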