[jira] [Commented] (CASSANDRA-13455) lose check of null strings in decoding client token

2017-04-19 Thread Amos Jianjun Kong (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975768#comment-15975768
 ] 

Amos Jianjun Kong commented on CASSANDRA-13455:
---

I agree that empty passwords should be allowed for both 
PasswordAuthenticator and AllowAllAuthenticator.
Checking for an empty username in decodeCredentials() would catch the problem 
early; however, PasswordAuthenticator can do that check by itself.

So we can treat this issue as NOTABUG and ignore the patches. Thanks for your 
responses :-)
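
For later readers, here is a minimal sketch of the strict RFC 4616 splitting discussed 
in this ticket (hypothetical code, not what ships in Cassandra): it allows empty fields, 
so empty passwords still work, but it rejects tokens where a NUL delimiter is missing.

{code:java}
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: strict decoding of a SASL PLAIN token per RFC 4616.
// The token is authzid NUL authcid NUL passwd; each field may be empty
// (so empty passwords are allowed), but both NUL delimiters must be present.
public final class PlainTokenDecoder
{
    public static String[] decodeCredentials(byte[] bytes)
    {
        int first = -1, second = -1;
        for (int i = 0; i < bytes.length; i++)
        {
            if (bytes[i] != 0)
                continue;
            if (first < 0)
                first = i;
            else if (second < 0)
                second = i;
            else
                throw new IllegalArgumentException("token contains more than two NUL delimiters");
        }
        if (second < 0)
            throw new IllegalArgumentException("token must contain exactly two NUL delimiters");

        String authzid  = new String(bytes, 0, first, StandardCharsets.UTF_8);
        String username = new String(bytes, first + 1, second - first - 1, StandardCharsets.UTF_8);
        String password = new String(bytes, second + 1, bytes.length - second - 1, StandardCharsets.UTF_8);
        return new String[]{ authzid, username, password };
    }
}
{code}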

> lose check of null strings in decoding client token
> ---
>
> Key: CASSANDRA-13455
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13455
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS7.2
> Java 1.8
>Reporter: Amos Jianjun Kong
>Assignee: Amos Jianjun Kong
> Fix For: 3.10
>
> Attachments: 0001-auth-check-both-null-points-and-null-strings.patch, 
> 0001-auth-strictly-delimit-in-decoding-client-token.patch
>
>
> RFC 4616 requires that authzid, username, and password be delimited by a single '\000'.
> The current code actually splits on runs of '\000', so when the username or password
> is null the decoding goes wrong.
> The problem was found in code review.
> 
> Update: the description above is wrong; the actual problem is that
> when a client responds with null strings for the username or password,
> the current decodeCredentials() can't detect it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-12835) Tracing payload not passed from QueryMessage to tracing session

2017-04-19 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975754#comment-15975754
 ] 

mck commented on CASSANDRA-12835:
-

{quote}given a commented "+1" by the reviewer am I free to push (updating the 
commit msg to mark you as reviewer)?{quote}
looking through the commit log it would appear the answer is yes :-)

just waiting for the dtests to finish before pushing.
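
For the record, the change in question (sketched against the snippet quoted below; 
not the exact commit) is a one-liner: thread the frame's custom payload through to 
the tracing session instead of calling the no-arg overload.

{code:java}
if (state.traceNextQuery())
{
    // sketch of the fix: pass the native-protocol custom payload through
    // to the tracing session rather than dropping it
    state.createTracingSession(getCustomPayload());
}
{code}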

> Tracing payload not passed from QueryMessage to tracing session
> ---
>
> Key: CASSANDRA-12835
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12835
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Assignee: mck
>Priority: Critical
>  Labels: tracing
> Fix For: 3.11.x, 4.x
>
>
> Caused by CASSANDRA-10392.
> Related to CASSANDRA-11706.
> When querying using CQL statements (not prepared) the message type is 
> QueryMessage and the code in 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/QueryMessage.java#L101
>  is as follows:
> {code:java}
> if (state.traceNextQuery())
> {
>     state.createTracingSession();
>     ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
> {code}
> {{state.createTracingSession();}} should probably be 
> {{state.createTracingSession(getCustomPayload());}}. At least that fixes the 
> problem for me.
> This also raises the question whether some other parts of the code should 
> pass the custom payload as well (I'm not the right person to analyze this):
> {code}
> $ ag createTracingSession
> src/java/org/apache/cassandra/service/QueryState.java
> 80:public void createTracingSession()
> 82:createTracingSession(Collections.EMPTY_MAP);
> 85:public void createTracingSession(Map<String, ByteBuffer> customPayload)
> src/java/org/apache/cassandra/thrift/CassandraServer.java
> 2528:state().getQueryState().createTracingSession();
> src/java/org/apache/cassandra/transport/messages/BatchMessage.java
> 163:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java
> 114:state.createTracingSession(getCustomPayload());
> src/java/org/apache/cassandra/transport/messages/QueryMessage.java
> 101:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/PrepareMessage.java
> 74:state.createTracingSession();
> {code}
> This is not marked as `minor`, as CASSANDRA-11706 was, because this cannot 
> be fixed by the tracing plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-12835) Tracing payload not passed from QueryMessage to tracing session

2017-04-19 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972551#comment-15972551
 ] 

mck edited comment on CASSANDRA-12835 at 4/19/17 11:42 PM:
---

[~tjake], patches are updated here:
|| Branch || Testall || Dtest ||
| [3.11|https://github.com/michaelsembwever/cassandra/commit/56770c6c6a0268b9b0a2f8927df41f61e02e38f6] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/23] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/20/] |
| [trunk|https://github.com/michaelsembwever/cassandra/commit/4ab20fdad52c6fe645e996598da225547cce973f] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/24] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/19/] |

(dtests are queued and will likely take some time to complete)


was (Author: michaelsembwever):
[~tjake], patches are updated here:
|| Branch || Testall || Dtest ||
| [3.11|https://github.com/michaelsembwever/cassandra/commit/56770c6c6a0268b9b0a2f8927df41f61e02e38f6] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/16] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/20/] |
| [trunk|https://github.com/michaelsembwever/cassandra/commit/4ab20fdad52c6fe645e996598da225547cce973f] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/20] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/19/] |

(dtests are queued and will likely take some time to complete)

> Tracing payload not passed from QueryMessage to tracing session
> ---
>
> Key: CASSANDRA-12835
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12835
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Assignee: mck
>Priority: Critical
>  Labels: tracing
> Fix For: 3.11.x, 4.x
>
>
> Caused by CASSANDRA-10392.
> Related to CASSANDRA-11706.
> When querying using CQL statements (not prepared) the message type is 
> QueryMessage and the code in 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/QueryMessage.java#L101
>  is as follows:
> {code:java}
> if (state.traceNextQuery())
> {
>     state.createTracingSession();
>     ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
> {code}
> {{state.createTracingSession();}} should probably be 
> {{state.createTracingSession(getCustomPayload());}}. At least that fixes the 
> problem for me.
> This also raises the question whether some other parts of the code should 
> pass the custom payload as well (I'm not the right person to analyze this):
> {code}
> $ ag createTracingSession
> src/java/org/apache/cassandra/service/QueryState.java
> 80:public void createTracingSession()
> 82:createTracingSession(Collections.EMPTY_MAP);
> 85:public void createTracingSession(Map<String, ByteBuffer> customPayload)
> src/java/org/apache/cassandra/thrift/CassandraServer.java
> 2528:state().getQueryState().createTracingSession();
> src/java/org/apache/cassandra/transport/messages/BatchMessage.java
> 163:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java
> 114:state.createTracingSession(getCustomPayload());
> src/java/org/apache/cassandra/transport/messages/QueryMessage.java
> 101:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/PrepareMessage.java
> 74:state.createTracingSession();
> {code}
> This is not marked as `minor`, as CASSANDRA-11706 was, because this cannot 
> be fixed by the tracing plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-12835) Tracing payload not passed from QueryMessage to tracing session

2017-04-19 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972551#comment-15972551
 ] 

mck edited comment on CASSANDRA-12835 at 4/19/17 11:40 PM:
---

[~tjake], patches are updated here:
|| Branch || Testall || Dtest ||
| [3.11|https://github.com/michaelsembwever/cassandra/commit/56770c6c6a0268b9b0a2f8927df41f61e02e38f6] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/16] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/20/] |
| [trunk|https://github.com/michaelsembwever/cassandra/commit/4ab20fdad52c6fe645e996598da225547cce973f] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/20] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/19/] |

(dtests are queued and will likely take some time to complete)


was (Author: michaelsembwever):
[~tjake], patches are updated here:
|| Branch || Testall || Dtest ||
| [3.11|https://github.com/michaelsembwever/cassandra/commit/4105fc71c652794d3ae1fba475f01ebf00199a07] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/16] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/20/] |
| [trunk|https://github.com/michaelsembwever/cassandra/commit/c4de4f0dd0e70d7d67ade1e315ee3053494cf51c] | [testall|https://circleci.com/gh/michaelsembwever/cassandra/20] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/19/] |

(dtests are queued and will likely take some time to complete)

> Tracing payload not passed from QueryMessage to tracing session
> ---
>
> Key: CASSANDRA-12835
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12835
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Assignee: mck
>Priority: Critical
>  Labels: tracing
> Fix For: 3.11.x, 4.x
>
>
> Caused by CASSANDRA-10392.
> Related to CASSANDRA-11706.
> When querying using CQL statements (not prepared) the message type is 
> QueryMessage and the code in 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/QueryMessage.java#L101
>  is as follows:
> {code:java}
> if (state.traceNextQuery())
> {
>     state.createTracingSession();
>     ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
> {code}
> {{state.createTracingSession();}} should probably be 
> {{state.createTracingSession(getCustomPayload());}}. At least that fixes the 
> problem for me.
> This also raises the question whether some other parts of the code should 
> pass the custom payload as well (I'm not the right person to analyze this):
> {code}
> $ ag createTracingSession
> src/java/org/apache/cassandra/service/QueryState.java
> 80:public void createTracingSession()
> 82:createTracingSession(Collections.EMPTY_MAP);
> 85:public void createTracingSession(Map<String, ByteBuffer> customPayload)
> src/java/org/apache/cassandra/thrift/CassandraServer.java
> 2528:state().getQueryState().createTracingSession();
> src/java/org/apache/cassandra/transport/messages/BatchMessage.java
> 163:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java
> 114:state.createTracingSession(getCustomPayload());
> src/java/org/apache/cassandra/transport/messages/QueryMessage.java
> 101:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/PrepareMessage.java
> 74:state.createTracingSession();
> {code}
> This is not marked as `minor`, as CASSANDRA-11706 was, because this cannot 
> be fixed by the tracing plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-12835) Tracing payload not passed from QueryMessage to tracing session

2017-04-19 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-12835:

Status: Ready to Commit  (was: Patch Available)

> Tracing payload not passed from QueryMessage to tracing session
> ---
>
> Key: CASSANDRA-12835
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12835
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Assignee: mck
>Priority: Critical
>  Labels: tracing
> Fix For: 3.11.x, 4.x
>
>
> Caused by CASSANDRA-10392.
> Related to CASSANDRA-11706.
> When querying using CQL statements (not prepared) the message type is 
> QueryMessage and the code in 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/QueryMessage.java#L101
>  is as follows:
> {code:java}
> if (state.traceNextQuery())
> {
>     state.createTracingSession();
>     ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
> {code}
> {{state.createTracingSession();}} should probably be 
> {{state.createTracingSession(getCustomPayload());}}. At least that fixes the 
> problem for me.
> This also raises the question whether some other parts of the code should 
> pass the custom payload as well (I'm not the right person to analyze this):
> {code}
> $ ag createTracingSession
> src/java/org/apache/cassandra/service/QueryState.java
> 80:public void createTracingSession()
> 82:createTracingSession(Collections.EMPTY_MAP);
> 85:public void createTracingSession(Map<String, ByteBuffer> customPayload)
> src/java/org/apache/cassandra/thrift/CassandraServer.java
> 2528:state().getQueryState().createTracingSession();
> src/java/org/apache/cassandra/transport/messages/BatchMessage.java
> 163:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java
> 114:state.createTracingSession(getCustomPayload());
> src/java/org/apache/cassandra/transport/messages/QueryMessage.java
> 101:state.createTracingSession();
> src/java/org/apache/cassandra/transport/messages/PrepareMessage.java
> 74:state.createTracingSession();
> {code}
> This is not marked as `minor`, as CASSANDRA-11706 was, because this cannot 
> be fixed by the tracing plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Aleksandr Ivanov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975528#comment-15975528
 ] 

Aleksandr Ivanov commented on CASSANDRA-13463:
--

Forgot to mention that the same command can succeed on the 2nd or 3rd try. But if the 
period is long (>10s) then it doesn't work at all.
Partitions are 2-3 KB in size; the read rate on the node is ~300/s, the write rate ~100/s.

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975514#comment-15975514
 ] 

Chris Lohfink commented on CASSANDRA-13463:
---

relevant change: 
https://github.com/aweisberg/cassandra/commit/e5c7992ea3099bb90930cad4282803fb6556de18#diff-98f5acb96aa6d684781936c141132e2aR1481

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975509#comment-15975509
 ] 

Chris Lohfink commented on CASSANDRA-13463:
---

I think it would fix it, at least. The `.array()` call doesn't work when the buffer is 
not backed by an array (i.e. when the partition comes from a file's byte buffer). 
[~aweisberg]'s fix had nothing to do with that, but he changed it to use the 
byte buffer properly.
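
To illustrate the pitfall with a standalone sketch (independent of the Cassandra 
internals; the class and method names here are made up): {{ByteBuffer.array()}} throws 
for direct or file-mapped buffers, so the bytes have to be copied out through the 
buffer API instead.

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferArrayPitfall
{
    // Copy a buffer's remaining bytes without assuming it is array-backed
    // or that its backing array starts at offset zero.
    static byte[] getBytes(ByteBuffer bb)
    {
        byte[] bytes = new byte[bb.remaining()];
        bb.duplicate().get(bytes); // duplicate() leaves the caller's position untouched
        return bytes;
    }

    public static void main(String[] args)
    {
        // A direct buffer (like one backed by a file mapping) has no backing array,
        // so calling direct.array() here would throw UnsupportedOperationException.
        ByteBuffer direct = ByteBuffer.allocateDirect(5);
        direct.put("hello".getBytes(StandardCharsets.UTF_8)).flip();
        System.out.println(new String(getBytes(direct), StandardCharsets.UTF_8));
    }
}
{code}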

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA

[jira] [Comment Edited] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975509#comment-15975509
 ] 

Chris Lohfink edited comment on CASSANDRA-13463 at 4/19/17 8:59 PM:


I think it would fix it, at least. The {{.array()}} call doesn't work when the buffer is 
not backed by an array (i.e. when the partition comes from a file's byte buffer). 
[~aweisberg]'s fix had nothing to do with that, but he changed it to use the 
byte buffer properly.


was (Author: cnlwsu):
I think it would fix it, at least. The `.array()` call doesn't work when the buffer is 
not backed by an array (i.e. when the partition comes from a file's byte buffer). 
[~aweisberg]'s fix had nothing to do with that, but he changed it to use the 
byte buffer properly.

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> 

[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Aleksandr Ivanov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975501#comment-15975501
 ] 

Aleksandr Ivanov commented on CASSANDRA-13463:
--

Unfortunately not. I don't have a 3.11 or 3.2+ environment, but I can build a patched 
3.0.x in order to test the CASSANDRA-9241 fix.

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975478#comment-15975478
 ] 

Chris Lohfink edited comment on CASSANDRA-13463 at 4/19/17 8:43 PM:


I believe this was fixed in CASSANDRA-9241. Can you try on a 3.11 or 3.2+ 
version?


was (Author: cnlwsu):
fixed in CASSANDRA-9241

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975478#comment-15975478
 ] 

Chris Lohfink commented on CASSANDRA-13463:
---

fixed in CASSANDRA-9241

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975468#comment-15975468
 ] 

Chris Lohfink commented on CASSANDRA-13463:
---

Can you provide the schema for the table you're doing this on and the partitions you're 
inserting/reading? I cannot reproduce this with simple tables.

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs, failing with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It is easily reproducible if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Aleksandr Ivanov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Ivanov updated CASSANDRA-13463:
-
Description: 
nodetool toppartitions doesn't work for most runs, failing with the following 
message
{code}
error: String didn't validate.
-- StackTrace --
org.apache.cassandra.serializers.MarshalException: String didn't validate.
at 
org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
at 
org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
at 
org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at 
com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
at java.security.AccessController.doPrivileged(Native Method)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at java.security.AccessController.doPrivileged(Native Method)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

It is easily reproducible if the period is longer than 1 second.

  was:
nodetool toppartitions doesn't work for most runs, failing with the following 
message
{code}
error: String didn't validate.
-- StackTrace --
org.apache.cassandra.serializers.MarshalException: String didn't validate.
at 
org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
at 
org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
at 
org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at 

[jira] [Commented] (CASSANDRA-13006) Disable automatic heap dumps on OOM error

2017-04-19 Thread Nibin G (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975265#comment-15975265
 ] 

Nibin G commented on CASSANDRA-13006:
-

Why can't we delegate heap dump generation to the JVM if jmap is not available 
on the path? The JRE can generate a heap dump even if jmap is not there.
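
For reference, the stock JVM flags in question, shown roughly as they could appear in 
a jvm.options file (the dump directory is just an example):

{code}
# opt in to heap dumps on OOM; dumps land in HeapDumpPath as java_pid<pid>.hprof
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/lib/cassandra
{code}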

> Disable automatic heap dumps on OOM error
> -
>
> Key: CASSANDRA-13006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13006
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: anmols
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 3.0.9
>
> Attachments: 13006-3.0.9.txt
>
>
> With CASSANDRA-9861, a change was added to enable collecting heap dumps by 
> default if the process encountered an OOM error. These heap dumps are stored 
> in the Apache Cassandra home directory unless configured otherwise (see 
> [Cassandra Support 
> Document|https://support.datastax.com/hc/en-us/articles/204225959-Generating-and-Analyzing-Heap-Dumps]
>  for this feature).
>  
> The creation and storage of heap dumps aids debugging and investigative 
> workflows, but is not desirable in a production environment, where these 
> heap dumps may occupy a large amount of disk space and require manual 
> intervention to clean up. 
>  
> Managing heap dumps on out-of-memory errors and configuring the paths for 
> these heap dumps are already available as JVM options. The current behavior 
> conflicts with the boolean JVM flag HeapDumpOnOutOfMemoryError. 
>  
> A patch is proposed here that would make the heap dump on OOM error honor 
> the HeapDumpOnOutOfMemoryError flag. Users who still want to generate 
> heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM 
> option.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13442) Support a means of strongly consistent highly available replication with storage requirements approximating RF=2

2017-04-19 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13442:
---
Description: 
Replication factors like RF=2 can't provide strong consistency and availability 
because if a single node is lost it's impossible to reach a quorum of replicas. 
Stepping up to RF=3 will allow you to lose a node and still achieve quorum for 
reads and writes, but requires committing additional storage.

The requirement of a quorum for writes/reads doesn't seem to be something that 
can be relaxed without additional constraints on queries, but it seems like it 
should be possible to relax the requirement that 3 full copies of the entire 
data set are kept. What is actually required is a covering data set for the 
range and we should be able to achieve a covering data set and high 
availability without having three full copies. 

After a repair we know that some subset of the data set is fully replicated. At 
that point we don't have to read from a quorum of nodes for the repaired data. 
It is sufficient to read from a single node for the repaired data and a quorum 
of nodes for the unrepaired data.

One way to exploit this would be to have N replicas (say, the last N replicas 
in the preference list, where N varies with RF) delete all repaired data after 
a repair completes. Subsequent quorum reads will be able to retrieve the 
repaired data from either of the two full replicas and the unrepaired data from 
a quorum read of any replica, including the "transient" replicas.

Configuration for something like this in NTS might look something like { 
DC1="3-1", DC2="3-2" } where the first value is the replication factor used for 
consistency and the second value is the number of transient replicas. If you 
specify { DC1=3, DC2=3 } then the number of transient replicas defaults to 0 
and you get the same behavior you have today.
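
To make that notation concrete, a purely hypothetical CQL sketch (no such syntax exists today; "3-1" is just the notation proposed above):

{code}
-- Hypothetical, for illustration only: 3 replicas per DC, 1 of them transient
CREATE KEYSPACE ks WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': '3-1',
    'DC2': '3-2'
};
{code}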


  was:
Replication factors like RF=2 can't provide strong consistency and availability 
because if a single node is lost it's impossible to reach a quorum of replicas. 
Stepping up to RF=3 will allow you to lose a node and still achieve quorum for 
reads and writes, but requires committing additional storage.

The requirement of a quorum for writes/reads doesn't seem to be something that 
can be relaxed without additional constraints on queries, but it seems like it 
should be possible to relax the requirement that 3 full copies of the entire 
data set are kept. What is actually required is a covering data set for the 
range and we should be able to achieve a covering data set and high 
availability without having three full copies. 

After a repair we know that some subset of the data set is fully replicated. At 
that point we don't have to read from a quorum of nodes for the repaired data. 
It is sufficient to read from a single node for the repaired data and a quorum 
of nodes for the unrepaired data.

One way to exploit this would be to have N replicas, say the last N replicas 
(where N varies with RF) in the preference list, delete all repaired data after 
a repair completes. Subsequent quorum reads will be able to retrieve the 
repaired data from any of the two full replicas and the unrepaired data from a 
quorum read of any replica including the "transient" replicas.


> Support a means of strongly consistent highly available replication with 
> storage requirements approximating RF=2
> 
>
> Key: CASSANDRA-13442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13442
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Coordination, Distributed Metadata, Local 
> Write-Read Paths
>Reporter: Ariel Weisberg
>
> Replication factors like RF=2 can't provide strong consistency and 
> availability because if a single node is lost it's impossible to reach a 
> quorum of replicas. Stepping up to RF=3 will allow you to lose a node and 
> still achieve quorum for reads and writes, but requires committing additional 
> storage.
> The requirement of a quorum for writes/reads doesn't seem to be something 
> that can be relaxed without additional constraints on queries, but it seems 
> like it should be possible to relax the requirement that 3 full copies of the 
> entire data set are kept. What is actually required is a covering data set 
> for the range and we should be able to achieve a covering data set and high 
> availability without having three full copies. 
> After a repair we know that some subset of the data set is fully replicated. 
> At that point we don't have to read from a quorum of nodes for the repaired 
> data. It is sufficient to read from a single node for the repaired data and a 
> quorum of nodes for the unrepaired data.
> One way to 

[jira] [Updated] (CASSANDRA-13442) Support a means of strongly consistent highly available replication with tunable storage requirements

2017-04-19 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13442:
---
Summary: Support a means of strongly consistent highly available 
replication with tunable storage requirements  (was: Support a means of 
strongly consistent highly available replication with storage requirements 
approximating RF=2)

> Support a means of strongly consistent highly available replication with 
> tunable storage requirements
> -
>
> Key: CASSANDRA-13442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13442
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Coordination, Distributed Metadata, Local 
> Write-Read Paths
>Reporter: Ariel Weisberg
>
> Replication factors like RF=2 can't provide strong consistency and 
> availability because if a single node is lost it's impossible to reach a 
> quorum of replicas. Stepping up to RF=3 will allow you to lose a node and 
> still achieve quorum for reads and writes, but requires committing additional 
> storage.
> The requirement of a quorum for writes/reads doesn't seem to be something 
> that can be relaxed without additional constraints on queries, but it seems 
> like it should be possible to relax the requirement that 3 full copies of the 
> entire data set are kept. What is actually required is a covering data set 
> for the range and we should be able to achieve a covering data set and high 
> availability without having three full copies. 
> After a repair we know that some subset of the data set is fully replicated. 
> At that point we don't have to read from a quorum of nodes for the repaired 
> data. It is sufficient to read from a single node for the repaired data and a 
> quorum of nodes for the unrepaired data.
> One way to exploit this would be to have N replicas (say, the last N replicas 
> in the preference list, where N varies with RF) delete all repaired data 
> after a repair completes. Subsequent quorum reads will be able to retrieve 
> the repaired data from either of the two full replicas and the unrepaired data 
> from a quorum read of any replica, including the "transient" replicas.
> Configuration for something like this in NTS might look something like { 
> DC1="3-1", DC2="3-2" } where the first value is the replication factor used 
> for consistency and the second value is the number of transient replicas. If 
> you specify { DC1=3, DC2=3 } then the number of transient replicas defaults 
> to 0 and you get the same behavior you have today.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13006) Disable automatic heap dumps on OOM error

2017-04-19 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975192#comment-15975192
 ] 

Jeremiah Jordan commented on CASSANDRA-13006:
-

We could fall back to trying to use the 
"com.sun.management:type=HotSpotDiagnostic" bean directly if we can't find jmap.

Some links for doing this:
https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java
http://stackoverflow.com/a/12297339/138693
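
For reference, a minimal sketch of that fallback (the com.sun.management bean name and dumpHeap() signature are the real platform API; the wrapper class is illustrative):

{code:java}
import java.io.IOException;
import java.lang.management.ManagementFactory;

import com.sun.management.HotSpotDiagnosticMXBean;

public final class HeapDumpFallback
{
    private static final String HOTSPOT_BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";

    // Writes an hprof dump to 'path'; live=true restricts the dump to reachable objects.
    public static void dumpHeap(String path, boolean live) throws IOException
    {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.newPlatformMXBeanProxy(ManagementFactory.getPlatformMBeanServer(),
                                                     HOTSPOT_BEAN_NAME,
                                                     HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, live);
    }
}
{code}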

> Disable automatic heap dumps on OOM error
> -
>
> Key: CASSANDRA-13006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13006
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: anmols
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 3.0.9
>
> Attachments: 13006-3.0.9.txt
>
>
> With CASSANDRA-9861, a change was added to enable collecting heap dumps by 
> default if the process encountered an OOM error. These heap dumps are stored 
> in the Apache Cassandra home directory unless configured otherwise (see 
> [Cassandra Support 
> Document|https://support.datastax.com/hc/en-us/articles/204225959-Generating-and-Analyzing-Heap-Dumps]
>  for this feature).
>  
> The creation and storage of heap dumps aids debugging and investigative 
> workflows, but is not desirable for a production environment where these 
> heap dumps may occupy a large amount of disk space and require manual 
> intervention for cleanups. 
>  
> Managing heap dumps on out-of-memory errors and configuring the paths for 
> these heap dumps are already available as JVM options. The current behavior 
> conflicts with the Boolean JVM flag HeapDumpOnOutOfMemoryError. 
>  
> A patch is proposed here that would make the heap dump on OOM error honor 
> the HeapDumpOnOutOfMemoryError flag. Users who still want to generate 
> heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM 
> option.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13006) Disable automatic heap dumps on OOM error

2017-04-19 Thread Nibin G (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975153#comment-15975153
 ] 

Nibin G edited comment on CASSANDRA-13006 at 4/19/17 5:49 PM:
--

Oracle Java's JRE 8 and Server JRE 8 for linux environments are not shipping 
jmap anymore. That means we have to use Oracle Java's JDK for the heap dumps 
to be generated from Cassandra, and some security compliance policies won't 
permit the use of a JDK in production.

It would be great if an option were provided to disable the heap dump from the 
application code [1], so that the JVM can generate the heap dump instead. Or 
use the jcmd utility (available in Server JRE 8 and JDK 8) instead of jmap.

[1] 
https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L56
 


was (Author: nibin.gv):
Oracle Java's JRE 8 and Server JRE 8 for linux environments are not shipping 
jmap anymore. That means, we have to use Oracle Java's JDK for the heap dumps 
to be generated. And some of the security compliance won't permit the use of 
JDK in production.

It would be great if an option is provided to disable heap dump from the 
application code[1]. So that JVM can generate the heap dump.

[1] 
https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L56
 

> Disable automatic heap dumps on OOM error
> -
>
> Key: CASSANDRA-13006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13006
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: anmols
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 3.0.9
>
> Attachments: 13006-3.0.9.txt
>
>
> With CASSANDRA-9861, a change was added to enable collecting heap dumps by 
> default if the process encountered an OOM error. These heap dumps are stored 
> in the Apache Cassandra home directory unless configured otherwise (see 
> [Cassandra Support 
> Document|https://support.datastax.com/hc/en-us/articles/204225959-Generating-and-Analyzing-Heap-Dumps]
>  for this feature).
>  
> The creation and storage of heap dumps aids debugging and investigative 
> workflows, but is not desirable for a production environment where these 
> heap dumps may occupy a large amount of disk space and require manual 
> intervention for cleanups. 
>  
> Managing heap dumps on out-of-memory errors and configuring the paths for 
> these heap dumps are already available as JVM options. The current behavior 
> conflicts with the Boolean JVM flag HeapDumpOnOutOfMemoryError. 
>  
> A patch is proposed here that would make the heap dump on OOM error honor 
> the HeapDumpOnOutOfMemoryError flag. Users who still want to generate 
> heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM 
> option.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13006) Disable automatic heap dumps on OOM error

2017-04-19 Thread Nibin G (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975153#comment-15975153
 ] 

Nibin G commented on CASSANDRA-13006:
-

Oracle Java's JRE 8 and Server JRE 8 for linux environments are not shipping 
jmap anymore. That means we have to use Oracle Java's JDK for the heap dumps 
to be generated, and some security compliance policies won't permit the use of 
a JDK in production.

It would be great if an option were provided to disable the heap dump from the 
application code [1], so that the JVM can generate the heap dump instead.

[1] 
https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L56
 

> Disable automatic heap dumps on OOM error
> -
>
> Key: CASSANDRA-13006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13006
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: anmols
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 3.0.9
>
> Attachments: 13006-3.0.9.txt
>
>
> With CASSANDRA-9861, a change was added to enable collecting heap dumps by 
> default if the process encountered an OOM error. These heap dumps are stored 
> in the Apache Cassandra home directory unless configured otherwise (see 
> [Cassandra Support 
> Document|https://support.datastax.com/hc/en-us/articles/204225959-Generating-and-Analyzing-Heap-Dumps]
>  for this feature).
>  
> The creation and storage of heap dumps aids debugging and investigative 
> workflows, but is not desirable for a production environment where these 
> heap dumps may occupy a large amount of disk space and require manual 
> intervention for cleanups. 
>  
> Managing heap dumps on out-of-memory errors and configuring the paths for 
> these heap dumps are already available as JVM options. The current behavior 
> conflicts with the Boolean JVM flag HeapDumpOnOutOfMemoryError. 
>  
> A patch is proposed here that would make the heap dump on OOM error honor 
> the HeapDumpOnOutOfMemoryError flag. Users who still want to generate 
> heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM 
> option.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-04-19 Thread Corentin Chary (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corentin Chary updated CASSANDRA-13418:
---

Agreed for the option. It would be easy to implement using a new one.
IMHO it's more dangerous to have nothing, as this would degrade write
performance and take up to twice the space originally planned. Compared
to that, it isn't really an issue to have re-appearing data after an
explicit deletion (I think that's the worst that can happen, but I may be wrong).
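
To make the proposal concrete, a sketch of what such a knob could look like (the option name is purely hypothetical):

{code}
-- 'ignore_overlaps_on_expiration' is a hypothetical option name, shown only
-- to illustrate the idea: let TWCS drop fully expired sstables even when a
-- non-expired sstable overlaps them.
ALTER TABLE ks.metrics WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1',
    'ignore_overlaps_on_expiration': 'true'
};
{code}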




> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13365) Nodes entering GC loop, does not recover

2017-04-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975075#comment-15975075
 ] 

Jeff Jirsa commented on CASSANDRA-13365:


{code}
   9:   2870889  160769784  org.apache.cassandra.transport.messages.ResultMessage$Rows
  10:   2937336  140992128  io.netty.buffer.SlicedAbstractByteBuf
  11:  8854  118773984  [Lio.netty.util.Recycler$DefaultHandle;
  12:   2830805  113232200  org.apache.cassandra.db.rows.BufferCell
  13:   2937336   93994752  org.apache.cassandra.transport.Frame$Header
  14:   2870928   91869696  org.apache.cassandra.cql3.ResultSet$ResultMetadata
  15:   2728627   87316064  org.apache.cassandra.db.rows.BTreeRow
{code}

2.8M ResultMessage$Rows, BufferCells, ResultSets, and BTreeRows suggests you 
have an awful lot of read results in flight, and you've filled the heap with 
them.

Is it possible someone is doing a query with a very, very large LIMIT rather 
than using the driver's fetchSize() or manual paging? Or do you have concurrent 
read threads turned up very high? 
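
If it's the former, a minimal sketch of the paging-friendly pattern with the DataStax Java driver (3.x API; the contact point, keyspace/table and pk value are illustrative):

{code:java}
import com.datastax.driver.core.*;

public class PagedReadExample
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect())
        {
            // Page through a large result set instead of issuing one huge LIMIT:
            // the driver fetches 1000 rows per round trip as iteration advances.
            Statement stmt = new SimpleStatement("SELECT * FROM ks.tbl WHERE pk = ?", "some-pk")
                                 .setFetchSize(1000);
            for (Row row : session.execute(stmt))
                System.out.println(row); // stand-in for real row handling
        }
    }
}
{code}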


> Nodes entering GC loop, does not recover
> 
>
> Key: CASSANDRA-13365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13365
> Project: Cassandra
>  Issue Type: Bug
> Environment: 34-node cluster over 4 DCs
> Linux CentOS 7.2 x86
> Mix of 64GB/128GB RAM / node
> Mix of 32/40 hardware threads / node, Xeon ~2.4Ghz
> High read volume, low write volume, occasional sstable bulk loading
>Reporter: Mina Naguib
>
> Over the last week we've been observing two related problems affecting our 
> Cassandra cluster
> Problem 1: 1-few nodes per DC entering GC loop, not recovering
> Checking the heap usage stats, there's a sudden jump of 1-3GB. Some nodes 
> recover, but some don't and log this:
> {noformat}
> 2017-03-21T11:23:02.957-0400: 54099.519: [Full GC (Allocation Failure)  
> 13G->11G(14G), 29.4127307 secs]
> 2017-03-21T11:23:45.270-0400: 54141.833: [Full GC (Allocation Failure)  
> 13G->12G(14G), 28.1561881 secs]
> 2017-03-21T11:24:20.307-0400: 54176.869: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.7019501 secs]
> 2017-03-21T11:24:50.528-0400: 54207.090: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1372267 secs]
> 2017-03-21T11:25:19.190-0400: 54235.752: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.0703975 secs]
> 2017-03-21T11:25:46.711-0400: 54263.273: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3187768 secs]
> 2017-03-21T11:26:15.419-0400: 54291.981: [Full GC (Allocation Failure)  
> 13G->13G(14G), 26.9493405 secs]
> 2017-03-21T11:26:43.399-0400: 54319.961: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5222085 secs]
> 2017-03-21T11:27:11.383-0400: 54347.945: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1769581 secs]
> 2017-03-21T11:27:40.174-0400: 54376.737: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4639031 secs]
> 2017-03-21T11:28:08.946-0400: 54405.508: [Full GC (Allocation Failure)  
> 13G->13G(14G), 30.3480523 secs]
> 2017-03-21T11:28:40.117-0400: 54436.680: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8220513 secs]
> 2017-03-21T11:29:08.459-0400: 54465.022: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4691271 secs]
> 2017-03-21T11:29:37.114-0400: 54493.676: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.0275733 secs]
> 2017-03-21T11:30:04.635-0400: 54521.198: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1902627 secs]
> 2017-03-21T11:30:32.114-0400: 54548.676: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8872850 secs]
> 2017-03-21T11:31:01.430-0400: 54577.993: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1609706 secs]
> 2017-03-21T11:31:29.024-0400: 54605.587: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3635138 secs]
> 2017-03-21T11:31:57.303-0400: 54633.865: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4143510 secs]
> 2017-03-21T11:32:25.110-0400: 54661.672: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8595986 secs]
> 2017-03-21T11:32:53.922-0400: 54690.485: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5242543 secs]
> 2017-03-21T11:33:21.867-0400: 54718.429: [Full GC (Allocation Failure)  
> 13G->13G(14G), 30.8930130 secs]
> 2017-03-21T11:33:53.712-0400: 54750.275: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.6523013 secs]
> 2017-03-21T11:34:21.760-0400: 54778.322: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3030198 secs]
> 2017-03-21T11:34:50.073-0400: 54806.635: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1594154 secs]
> 2017-03-21T11:35:17.743-0400: 54834.306: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3766949 secs]
> 2017-03-21T11:35:45.797-0400: 54862.360: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5756770 secs]
> 2017-03-21T11:36:13.816-0400: 54890.378: [Full GC 

[jira] [Commented] (CASSANDRA-13365) Nodes entering GC loop, does not recover

2017-04-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975031#comment-15975031
 ] 

ZhaoYang commented on CASSANDRA-13365:
--

[~minaguib] Could you share what queries are running at that moment? Or you 
could try `sjk-plus` to see which threads are allocating huge amounts of memory.

> Nodes entering GC loop, does not recover
> 
>
> Key: CASSANDRA-13365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13365
> Project: Cassandra
>  Issue Type: Bug
> Environment: 34-node cluster over 4 DCs
> Linux CentOS 7.2 x86
> Mix of 64GB/128GB RAM / node
> Mix of 32/40 hardware threads / node, Xeon ~2.4Ghz
> High read volume, low write volume, occasional sstable bulk loading
>Reporter: Mina Naguib
>
> Over the last week we've been observing two related problems affecting our 
> Cassandra cluster
> Problem 1: 1-few nodes per DC entering GC loop, not recovering
> Checking the heap usage stats, there's a sudden jump of 1-3GB. Some nodes 
> recover, but some don't and log this:
> {noformat}
> 2017-03-21T11:23:02.957-0400: 54099.519: [Full GC (Allocation Failure)  
> 13G->11G(14G), 29.4127307 secs]
> 2017-03-21T11:23:45.270-0400: 54141.833: [Full GC (Allocation Failure)  
> 13G->12G(14G), 28.1561881 secs]
> 2017-03-21T11:24:20.307-0400: 54176.869: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.7019501 secs]
> 2017-03-21T11:24:50.528-0400: 54207.090: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1372267 secs]
> 2017-03-21T11:25:19.190-0400: 54235.752: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.0703975 secs]
> 2017-03-21T11:25:46.711-0400: 54263.273: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3187768 secs]
> 2017-03-21T11:26:15.419-0400: 54291.981: [Full GC (Allocation Failure)  
> 13G->13G(14G), 26.9493405 secs]
> 2017-03-21T11:26:43.399-0400: 54319.961: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5222085 secs]
> 2017-03-21T11:27:11.383-0400: 54347.945: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1769581 secs]
> 2017-03-21T11:27:40.174-0400: 54376.737: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4639031 secs]
> 2017-03-21T11:28:08.946-0400: 54405.508: [Full GC (Allocation Failure)  
> 13G->13G(14G), 30.3480523 secs]
> 2017-03-21T11:28:40.117-0400: 54436.680: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8220513 secs]
> 2017-03-21T11:29:08.459-0400: 54465.022: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4691271 secs]
> 2017-03-21T11:29:37.114-0400: 54493.676: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.0275733 secs]
> 2017-03-21T11:30:04.635-0400: 54521.198: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1902627 secs]
> 2017-03-21T11:30:32.114-0400: 54548.676: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8872850 secs]
> 2017-03-21T11:31:01.430-0400: 54577.993: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1609706 secs]
> 2017-03-21T11:31:29.024-0400: 54605.587: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3635138 secs]
> 2017-03-21T11:31:57.303-0400: 54633.865: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4143510 secs]
> 2017-03-21T11:32:25.110-0400: 54661.672: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.8595986 secs]
> 2017-03-21T11:32:53.922-0400: 54690.485: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5242543 secs]
> 2017-03-21T11:33:21.867-0400: 54718.429: [Full GC (Allocation Failure)  
> 13G->13G(14G), 30.8930130 secs]
> 2017-03-21T11:33:53.712-0400: 54750.275: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.6523013 secs]
> 2017-03-21T11:34:21.760-0400: 54778.322: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3030198 secs]
> 2017-03-21T11:34:50.073-0400: 54806.635: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.1594154 secs]
> 2017-03-21T11:35:17.743-0400: 54834.306: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3766949 secs]
> 2017-03-21T11:35:45.797-0400: 54862.360: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5756770 secs]
> 2017-03-21T11:36:13.816-0400: 54890.378: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5541813 secs]
> 2017-03-21T11:36:41.926-0400: 54918.488: [Full GC (Allocation Failure)  
> 13G->13G(14G), 33.7510103 secs]
> 2017-03-21T11:37:16.132-0400: 54952.695: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.4856611 secs]
> 2017-03-21T11:37:44.454-0400: 54981.017: [Full GC (Allocation Failure)  
> 13G->13G(14G), 28.1269335 secs]
> 2017-03-21T11:38:12.774-0400: 55009.337: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.7830448 secs]
> 2017-03-21T11:38:40.840-0400: 55037.402: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.3527326 secs]
> 2017-03-21T11:39:08.610-0400: 55065.173: [Full GC (Allocation Failure)  
> 13G->13G(14G), 27.5828941 secs]
> 2017-03-21T11:39:36.833-0400: 55093.396: [Full GC (Allocation Failure)  
> 

[jira] [Updated] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.

2017-04-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13307:
---
Fix Version/s: (was: 3.11.x)
   3.11.0

> The specification of protocol version in cqlsh means the python driver 
> doesn't automatically downgrade protocol version.
> 
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Matt Byrd
>Assignee: Matt Byrd
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 3.11.0, 4.0
>
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> In that we're no longer able to connect from newer cqlsh versions
> (e.g trunk) to older versions of Cassandra with a lower version of the 
> protocol (e.g 2.1 with protocol version 3)
> The problem seems to be that we're relying on the ability for the client to 
> automatically downgrade protocol version implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, came when we implemented
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> ("Don't downgrade protocol version if explicitly set", included when we 
> bumped from 3.5.0 to 3.7.0 of the python driver as part of fixing 
> https://issues.apache.org/jira/browse/CASSANDRA-11534), since we do 
> explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which just adds an option to explicitly specify the protocol 
> version (for those who want to do that) and otherwise defaults to not 
> setting the protocol version, i.e. using the protocol version from the client 
> which we ship, which should by default be the same protocol as the server.
> Then it should downgrade gracefully, as was intended. 
> Let me know if that seems reasonable.
> Thanks,
> Matt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13463:
---
Reproduced In: 3.0.11

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs and fails with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It can easily be reproduced if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13463:
---
Component/s: Observability

> nodetool toppartitions - error: String didn't validate
> --
>
> Key: CASSANDRA-13463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
> Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
>Reporter: Aleksandr Ivanov
>
> nodetool toppartitions doesn't work for most runs and fails with the 
> following message
> {code}
> error: String didn't validate.
> -- StackTrace --
> org.apache.cassandra.serializers.MarshalException: String didn't validate.
> at 
> org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
> at 
> org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It can easily be reproduced if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-04-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974996#comment-15974996
 ] 

Jeff Jirsa commented on CASSANDRA-13418:


[~iksaif] - this isn't a review, but at first glance I'm not in love with the 
idea of extending that config option in that way - it wasn't meant for that 
purpose, though it's sort of tangential (it was really meant for the specific 
task of grouping sstables for the cleanup compaction, and this isn't the 
cleanup compaction). 

There's also the bigger question of whether or not we really want to expose 
this to users. It's dangerous. I really, really wanted something like this at my 
last employer, but the "this can be dangerous" factor prevented me from writing 
it. 




> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13463) nodetool toppartitions - error: String didn't validate

2017-04-19 Thread Aleksandr Ivanov (JIRA)
Aleksandr Ivanov created CASSANDRA-13463:


 Summary: nodetool toppartitions - error: String didn't validate
 Key: CASSANDRA-13463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13463
 Project: Cassandra
  Issue Type: Bug
 Environment: Debian Jessie, Java 1.8.0-121, Cassandra v3.0.11
Reporter: Aleksandr Ivanov


nodetool toppartitions doesn't work for most runs and fails with the following 
message
{code}
error: String didn't validate.
-- StackTrace --
org.apache.cassandra.serializers.MarshalException: String didn't validate.
at org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
at org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:128)
at org.apache.cassandra.db.ColumnFamilyStore.finishLocalSampling(ColumnFamilyStore.java:1559)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
at java.security.AccessController.doPrivileged(Native Method)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

It can easily be reproduced if the period is longer than 1 second.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission

2017-04-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13308:
---
   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 3.0.x)
   4.0
   3.11.0
   3.0.14
   Status: Resolved  (was: Ready to Commit)

Thanks Aleksey. Committed with nits as 
[5089e74ef4a0eaeb1c439d57f074de1c496421f2|https://github.com/apache/cassandra/commit/5089e74ef4a0eaeb1c439d57f074de1c496421f2]

> Gossip breaks, Hint files not being deleted on nodetool decommission
> 
>
> Key: CASSANDRA-13308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Using Cassandra version 3.0.9
>Reporter: Arijit
>Assignee: Jeff Jirsa
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: 28207.stack, logs, logs_decommissioned_node
>
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a 
> ton of hints. Start Cassandra on the node and immediately run "nodetool 
> decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other 
> nodes do not seem to see this message. "nodetool status" shows the 
> decommissioned node in state "UL" on all other nodes (it is also present in 
> system.peers), and Cassandra logs show that gossip tasks on nodes are not 
> proceeding (number of pending tasks keeps increasing). Jstack suggests that a 
> gossip task is blocked on hints dispatch (I can provide traces if this is not 
> obvious). Because the cluster is large and there are a lot of hints, this is 
> taking a while. 
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint 
> files for the decommissioned node. Documentation seems to suggest that these 
> hints should be deleted during "nodetool decommission", but it does not seem 
> to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, 
> the hints dispatcher threads throw a bunch of exceptions and the 
> decommissioned node is now in state "DL" (perhaps it missed some gossip 
> messages?). The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the 
> node remains in the peers table). In fact, after this point the 
> decommissioned node is in state "DN"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[3/6] cassandra git commit: Interrupt replaying hints on decommission

2017-04-19 Thread jjirsa
Interrupt replaying hints on decommission

Patch by Jeff Jirsa; Reviewed by Aleksey Yeschenko for CASSANDRA-13308


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5089e74e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5089e74e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5089e74e

Branch: refs/heads/trunk
Commit: 5089e74ef4a0eaeb1c439d57f074de1c496421f2
Parents: 3110d27
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:26:02 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:57:45 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 918c46b..e55d4cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 3.0.14
  * Handling partially written hint files (CASSANDRA-12728) 
+ * Interrupt replaying hints on decommission (CASSANDRA-13308)
 
 3.0.13
  * Make reading of range tombstones more reliable (CASSANDRA-12811)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java 
b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
index 333232d..58b30bd 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
@@ -117,6 +117,14 @@ final class HintsDispatchExecutor
         }
     }
 
+    void interruptDispatch(UUID hostId)
+    {
+        Future future = scheduledDispatches.remove(hostId);
+
+        if (null != future)
+            future.cancel(true);
+    }
+
     private final class TransferHintsTask implements Runnable
     {
         private final HintsCatalog catalog;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatcher.java 
b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
index d7a3515..351b3fa 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@ -26,6 +26,8 @@ import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.function.Function;
 
 import com.google.common.util.concurrent.RateLimiter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.gms.FailureDetector;
@@ -42,6 +44,8 @@ import org.apache.cassandra.utils.concurrent.SimpleCondition;
  */
 final class HintsDispatcher implements AutoCloseable
 {
+    private static final Logger logger = LoggerFactory.getLogger(HintsDispatcher.class);
+
     private enum Action { CONTINUE, ABORT }
 
     private final HintsReader reader;
@@ -181,7 +185,7 @@ final class HintsDispatcher implements AutoCloseable
 
     private static final class Callback implements IAsyncCallbackWithFailure
     {
-        enum Outcome { SUCCESS, TIMEOUT, FAILURE }
+        enum Outcome { SUCCESS, TIMEOUT, FAILURE, INTERRUPTED }
 
         private final long start = System.nanoTime();
         private final SimpleCondition condition = new SimpleCondition();
@@ -198,7 +202,8 @@ final class HintsDispatcher implements AutoCloseable
             }
             catch (InterruptedException e)
             {
-                throw new AssertionError(e);
+                logger.warn("Hint dispatch was interrupted", e);
+                return Outcome.INTERRUPTED;
             }
 
             return timedOut ? Outcome.TIMEOUT : outcome;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsService.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsService.java 
b/src/java/org/apache/cassandra/hints/HintsService.java
index 5a32786..9cd4ed3 100644
--- a/src/java/org/apache/cassandra/hints/HintsService.java
+++ b/src/java/org/apache/cassandra/hints/HintsService.java
@@ -287,10 +287,11 @@ public final class HintsService implements HintsServiceMBean
     /**
      * Cleans up hints-related state 

[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-04-19 Thread jjirsa
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5f644548
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5f644548
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5f644548

Branch: refs/heads/cassandra-3.11
Commit: 5f6445480341fbcbf15cdf36f4dda5f1b1a93102
Parents: 9c54d02 5089e74
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:58:02 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:58:45 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/CHANGES.txt
--
diff --cc CHANGES.txt
index 1757266,e55d4cb..92ecb39
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,33 -1,8 +1,34 @@@
 -3.0.14
 - * Handling partially written hint files (CASSANDRA-12728) 
 +3.11.0
 + * V5 protocol flags decoding broken (CASSANDRA-13443)
 + * Use write lock not read lock for removing sstables from compaction 
strategies. (CASSANDRA-13422)
 + * Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors 
(CASSANDRA-13329)
 + * Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
 + * Add charset to Analyser input stream (CASSANDRA-13151)
 + * Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820)
 + * cdc column addition strikes again (CASSANDRA-13382)
 + * Fix static column indexes (CASSANDRA-13277)
 + * DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298)
 + * unittest CipherFactoryTest failed on MacOS (CASSANDRA-13370)
 + * Forbid SELECT restrictions and CREATE INDEX over non-frozen UDT columns 
(CASSANDRA-13247)
 + * Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern 
(CASSANDRA-13317)
 + * Possible AssertionError in UnfilteredRowIteratorWithLowerBound 
(CASSANDRA-13366)
 + * Support unaligned memory access for AArch64 (CASSANDRA-13326)
 + * Improve SASI range iterator efficiency on intersection with an empty range 
(CASSANDRA-12915).
 + * Fix equality comparisons of columns using the duration type 
(CASSANDRA-13174)
 + * Obfuscate password in stress-graphs (CASSANDRA-12233)
 + * Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
 + * nodetool stopdaemon errors out (CASSANDRA-13030)
 + * Tables in system_distributed should not use gcgs of 0 (CASSANDRA-12954)
 + * Fix primary index calculation for SASI (CASSANDRA-12910)
 + * More fixes to the TokenAllocator (CASSANDRA-12990)
 + * NoReplicationTokenAllocator should work with zero replication factor 
(CASSANDRA-12983)
 + * Address message coalescing regression (CASSANDRA-12676)
 + * Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
 + * Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
 +Merged from 3.0:
+  * Interrupt replaying hints on decommission (CASSANDRA-13308)
 -
 -3.0.13
 + * Handling partially written hint files (CASSANDRA-12728)
 + * Fix NPE issue in StorageService (CASSANDRA-13060)
   * Make reading of range tombstones more reliable (CASSANDRA-12811)
   * Fix startup problems due to schema tables not completely flushed 
(CASSANDRA-12213)
   * Fix view builder bug that can filter out data on restart (CASSANDRA-13405)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --cc src/java/org/apache/cassandra/hints/HintsDispatcher.java
index 3ac77a3,351b3fa..c432553
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@@ -26,9 -26,11 +26,11 @@@ import java.util.function.BooleanSuppli
  import java.util.function.Function;
  
  import com.google.common.util.concurrent.RateLimiter;
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
  
 -import org.apache.cassandra.config.DatabaseDescriptor;
 -import org.apache.cassandra.gms.FailureDetector;
 +import org.apache.cassandra.exceptions.RequestFailureReason;
 +import org.apache.cassandra.metrics.HintsServiceMetrics;
  import org.apache.cassandra.net.IAsyncCallbackWithFailure;
  import org.apache.cassandra.net.MessageIn;
  import 

[1/6] cassandra git commit: Interrupt replaying hints on decommission

2017-04-19 Thread jjirsa
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 3110d27dd -> 5089e74ef
  refs/heads/cassandra-3.11 9c54d02f7 -> 5f6445480
  refs/heads/trunk 08c216d12 -> 9b1295e41


Interrupt replaying hints on decommission

Patch by Jeff Jirsa; Reviewed by Aleksey Yeschenko for CASSANDRA-13308


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5089e74e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5089e74e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5089e74e

Branch: refs/heads/cassandra-3.0
Commit: 5089e74ef4a0eaeb1c439d57f074de1c496421f2
Parents: 3110d27
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:26:02 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:57:45 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 918c46b..e55d4cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 3.0.14
  * Handling partially written hint files (CASSANDRA-12728) 
+ * Interrupt replaying hints on decommission (CASSANDRA-13308)
 
 3.0.13
  * Make reading of range tombstones more reliable (CASSANDRA-12811)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java 
b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
index 333232d..58b30bd 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
@@ -117,6 +117,14 @@ final class HintsDispatchExecutor
 }
 }
 
+void interruptDispatch(UUID hostId)
+{
+Future future = scheduledDispatches.remove(hostId);
+
+if (null != future)
+future.cancel(true);
+}
+
 private final class TransferHintsTask implements Runnable
 {
 private final HintsCatalog catalog;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatcher.java 
b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
index d7a3515..351b3fa 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@ -26,6 +26,8 @@ import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.function.Function;
 
 import com.google.common.util.concurrent.RateLimiter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.gms.FailureDetector;
@@ -42,6 +44,8 @@ import org.apache.cassandra.utils.concurrent.SimpleCondition;
  */
 final class HintsDispatcher implements AutoCloseable
 {
+private static final Logger logger = 
LoggerFactory.getLogger(HintsDispatcher.class);
+
 private enum Action { CONTINUE, ABORT }
 
 private final HintsReader reader;
@@ -181,7 +185,7 @@ final class HintsDispatcher implements AutoCloseable
 
 private static final class Callback implements IAsyncCallbackWithFailure
 {
-enum Outcome { SUCCESS, TIMEOUT, FAILURE }
+enum Outcome { SUCCESS, TIMEOUT, FAILURE, INTERRUPTED }
 
 private final long start = System.nanoTime();
 private final SimpleCondition condition = new SimpleCondition();
@@ -198,7 +202,8 @@ final class HintsDispatcher implements AutoCloseable
 }
 catch (InterruptedException e)
 {
-throw new AssertionError(e);
+logger.warn("Hint dispatch was interrupted", e);
+return Outcome.INTERRUPTED;
 }
 
 return timedOut ? Outcome.TIMEOUT : outcome;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsService.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsService.java 
b/src/java/org/apache/cassandra/hints/HintsService.java
index 5a32786..9cd4ed3 100644
--- a/src/java/org/apache/cassandra/hints/HintsService.java

[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-04-19 Thread jjirsa
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9b1295e4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9b1295e4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9b1295e4

Branch: refs/heads/trunk
Commit: 9b1295e419b93a194b2270e5b31b689d3ab05dd2
Parents: 08c216d 5f64454
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:58:55 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:59:29 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9b1295e4/CHANGES.txt
--
diff --cc CHANGES.txt
index 13df7e6,92ecb39..c742570
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -80,7 -24,9 +80,8 @@@
   * NoReplicationTokenAllocator should work with zero replication factor 
(CASSANDRA-12983)
   * Address message coalescing regression (CASSANDRA-12676)
   * Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
 - * Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
  Merged from 3.0:
+  * Interrupt replaying hints on decommission (CASSANDRA-13308)
   * Handling partially written hint files (CASSANDRA-12728)
   * Fix NPE issue in StorageService (CASSANDRA-13060)
   * Make reading of range tombstones more reliable (CASSANDRA-12811)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9b1295e4/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9b1295e4/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --cc src/java/org/apache/cassandra/hints/HintsDispatcher.java
index 4a08540,c432553..323eeb1
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@@ -26,8 -26,9 +26,10 @@@ import java.util.function.BooleanSuppli
  import java.util.function.Function;
  
  import com.google.common.util.concurrent.RateLimiter;
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
  
 +import org.apache.cassandra.db.monitoring.ApproximateTime;
  import org.apache.cassandra.exceptions.RequestFailureReason;
  import org.apache.cassandra.metrics.HintsServiceMetrics;
  import org.apache.cassandra.net.IAsyncCallbackWithFailure;



[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-04-19 Thread jjirsa
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5f644548
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5f644548
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5f644548

Branch: refs/heads/trunk
Commit: 5f6445480341fbcbf15cdf36f4dda5f1b1a93102
Parents: 9c54d02 5089e74
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:58:02 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:58:45 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/CHANGES.txt
--
diff --cc CHANGES.txt
index 1757266,e55d4cb..92ecb39
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,33 -1,8 +1,34 @@@
 -3.0.14
 - * Handling partially written hint files (CASSANDRA-12728) 
 +3.11.0
 + * V5 protocol flags decoding broken (CASSANDRA-13443)
 + * Use write lock not read lock for removing sstables from compaction 
strategies. (CASSANDRA-13422)
 + * Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors 
(CASSANDRA-13329)
 + * Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
 + * Add charset to Analyser input stream (CASSANDRA-13151)
 + * Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820)
 + * cdc column addition strikes again (CASSANDRA-13382)
 + * Fix static column indexes (CASSANDRA-13277)
 + * DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298)
 + * unittest CipherFactoryTest failed on MacOS (CASSANDRA-13370)
 + * Forbid SELECT restrictions and CREATE INDEX over non-frozen UDT columns 
(CASSANDRA-13247)
 + * Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern 
(CASSANDRA-13317)
 + * Possible AssertionError in UnfilteredRowIteratorWithLowerBound 
(CASSANDRA-13366)
 + * Support unaligned memory access for AArch64 (CASSANDRA-13326)
 + * Improve SASI range iterator efficiency on intersection with an empty range 
(CASSANDRA-12915).
 + * Fix equality comparisons of columns using the duration type 
(CASSANDRA-13174)
 + * Obfuscate password in stress-graphs (CASSANDRA-12233)
 + * Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
 + * nodetool stopdaemon errors out (CASSANDRA-13030)
 + * Tables in system_distributed should not use gcgs of 0 (CASSANDRA-12954)
 + * Fix primary index calculation for SASI (CASSANDRA-12910)
 + * More fixes to the TokenAllocator (CASSANDRA-12990)
 + * NoReplicationTokenAllocator should work with zero replication factor 
(CASSANDRA-12983)
 + * Address message coalescing regression (CASSANDRA-12676)
 + * Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
 + * Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
 +Merged from 3.0:
+  * Interrupt replaying hints on decommission (CASSANDRA-13308)
 -
 -3.0.13
 + * Handling partially written hint files (CASSANDRA-12728)
 + * Fix NPE issue in StorageService (CASSANDRA-13060)
   * Make reading of range tombstones more reliable (CASSANDRA-12811)
   * Fix startup problems due to schema tables not completely flushed 
(CASSANDRA-12213)
   * Fix view builder bug that can filter out data on restart (CASSANDRA-13405)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5f644548/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --cc src/java/org/apache/cassandra/hints/HintsDispatcher.java
index 3ac77a3,351b3fa..c432553
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@@ -26,9 -26,11 +26,11 @@@ import java.util.function.BooleanSuppli
  import java.util.function.Function;
  
  import com.google.common.util.concurrent.RateLimiter;
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
  
 -import org.apache.cassandra.config.DatabaseDescriptor;
 -import org.apache.cassandra.gms.FailureDetector;
 +import org.apache.cassandra.exceptions.RequestFailureReason;
 +import org.apache.cassandra.metrics.HintsServiceMetrics;
  import org.apache.cassandra.net.IAsyncCallbackWithFailure;
  import org.apache.cassandra.net.MessageIn;
  import 

[2/6] cassandra git commit: Interrupt replaying hints on decommission

2017-04-19 Thread jjirsa
Interrupt replaying hints on decommission

Patch by Jeff Jirsa; Reviewed by Aleksey Yeschenko for CASSANDRA-13308


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5089e74e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5089e74e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5089e74e

Branch: refs/heads/cassandra-3.11
Commit: 5089e74ef4a0eaeb1c439d57f074de1c496421f2
Parents: 3110d27
Author: Jeff Jirsa 
Authored: Wed Apr 19 08:26:02 2017 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 19 08:57:45 2017 -0700

--
 CHANGES.txt  |  1 +
 .../apache/cassandra/hints/HintsDispatchExecutor.java|  8 
 src/java/org/apache/cassandra/hints/HintsDispatcher.java |  9 +++--
 src/java/org/apache/cassandra/hints/HintsService.java| 11 ++-
 4 files changed, 22 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 918c46b..e55d4cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 3.0.14
  * Handling partially written hint files (CASSANDRA-12728) 
+ * Interrupt replaying hints on decommission (CASSANDRA-13308)
 
 3.0.13
  * Make reading of range tombstones more reliable (CASSANDRA-12811)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java 
b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
index 333232d..58b30bd 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java
@@ -117,6 +117,14 @@ final class HintsDispatchExecutor
 }
 }
 
+void interruptDispatch(UUID hostId)
+{
+Future future = scheduledDispatches.remove(hostId);
+
+if (null != future)
+future.cancel(true);
+}
+
 private final class TransferHintsTask implements Runnable
 {
 private final HintsCatalog catalog;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsDispatcher.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsDispatcher.java 
b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
index d7a3515..351b3fa 100644
--- a/src/java/org/apache/cassandra/hints/HintsDispatcher.java
+++ b/src/java/org/apache/cassandra/hints/HintsDispatcher.java
@@ -26,6 +26,8 @@ import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.function.Function;
 
 import com.google.common.util.concurrent.RateLimiter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.gms.FailureDetector;
@@ -42,6 +44,8 @@ import org.apache.cassandra.utils.concurrent.SimpleCondition;
  */
 final class HintsDispatcher implements AutoCloseable
 {
+private static final Logger logger = 
LoggerFactory.getLogger(HintsDispatcher.class);
+
 private enum Action { CONTINUE, ABORT }
 
 private final HintsReader reader;
@@ -181,7 +185,7 @@ final class HintsDispatcher implements AutoCloseable
 
 private static final class Callback implements IAsyncCallbackWithFailure
 {
-enum Outcome { SUCCESS, TIMEOUT, FAILURE }
+enum Outcome { SUCCESS, TIMEOUT, FAILURE, INTERRUPTED }
 
 private final long start = System.nanoTime();
 private final SimpleCondition condition = new SimpleCondition();
@@ -198,7 +202,8 @@ final class HintsDispatcher implements AutoCloseable
 }
 catch (InterruptedException e)
 {
-throw new AssertionError(e);
+logger.warn("Hint dispatch was interrupted", e);
+return Outcome.INTERRUPTED;
 }
 
 return timedOut ? Outcome.TIMEOUT : outcome;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5089e74e/src/java/org/apache/cassandra/hints/HintsService.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsService.java 
b/src/java/org/apache/cassandra/hints/HintsService.java
index 5a32786..9cd4ed3 100644
--- a/src/java/org/apache/cassandra/hints/HintsService.java
+++ b/src/java/org/apache/cassandra/hints/HintsService.java
@@ -287,10 +287,11 @@ public final class HintsService implements 
HintsServiceMBean
 /**
  * Cleans up hints-related 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974885#comment-15974885
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/19/17 3:14 PM:
--

No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It's been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with 
more parallelism, and hopefully that will create a more acceptable turnaround 
time. I will send an update whenever it is complete.  
https://circleci.com/gh/christian-esken/cassandra/3


was (Author: cesken):
No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It's been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with 
more parallelism and will send an update whenever it is complete. Hopefully 
that will create a more acceptable turnaround time. 
https://circleci.com/gh/christian-esken/cassandra/3

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate with the other nodes. This can happen at any time, during peak 
> load or low load. Restarting that single node fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a 
> certain amount of queued messages, it starts thrashing itself to death. Each 
> of the threads fully locks the queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can a thread progress with 
> actually writing to the queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the queue.
> This means: writing blocks the queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  
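
For illustration, a minimal, self-contained sketch (not the actual Cassandra 
code) of the contention pattern described above: {{LinkedBlockingQueue}}'s 
iterator takes both the put and take locks on every {{next()}} and 
{{remove()}}, so hundreds of threads running a scan like this serialize all 
reads and writes on the backlog.

{code}
import java.util.Iterator;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ExpirationContentionSketch
{
    // Hypothetical stand-in for OutboundTcpConnection$QueuedMessage
    static final class QueuedMessage
    {
        final long expiresAtNanos;
        QueuedMessage(long expiresAtNanos) { this.expiresAtNanos = expiresAtNanos; }
    }

    // Every next()/remove() below fully locks the queue (both the put and the
    // take lock), so many threads scanning a ~260k-element backlog this way
    // block writers and readers alike.
    static void expireMessages(LinkedBlockingQueue<QueuedMessage> backlog)
    {
        long now = System.nanoTime();
        for (Iterator<QueuedMessage> it = backlog.iterator(); it.hasNext(); )
        {
            if (it.next().expiresAtNanos < now)
                it.remove();
        }
    }

    public static void main(String[] args)
    {
        LinkedBlockingQueue<QueuedMessage> backlog = new LinkedBlockingQueue<>();
        backlog.add(new QueuedMessage(System.nanoTime() - TimeUnit.SECONDS.toNanos(1)));
        expireMessages(backlog);
        System.out.println("remaining: " + backlog.size()); // 0, the message expired
    }
}
{code}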



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974885#comment-15974885
 ] 

Christian Esken commented on CASSANDRA-13265:
-

No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It's been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with 
more parallelism and will send an update whenever it is complete. Hopefully 
that will create a more acceptable turnaround time. 
https://circleci.com/gh/christian-esken/cassandra/3

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate with the other nodes. This can happen at any time, during peak 
> load or low load. Restarting that single node fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a 
> certain amount of queued messages, it starts thrashing itself to death. Each 
> of the threads fully locks the queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can a thread progress with 
> actually writing to the queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the queue.
> This means: writing blocks the queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974873#comment-15974873
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

Exactly.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in its accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS Read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Lamport's “Paxos Made Simple” paper, section 2.3, talks about this issue: how 
> learners can find out whether a majority of the acceptors have accepted a 
> proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor but not a majority has something in 
> flight, we have no way of knowing whether it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority saying 
> that nothing was accepted by a majority. I think we should run a propose step 
> here with an empty commit, and that will cause the write from step 1 to never 
> become visible. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or we will never see it, which is what we want. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974834#comment-15974834
 ] 

Jonathan Ellis edited comment on CASSANDRA-12126 at 4/19/17 2:56 PM:
-

I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

is broken, because to go from Nothing to Something [in a linearized system] 
there needs to be a write in between.


was (Author: jbellis):
I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

is broken, because to go from Nothing to Something there needs to be a write in 
between.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in its accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS Read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Lamport's “Paxos Made Simple” paper, section 2.3, talks about this issue: how 
> learners can find out whether a majority of the acceptors have accepted a 
> proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor but not a majority has something in 
> flight, we have no way of knowing whether it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority saying 
> that nothing was accepted by a majority. I think we should run a propose step 
> here with an empty commit, and that will cause the write from step 1 to never 
> become visible. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or we will never see it, which is what we want. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974834#comment-15974834
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

is broken, because to go from Nothing to Something there needs to be a write in 
between.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in its accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS Read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> Lamport's “Paxos Made Simple” paper, section 2.3, talks about this issue: how 
> learners can find out whether a majority of the acceptors have accepted a 
> proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor but not a majority has something in 
> flight, we have no way of knowing whether it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority saying 
> that nothing was accepted by a majority. I think we should run a propose step 
> here with an empty commit, and that will cause the write from step 1 to never 
> become visible. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or we will never see it, which is what we want. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] cassandra git commit: Fix cqlsh automatic protocol downgrade regression Patch by Matt Byrd; reviewed by Mick Semb Wever for CASSANDRA-13307

2017-04-19 Thread mck
Repository: cassandra
Updated Branches:
  refs/heads/trunk e52420624 -> 08c216d12


Fix cqlsh automatic protocol downgrade regression
Patch by Matt Byrd; reviewed by Mick Semb Wever for CASSANDRA-13307


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9c54d02f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9c54d02f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9c54d02f

Branch: refs/heads/trunk
Commit: 9c54d02f73245d3a9a05d37f7d0002421abb852f
Parents: 65c1fdd
Author: Matt Byrd 
Authored: Wed Mar 8 13:55:01 2017 -0800
Committer: Mick Semb Wever 
Committed: Wed Apr 19 16:15:37 2017 +1000

--
 CHANGES.txt  |  1 +
 bin/cqlsh.py | 19 +--
 2 files changed, 14 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c54d02f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 19d8162..1757266 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -24,6 +24,7 @@
  * NoReplicationTokenAllocator should work with zero replication factor 
(CASSANDRA-12983)
  * Address message coalescing regression (CASSANDRA-12676)
  * Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
+ * Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
 Merged from 3.0:
  * Handling partially written hint files (CASSANDRA-12728)
  * Fix NPE issue in StorageService (CASSANDRA-13060)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c54d02f/bin/cqlsh.py
--
diff --git a/bin/cqlsh.py b/bin/cqlsh.py
index 2387342..e765dee 100644
--- a/bin/cqlsh.py
+++ b/bin/cqlsh.py
@@ -178,7 +178,6 @@ from cqlshlib.util import get_file_encoding_bomsize, 
trim_if_present
 DEFAULT_HOST = '127.0.0.1'
 DEFAULT_PORT = 9042
 DEFAULT_SSL = False
-DEFAULT_PROTOCOL_VERSION = 4
 DEFAULT_CONNECT_TIMEOUT_SECONDS = 5
 DEFAULT_REQUEST_TIMEOUT_SECONDS = 10
 
@@ -223,6 +222,9 @@ parser.add_option('--cqlversion', default=None,
   help='Specify a particular CQL version, '
'by default the highest version supported by the server 
will be used.'
' Examples: "3.0.3", "3.1.0"')
+parser.add_option("--protocol-version", type="int", default=None,
+  help='Specify a specific protcol version otherwise the 
client will default and downgrade as necessary')
+
 parser.add_option("-e", "--execute", help='Execute the statement and quit.')
 parser.add_option("--connect-timeout", 
default=DEFAULT_CONNECT_TIMEOUT_SECONDS, dest='connect_timeout',
   help='Specify the connection timeout in seconds (default: 
%default seconds).')
@@ -449,7 +451,7 @@ class Shell(cmd.Cmd):
  ssl=False,
  single_statement=None,
  request_timeout=DEFAULT_REQUEST_TIMEOUT_SECONDS,
- protocol_version=DEFAULT_PROTOCOL_VERSION,
+ protocol_version=None,
  connect_timeout=DEFAULT_CONNECT_TIMEOUT_SECONDS):
 cmd.Cmd.__init__(self, completekey=completekey)
 self.hostname = hostname
@@ -468,13 +470,16 @@ class Shell(cmd.Cmd):
 if use_conn:
 self.conn = use_conn
 else:
+kwargs = {}
+if protocol_version is not None:
+kwargs['protocol_version'] = protocol_version
 self.conn = Cluster(contact_points=(self.hostname,), 
port=self.port, cql_version=cqlver,
-protocol_version=protocol_version,
 auth_provider=self.auth_provider,
 ssl_options=sslhandling.ssl_settings(hostname, 
CONFIG_FILE) if ssl else None,
 
load_balancing_policy=WhiteListRoundRobinPolicy([self.hostname]),
 control_connection_timeout=connect_timeout,
-connect_timeout=connect_timeout)
+connect_timeout=connect_timeout,
+**kwargs)
 self.owns_connection = not use_conn
 
 if keyspace:
@@ -1673,9 +1678,9 @@ class Shell(cmd.Cmd):
 
 direction = parsed.get_binding('dir').upper()
 if direction == 'FROM':
-task = ImportTask(self, ks, table, columns, fname, opts, 
DEFAULT_PROTOCOL_VERSION, CONFIG_FILE)
+task = ImportTask(self, ks, table, columns, fname, opts, 
self.conn.protocol_version, CONFIG_FILE)
 elif direction == 'TO':
-task = ExportTask(self, ks, table, columns, fname, opts, 
DEFAULT_PROTOCOL_VERSION, CONFIG_FILE)
+task = ExportTask(self, ks, table, columns, 

[2/2] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-04-19 Thread mck
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/08c216d1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/08c216d1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/08c216d1

Branch: refs/heads/trunk
Commit: 08c216d125e5c8ed33a3403cde185f4e84d31895
Parents: e524206 9c54d02
Author: Mick Semb Wever 
Authored: Thu Apr 20 00:44:52 2017 +1000
Committer: Mick Semb Wever 
Committed: Thu Apr 20 00:44:52 2017 +1000

--

--




[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974671#comment-15974671
 ] 

Ariel Weisberg commented on CASSANDRA-13265:


Sorry, I just had a really busy week last week, and I've been trying to get 
Circle to the point where it can run the dtests. I'm mostly there; it's just a 
few failing tests that remain.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate with the other nodes. This can happen at any time, during peak 
> load or low load. Restarting that single node fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a 
> certain amount of queued messages, it starts thrashing itself to death. Each 
> of the threads fully locks the queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can a thread progress with 
> actually writing to the queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the queue.
> This means: writing blocks the queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission

2017-04-19 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13308:
--
Status: Ready to Commit  (was: Patch Available)

> Gossip breaks, Hint files not being deleted on nodetool decommission
> 
>
> Key: CASSANDRA-13308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Using Cassandra version 3.0.9
>Reporter: Arijit
>Assignee: Jeff Jirsa
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 28207.stack, logs, logs_decommissioned_node
>
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a 
> ton of hints. Start Cassandra on the node and immediately run "nodetool 
> decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other 
> nodes do not seem to see this message. "nodetool status" shows the 
> decommissioned node in state "UL" on all other nodes (it is also present in 
> system.peers), and Cassandra logs show that gossip tasks on nodes are not 
> proceeding (number of pending tasks keeps increasing). Jstack suggests that a 
> gossip task is blocked on hints dispatch (I can provide traces if this is not 
> obvious). Because the cluster is large and there are a lot of hints, this is 
> taking a while. 
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint 
> files for the decommissioned node. Documentation seems to suggest that these 
> hints should be deleted during "nodetool decommission", but it does not seem 
> to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, 
> the hints dispatcher threads throw a bunch of exceptions and the 
> decommissioned node is now in state "DL" (perhaps it missed some gossip 
> messages?). The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the 
> node remains in the peers table). In fact, after this point the 
> decommissioned node is in state "DN".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission

2017-04-19 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974665#comment-15974665
 ] 

Aleksey Yeschenko commented on CASSANDRA-13308:
---

- {{HintsDispatchExecutor.interruptDispatch()}} only uses the {{hostId}} field 
of the passed {{HintsStore}} instance, so we might as well just pass the host 
id directly
- in the same method, you should replace the {{scheduledDispatches.get()}} call 
with a call to {{remove()}}, thus eliminating a redundant {{remove()}} later 
down the line.
- no need for the racy {{isDone()}} check either, it doesn't save us anything

So, ultimately, just

{code}
void interruptDispatch(UUID hostId)
{
Future future = scheduledDispatches.remove(hostId);
if (null != future)
future.cancel(true);
}
{code}

should be enough.

But these are nits, can address on commit. LGTM overall, +1.
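
As a side note, here is a minimal, self-contained sketch (not the patch itself) 
of the mechanism this relies on: {{cancel(true)}} interrupts the thread running 
the dispatch task, and with the {{HintsDispatcher}} change the blocked callback 
turns that interrupt into {{Outcome.INTERRUPTED}} instead of throwing an 
{{AssertionError}}.

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CancelInterruptSketch
{
    public static void main(String[] args) throws Exception
    {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Stand-in for a hint dispatch task blocked waiting on replica acks
        Future<?> dispatch = executor.submit(() -> {
            try
            {
                new CountDownLatch(1).await(); // blocks until interrupted
            }
            catch (InterruptedException e)
            {
                // the patched dispatcher maps this to Outcome.INTERRUPTED
                System.out.println("dispatch interrupted");
            }
        });
        Thread.sleep(100);     // let the task start blocking
        dispatch.cancel(true); // what interruptDispatch() does on decommission
        executor.shutdown();
    }
}
{code}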

> Gossip breaks, Hint files not being deleted on nodetool decommission
> 
>
> Key: CASSANDRA-13308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Using Cassandra version 3.0.9
>Reporter: Arijit
>Assignee: Jeff Jirsa
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 28207.stack, logs, logs_decommissioned_node
>
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a 
> ton of hints. Start Cassandra on the node and immediately run "nodetool 
> decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other 
> nodes do not seem to see this message. "nodetool status" shows the 
> decommissioned node in state "UL" on all other nodes (it is also present in 
> system.peers), and Cassandra logs show that gossip tasks on nodes are not 
> proceeding (number of pending tasks keeps increasing). Jstack suggests that a 
> gossip task is blocked on hints dispatch (I can provide traces if this is not 
> obvious). Because the cluster is large and there are a lot of hints, this is 
> taking a while. 
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint 
> files for the decommissioned node. Documentation seems to suggest that these 
> hints should be deleted during "nodetool decommission", but it does not seem 
> to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, 
> the hints dispatcher threads throw a bunch of exceptions and the 
> decommissioned node is now in state "DL" (perhaps it missed some gossip 
> messages?). The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the 
> node remains in the peers table). In fact, after this point the 
> decommissioned node is in state "DN".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-04-19 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974643#comment-15974643
 ] 

Paulo Motta commented on CASSANDRA-13397:
-

Tests look good, but there was a minor conflict when merging to trunk, so I 
will submit a new CI round with the trunk patch:

||trunk||
|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-13397]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13397-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13397-dtest/lastCompletedBuild/testReport/]|


> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into repair code, I realize that we should check return value 
> of CountDownLatch.await(). Most of the places that we don't check the return 
> value, nothing bad would happen due to other protection. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
> CountDownLatch latch = new CountDownLatch(2);
> latch.countDown();
> new Thread(() -> {
> try {
> Thread.sleep(1200);
> } catch (InterruptedException e) {
> System.err.println("interrupted");
> }
> latch.countDown();
> System.out.println("counted down");
> }).start();
> latch.await(1, TimeUnit.SECONDS);
> if (latch.getCount() > 0) {
> System.err.println("failed");
> } else {
> System.out.println("success");
> }
> }
> {code}
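
For reference, a sketch of acting directly on the boolean returned by 
{{await()}} instead of re-checking {{getCount()}} (illustrative only, not the 
committed patch):

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchAwaitSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        CountDownLatch latch = new CountDownLatch(2);
        latch.countDown();
        new Thread(() -> {
            try { Thread.sleep(1200); } catch (InterruptedException e) { }
            latch.countDown();
        }).start();
        // await() returns false on timeout; checking the return value avoids
        // the separate (and racy) getCount() check in the original example
        if (!latch.await(1, TimeUnit.SECONDS))
            System.err.println("failed");
        else
            System.out.println("success");
    }
}
{code}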



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13275) Cassandra throws an exception during CQL select query filtering on map key

2017-04-19 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973113#comment-15973113
 ] 

Alex Petrov edited comment on CASSANDRA-13275 at 4/19/17 1:21 PM:
--

Partition key filtering was introduced in [CASSANDRA-11031]. Although 
{{CONTAINS}} didn't trigger filtering, the read path was still trying to 
convert the {{CONTAINS}} restriction to bounds.

|[3.11|https://github.com/apache/cassandra/compare/3.11...ifesdjeen:13275-3.11]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-3.11-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-3.11-dtest/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13275-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-trunk-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-trunk-dtest/]|
|[dtest 
branch|https://github.com/riptano/cassandra-dtest/compare/master...ifesdjeen:13275-master]|

This is not applicable to 3.0 since we do not allow partition key filtering 
there.


was (Author: ifesdjeen):
Partition key filtering was introduced in [CASSANDRA-11031]. Although 
{{CONTAINS}} didn't trigger filtering, the read path was still trying to 
convert the {{CONTAINS}} restriction to bounds.

|[3.11|https://github.com/apache/cassandra/compare/3.11...ifesdjeen:13275-3.11]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-3.11-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-3.11-dtest/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13275-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-trunk-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13275-trunk-dtest/]|

> Cassandra throws an exception during CQL select query filtering on map key 
> ---
>
> Key: CASSANDRA-13275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13275
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Abderrahmane CHRAIBI
>Assignee: Alex Petrov
>
> Env: cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4
> Using this table structure:
> {code}CREATE TABLE mytable (
> mymap frozen>> PRIMARY KEY
> )
> {code}
> Executing:
> {code} select * from mytable where mymap contains key UUID;
> {code}
> Within cqlsh shows this message:
> {code}
> ServerError: java.lang.UnsupportedOperationException
> system.log:
> java.lang.UnsupportedOperationException: null
> at 
> org.apache.cassandra.cql3.restrictions.SingleColumnRestriction$ContainsRestriction.appendTo(SingleColumnRestriction.java:456)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.restrictions.PartitionKeySingleRestrictionSet.values(PartitionKeySingleRestrictionSet.java:86)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:585)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:474)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.getQuery(SelectStatement.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:227)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:219) 
> ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:204) 
> ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>  [apache-cassandra-3.9.jar:3.9]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>  [apache-cassandra-3.9.jar:3.9]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>  

[jira] [Commented] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974597#comment-15974597
 ] 

Sylvain Lebresne commented on CASSANDRA-13462:
--

bq. Is this behaviour documented somewhere?

Well, "somewhere" includes a lot of places, but it's admitedly not in the 
official doc, which is light on details about how the different types compare 
exactly. Contributions to the doc are welcome.

With that said, I don't think that's terribly important because UUID (unless 
they are TimeUUID, in which case they do sort in the most useful way), are 
usually randomly generated and so I'm not sure how they sort matters much.

bq. I think it would be quite a feat to find someone who was relying on this 
behaviour!

To clarify, when I say that changing that would break users, I'm actually not 
talking about users relying on any particular ordering in their result set, 
even though it would indeed break that and is not worth the trouble for that 
reason alone (and by the way, the reason the comparator checks the UUID version 
first is that it sorts time UUIDs (version 1) by their time component first, 
which can have its uses, so I wouldn't be as confident as you seem to be that 
no one relies on the current behavior). I'm talking about the fact that data is 
sorted on disk, and changing how any type sorts things is impossible without 
basically destroying existing data.
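
To illustrate the ordering discussed here, a rough sketch only (the real 
{{UUIDType}} comparator additionally sorts version 1 UUIDs by their time 
component, and the example values below are made up in the spirit of the 
report's example):

{code}
import java.util.UUID;

public class UuidOrderSketch
{
    // Rough illustration: compare the version nibble first, then the raw
    // 128 bits unsigned. NOT the exact UUIDType code.
    static int compareVersionFirst(UUID a, UUID b)
    {
        int byVersion = Integer.compare(a.version(), b.version());
        if (byVersion != 0)
            return byVersion;
        int msb = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return msb != 0 ? msb : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args)
    {
        UUID v3 = UUID.fromString("10000000-0000-3000-8000-000000000000");
        UUID v1 = UUID.fromString("20000000-0000-1000-8000-000000000000");
        // Prints a positive number: the version 3 UUID sorts after the
        // version 1 UUID despite its numerically smaller leading group.
        System.out.println(compareVersionFirst(v3, v1));
    }
}
{code}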


> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation first compares the UUID 
> version number, then the remaining values of the UUID.
> e.g. in C*
>  1000--3000--
>  is greater than 
> 2000--1000-- 
> (n.b. the 13th hex digit is the UUID version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974590#comment-15974590
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. For CHANGES.TXT the entry should go at the top of the list of entries for 
the version the change is for. I don't know why.
I also haven't seen this mentioned. Perhaps someone could add it to 
https://wiki.apache.org/cassandra/HowToContribute or 
http://cassandra.apache.org/doc/latest/development/how_to_commit.html . Anyhow, 
I have fixed that.

bq. set up with CircleCI [...] Also you transposed 13625 and 13265 
I changed the branches to correct the transposed 13625 and 13265. I didn't 
find the transposition anywhere other than in the branch names. I will try to 
find out how to do the CircleCI stuff. Meanwhile, here are the updated links:

https://github.com/christian-esken/cassandra/commits/cassandra-13265-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13265-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13265-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13265-trunk

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate with the other nodes. This can happen at any time, during peak 
> load or low load. Restarting that single node fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a 
> certain amount of queued messages, it starts thrashing itself to death. Each 
> of the threads fully locks the queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can a thread progress with 
> actually writing to the queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the queue.
> This means: writing blocks the queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  
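
For illustration, here is a minimal sketch of the contention pattern described 
above (simplified, assumed code, not the actual OutboundTcpConnection): every 
iterator step over a {{java.util.concurrent.LinkedBlockingQueue}} takes both of 
the queue's internal locks, so hundreds of threads scanning the backlog for 
expired messages starve both writers and readers.

{code:java}
import java.util.Iterator;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified sketch (not Cassandra's code) of the thrashing described above:
// many threads scan the shared backlog to drop expired messages, and
// LinkedBlockingQueue's iterator "fully locks" the queue (both the put lock
// and the take lock) on every next() and remove() call.
final class BacklogExpirationSketch
{
    static final class QueuedMessage
    {
        final long timestampNanos;
        QueuedMessage(long timestampNanos) { this.timestampNanos = timestampNanos; }
    }

    static final LinkedBlockingQueue<QueuedMessage> backlog = new LinkedBlockingQueue<>();

    // Called by every writer thread once the backlog is large: with N threads
    // and M queued messages this costs O(N * M) full-lock operations.
    static void expireMessages(long nowNanos, long timeoutNanos)
    {
        Iterator<QueuedMessage> it = backlog.iterator();
        while (it.hasNext())
        {
            QueuedMessage m = it.next();   // locks the whole queue
            if (nowNanos - m.timestampNanos > timeoutNanos)
                it.remove();               // locks the whole queue again
            else
                break;                     // the backlog is roughly FIFO
        }
    }
}
{code}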



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Andrew Jefferson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974559#comment-15974559
 ] 

Andrew Jefferson commented on CASSANDRA-13462:
--

Thanks for the quick reply. Is this behaviour documented somewhere?

I think it would be quite a feat to find someone who was relying on this 
behaviour!



> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation compares the uuid version 
> number first, then the remaining values of the uuid.
> e.g. in C*
> 10000000-0000-3000-0000-000000000000
> is greater than
> 20000000-0000-1000-0000-000000000000
> (n.b. the 13th hex digit is the uuid version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Andrew Jefferson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974557#comment-15974557
 ] 

Andrew Jefferson commented on CASSANDRA-13462:
--

cqlsh> select * from dev.testinguuids where pk=6 ORDER BY ck;

 pk | ck                                   | val
----+--------------------------------------+-----
  6 | 10000000-0000-0200-0000-000000000000 |   1
  6 | 20000000-0000-0200-0000-000000000000 |   1
  6 | 10000000-0000-1200-0000-000000000000 |   1
  6 | 20000000-0000-1200-0000-000000000000 |   1
  6 | 10000000-0000-2200-0000-000000000000 |   1
  6 | 20000000-0000-2200-0000-000000000000 |   1

> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation compares the uuid version 
> number first, then the remaining values of the uuid.
> e.g. in C*
> 10000000-0000-3000-0000-000000000000
> is greater than
> 20000000-0000-1000-0000-000000000000
> (n.b. the 13th hex digit is the uuid version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Andrew Jefferson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Jefferson updated CASSANDRA-13462:
-
Comment: was deleted

(was: cqlsh> select * from dev.testinguuids where pk=6 ORDER BY ck;

 pk | ck                                   | val
----+--------------------------------------+-----
  6 | 10000000-0000-0200-0000-000000000000 |   1
  6 | 20000000-0000-0200-0000-000000000000 |   1
  6 | 10000000-0000-1200-0000-000000000000 |   1
  6 | 20000000-0000-1200-0000-000000000000 |   1
  6 | 10000000-0000-2200-0000-000000000000 |   1
  6 | 20000000-0000-2200-0000-000000000000 |   1)

> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation compares the uuid version 
> number first, then the remaining values of the uuid.
> e.g. in C*
> 10000000-0000-3000-0000-000000000000
> is greater than
> 20000000-0000-1000-0000-000000000000
> (n.b. the 13th hex digit is the uuid version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-13462.
--
Resolution: Won't Fix

I'm sorry this isn't working as you expected, but this is the way it works (and 
has been working for years), and we can't change it without breaking every user 
that has ever used UUIDs, which is obviously out of the question.
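
For reference, a minimal sketch of a version-first comparison consistent with 
the ordering described above (simplified and assumed; the actual UUIDType 
additionally special-cases version-1 uuids by comparing their embedded 
timestamps):

{code:java}
import java.util.UUID;

// Simplified sketch (not the actual UUIDType code): order uuids first by the
// version nibble, then by the remaining bits, treated as unsigned.
final class VersionFirstUuidOrder
{
    static int compare(UUID a, UUID b)
    {
        int byVersion = Integer.compare(a.version(), b.version());
        if (byVersion != 0)
            return byVersion;
        int byMsb = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return byMsb != 0 ? byMsb
                          : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args)
    {
        UUID x = UUID.fromString("10000000-0000-3000-0000-000000000000"); // version 3
        UUID y = UUID.fromString("20000000-0000-1000-0000-000000000000"); // version 1
        // Prints a positive number: x sorts after y because version 3 > version 1,
        // even though x is the smaller 128-bit integer.
        System.out.println(compare(x, y));
    }
}
{code}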

> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation compares the uuid version 
> number first, then the remaining values of the uuid.
> e.g. in C*
> 10000000-0000-3000-0000-000000000000
> is greater than
> 20000000-0000-1000-0000-000000000000
> (n.b. the 13th hex digit is the uuid version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Andrew Jefferson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Jefferson updated CASSANDRA-13462:
-
Description: 
My expectation is that UUIDs should behave as 128-bit integers for comparison.

However, it seems that the Cassandra implementation compares the uuid version 
number first, then the remaining values of the uuid.

e.g. in C*

10000000-0000-3000-0000-000000000000

is greater than

20000000-0000-1000-0000-000000000000

(n.b. the 13th hex digit is the uuid version)

 - this is consistent across range queries and using ORDER BY 

  was:
My expectation is that UUIDs should behave as 128-bit integers for comparison.

However, it seems that the Cassandra implementation compares the uuid version 
number first, then the remaining values of the uuid.

e.g. in C*
10000000-0000-3000-0000-000000000000 is greater than 
20000000-0000-1000-0000-000000000000


I expect range queries / comparisons on UUIDs to work as though this were the 
case, but it does not. It seems to require the UUID to have certain properties 
for range queries to work properly:

```
create table dev.testinguuids ( pk int, ck uuid, val int, PRIMARY KEY ((pk), 
ck) );

insert into dev.testinguuids (pk, ck, val) VALUES (1, 
30000000-0000-0000-0000-000000000000, 1);

select * from dev.testinguuids where pk=1 and 
ck > 10000000-0000-0000-0000-000000000000;

 -> returns 1 row

select * from dev.testinguuids where pk=1 and 
ck > 00000000-0000-5000-0000-000000000000;

 -> returns 0 rows
```
After a bit of investigation of UUIDs, it works correctly for me if I force my 
query UUIDs to be of the form:

00000000-0000-05xx-0000-000000000000

i.e. 
select * from dev.testinguuids where pk=1 and 
ck > 00000000-0000-05xx-0000-000000000000;

works ok.

n.b. I have populated my table only with valid type 1 and type 4 uuids. In 
testing, if I create uuids of the form
00000000-0000-YYxx-0000-000000000000 where YY > 05,
then they behave differently as well:


```
# Insert a valid uuid
insert into dev.testinguuids (pk, ck, val) VALUES (2, 
30000000-0000-0000-0000-000000000000, 1);


insert into dev.testinguuids (pk, ck, val) VALUES (2, 
30000000-0000-0000-0000-000000000000, 1);
```


> Unexpected behaviour with range queries on UUIDs
> 
>
> Key: CASSANDRA-13462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Jefferson
>
> My expectation is that UUIDs should behave as 128-bit integers for comparison.
> However, it seems that the Cassandra implementation compares the uuid version 
> number first, then the remaining values of the uuid.
> e.g. in C*
> 10000000-0000-3000-0000-000000000000
> is greater than
> 20000000-0000-1000-0000-000000000000
> (n.b. the 13th hex digit is the uuid version)
>  - this is consistent across range queries and using ORDER BY 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13462) Unexpected behaviour with range queries on UUIDs

2017-04-19 Thread Andrew Jefferson (JIRA)
Andrew Jefferson created CASSANDRA-13462:


 Summary: Unexpected behaviour with range queries on UUIDs
 Key: CASSANDRA-13462
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13462
 Project: Cassandra
  Issue Type: Bug
Reporter: Andrew Jefferson


My expectation is that UUIDs should behave as 128-bit integers for comparison.

However, it seems that the Cassandra implementation compares the uuid version 
number first, then the remaining values of the uuid.

e.g. in C*
10000000-0000-3000-0000-000000000000 is greater than 
20000000-0000-1000-0000-000000000000


I expect range queries / comparisons on UUIDs to work as though this were the 
case, but it does not. It seems to require the UUID to have certain properties 
for range queries to work properly:

```
create table dev.testinguuids ( pk int, ck uuid, val int, PRIMARY KEY ((pk), 
ck) );

insert into dev.testinguuids (pk, ck, val) VALUES (1, 
30000000-0000-0000-0000-000000000000, 1);

select * from dev.testinguuids where pk=1 and 
ck > 10000000-0000-0000-0000-000000000000;

 -> returns 1 row

select * from dev.testinguuids where pk=1 and 
ck > 00000000-0000-5000-0000-000000000000;

 -> returns 0 rows
```
After a bit of investigation of UUIDs, it works correctly for me if I force my 
query UUIDs to be of the form:

00000000-0000-05xx-0000-000000000000

i.e. 
select * from dev.testinguuids where pk=1 and 
ck > 00000000-0000-05xx-0000-000000000000;

works ok.

n.b. I have populated my table only with valid type 1 and type 4 uuids. In 
testing, if I create uuids of the form
00000000-0000-YYxx-0000-000000000000 where YY > 05,
then they behave differently as well:


```
# Insert a valid uuid
insert into dev.testinguuids (pk, ck, val) VALUES (2, 
30000000-0000-0000-0000-000000000000, 1);


insert into dev.testinguuids (pk, ck, val) VALUES (2, 
30000000-0000-0000-0000-000000000000, 1);
```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13257) Add repair streaming preview

2017-04-19 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974353#comment-15974353
 ] 

Stefan Podkowinski commented on CASSANDRA-13257:


This is a new feature that should be covered in the docs and NEWS.txt.

> Add repair streaming preview
> 
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Streaming and Messaging
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that 
> needs to be done, without actually doing any streaming. Our main motivation 
> for having something like this is validating CASSANDRA-9143 in production, 
> but I’d imagine it could also be a useful tool in troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974342#comment-15974342
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

bq. What is the distinction you are proposing?

I'm not sure; I think we're not using the same definition of operation 
visibility. What I'm saying is that "if an operation has a visible outcome, 
then that outcome should be visible (by serial operations) to any subsequent 
operation (so, as soon as the operation returns to the client, if you will)". 
In particular, if a serial read follows a serial write (meaning it started 
after the write returned, even with a timeout), then if the write has any 
effect, the read should see it.

Note that when you get a timeout on the initial write, you don't know if the 
write has been applied or not, but the whole point of a serial read is to be 
able to unequivocally decide what that outcome was. If we can't guarantee that, 
if there is no way to observe whether a timed-out write has been applied or 
not, then I'm not sure how one would use LWT in the first place.


> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to the propose and saves the commit in its accepted field. The other two 
> machines B and C do not get to the accept phase. 
> The current state is that machine A has this commit in its paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This read behaves as if nothing were in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again, and was never seen before. 
> Section 2.3 of Lamport's "Paxos Made Simple" paper talks about this issue: 
> how learners can find out whether a majority of the acceptors have accepted a 
> proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors and some, but not a majority, of them have something in flight, we 
> have no way of knowing whether it was accepted by a majority. So this 
> behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means a majority agree that 
> nothing was accepted by a majority. I think we should run a propose step here 
> with an empty commit, which will cause the write from step 1 to never be 
> visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or we will never see it, which is what we want. 
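
For illustration, a rough sketch of the read-side rule proposed above 
(hypothetical, simplified types; not Cassandra's actual paxos code):

{code:java}
import java.util.List;
import java.util.UUID;

// Rough sketch (hypothetical types, not Cassandra's paxos implementation) of
// the proposed fix: a serial read that sees no in-flight accepted value on a
// majority proposes an empty commit at its own ballot, so that a value
// accepted only by a minority can never resurface later.
final class SerialReadSketch
{
    static final class PrepareResponse
    {
        final UUID acceptedBallot;   // null if nothing is in flight
        final Object acceptedValue;  // accepted but not yet committed value
        PrepareResponse(UUID acceptedBallot, Object acceptedValue)
        {
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    static Object serialRead(List<PrepareResponse> majorityResponses, UUID readBallot)
    {
        // Step 3 behavior: if any replica reports an in-flight accepted value,
        // finish that round first, since a majority may already have accepted it.
        for (PrepareResponse r : majorityResponses)
            if (r.acceptedValue != null)
                return proposeAndCommit(readBallot, r.acceptedValue);

        // Proposed fix for step 2: a majority reports nothing in flight, so
        // seal that observation with an empty proposal instead of doing nothing.
        proposeAndCommit(readBallot, null);
        return null;
    }

    static Object proposeAndCommit(UUID ballot, Object value)
    {
        // accept + commit phases elided in this sketch
        return value;
    }
}
{code}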



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13257) Add repair streaming preview

2017-04-19 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974337#comment-15974337
 ] 

Marcus Eriksson commented on CASSANDRA-13257:
-

+1

> Add repair streaming preview
> 
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Streaming and Messaging
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that 
> needs to be done, without actually doing any streaming. Our main motivation 
> for having something like this is validating CASSANDRA-9143 in production, 
> but I’d imagine it could also be a useful tool in troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13257) Add repair streaming preview

2017-04-19 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13257:

Status: Ready to Commit  (was: Patch Available)

> Add repair streaming preview
> 
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Streaming and Messaging
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that 
> needs to be done, without actually doing any streaming. Our main motivation 
> for having something like this is validating CASSANDRA-9143 in production, 
> but I’d imagine it could also be a useful tool in troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13430) Cleanup isIncremental/repairedAt usage

2017-04-19 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974330#comment-15974330
 ] 

Marcus Eriksson commented on CASSANDRA-13430:
-

+1

> Cleanup isIncremental/repairedAt usage
> --
>
> Key: CASSANDRA-13430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13430
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> Post CASSANDRA-9143, there's no longer a reason to pass around 
> {{isIncremental}} or {{repairedAt}} in streaming sessions, or in some places 
> in repair. The {{pendingRepair}} & {{repairedAt}} values should only be set 
> at the beginning/finalize stages of incremental repair and should just follow 
> sstables around as they're streamed. Keeping these values with the sstables 
> also fixes an edge case where you could leak repaired data back into 
> unrepaired if you run full and incremental repairs concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13430) Cleanup isIncremental/repairedAt usage

2017-04-19 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13430:

Status: Ready to Commit  (was: Patch Available)

> Cleanup isIncremental/repairedAt usage
> --
>
> Key: CASSANDRA-13430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13430
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> Post CASSANDRA-9143, there's no longer a reason to pass around 
> {{isIncremental}} or {{repairedAt}} in streaming sessions, or in some places 
> in repair. The {{pendingRepair}} & {{repairedAt}} values should only be set 
> at the beginning/finalize stages of incremental repair and should just follow 
> sstables around as they're streamed. Keeping these values with the sstables 
> also fixes an edge case where you could leak repaired data back into 
> unrepaired if you run full and incremental repairs concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.

2017-04-19 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-13307:

   Resolution: Fixed
Fix Version/s: 4.0
   Status: Resolved  (was: Ready to Commit)

Committed now to both the cassandra-3.11 branch and trunk.
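
The committed patch (the commit follows below) does this in cqlsh by passing 
{{protocol_version}} to the python driver only when the user sets it 
explicitly. For comparison, a sketch of the same pattern against the DataStax 
Java driver (assumed 3.x API; {{explicitProtocolVersion}} is a hypothetical 
option value):

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ProtocolVersion;

// Sketch of the same idea with the DataStax Java driver (assumed 3.x API):
// only pin the protocol version when the user asked for one; otherwise leave
// it unset so the driver can negotiate and downgrade as necessary.
final class ConnectSketch
{
    static Cluster connect(String host, int port, Integer explicitProtocolVersion)
    {
        Cluster.Builder builder = Cluster.builder()
                                         .addContactPoint(host)
                                         .withPort(port);
        if (explicitProtocolVersion != null)
            builder.withProtocolVersion(ProtocolVersion.fromInt(explicitProtocolVersion));
        return builder.build();
    }
}
{code}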

> The specification of protocol version in cqlsh means the python driver 
> doesn't automatically downgrade protocol version.
> 
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Matt Byrd
>Assignee: Matt Byrd
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 4.0, 3.11.x
>
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> In that we're no longer able to connect from newer cqlsh versions
> (e.g. trunk) to older versions of Cassandra that speak a lower version of the 
> protocol (e.g. 2.1 with protocol version 3).
> The problem seems to be that we're relying on the client's ability to 
> automatically downgrade the protocol version, implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, came when we implemented:
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> "Don't downgrade protocol version if explicitly set" 
> (included when we bumped the python driver from 3.5.0 to 3.7.0 as part of 
> fixing: https://issues.apache.org/jira/browse/CASSANDRA-11534),
> since we do explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which adds an option to explicitly specify the protocol 
> version (for those who want to do that) and otherwise defaults to not setting 
> the protocol version, i.e. using the protocol version of the client we ship, 
> which should by default match the server's. Then it should downgrade 
> gracefully as was intended. 
> Let me know if that seems reasonable.
> Thanks,
> Matt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


cassandra git commit: Fix cqlsh automatic protocol downgrade regression Patch by Matt Byrd; reviewed by Mick Semb Wever for CASSANDRA-13307

2017-04-19 Thread mck
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 65c1fddbc -> 9c54d02f7


Fix cqlsh automatic protocol downgrade regression
Patch by Matt Byrd; reviewed by Mick Semb Wever for CASSANDRA-13307


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9c54d02f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9c54d02f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9c54d02f

Branch: refs/heads/cassandra-3.11
Commit: 9c54d02f73245d3a9a05d37f7d0002421abb852f
Parents: 65c1fdd
Author: Matt Byrd 
Authored: Wed Mar 8 13:55:01 2017 -0800
Committer: Mick Semb Wever 
Committed: Wed Apr 19 16:15:37 2017 +1000

--
 CHANGES.txt  |  1 +
 bin/cqlsh.py | 19 +++++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c54d02f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 19d8162..1757266 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -24,6 +24,7 @@
  * NoReplicationTokenAllocator should work with zero replication factor 
(CASSANDRA-12983)
  * Address message coalescing regression (CASSANDRA-12676)
  * Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
+ * Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
 Merged from 3.0:
  * Handling partially written hint files (CASSANDRA-12728)
  * Fix NPE issue in StorageService (CASSANDRA-13060)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c54d02f/bin/cqlsh.py
--
diff --git a/bin/cqlsh.py b/bin/cqlsh.py
index 2387342..e765dee 100644
--- a/bin/cqlsh.py
+++ b/bin/cqlsh.py
@@ -178,7 +178,6 @@ from cqlshlib.util import get_file_encoding_bomsize, 
trim_if_present
 DEFAULT_HOST = '127.0.0.1'
 DEFAULT_PORT = 9042
 DEFAULT_SSL = False
-DEFAULT_PROTOCOL_VERSION = 4
 DEFAULT_CONNECT_TIMEOUT_SECONDS = 5
 DEFAULT_REQUEST_TIMEOUT_SECONDS = 10
 
@@ -223,6 +222,9 @@ parser.add_option('--cqlversion', default=None,
   help='Specify a particular CQL version, '
'by default the highest version supported by the server 
will be used.'
' Examples: "3.0.3", "3.1.0"')
+parser.add_option("--protocol-version", type="int", default=None,
+  help='Specify a specific protocol version otherwise the 
client will default and downgrade as necessary')
+
 parser.add_option("-e", "--execute", help='Execute the statement and quit.')
 parser.add_option("--connect-timeout", 
default=DEFAULT_CONNECT_TIMEOUT_SECONDS, dest='connect_timeout',
   help='Specify the connection timeout in seconds (default: 
%default seconds).')
@@ -449,7 +451,7 @@ class Shell(cmd.Cmd):
  ssl=False,
  single_statement=None,
  request_timeout=DEFAULT_REQUEST_TIMEOUT_SECONDS,
- protocol_version=DEFAULT_PROTOCOL_VERSION,
+ protocol_version=None,
  connect_timeout=DEFAULT_CONNECT_TIMEOUT_SECONDS):
 cmd.Cmd.__init__(self, completekey=completekey)
 self.hostname = hostname
@@ -468,13 +470,16 @@ class Shell(cmd.Cmd):
 if use_conn:
 self.conn = use_conn
 else:
+kwargs = {}
+if protocol_version is not None:
+kwargs['protocol_version'] = protocol_version
 self.conn = Cluster(contact_points=(self.hostname,), 
port=self.port, cql_version=cqlver,
-protocol_version=protocol_version,
 auth_provider=self.auth_provider,
 ssl_options=sslhandling.ssl_settings(hostname, 
CONFIG_FILE) if ssl else None,
 
load_balancing_policy=WhiteListRoundRobinPolicy([self.hostname]),
 control_connection_timeout=connect_timeout,
-connect_timeout=connect_timeout)
+connect_timeout=connect_timeout,
+**kwargs)
 self.owns_connection = not use_conn
 
 if keyspace:
@@ -1673,9 +1678,9 @@ class Shell(cmd.Cmd):
 
 direction = parsed.get_binding('dir').upper()
 if direction == 'FROM':
-task = ImportTask(self, ks, table, columns, fname, opts, 
DEFAULT_PROTOCOL_VERSION, CONFIG_FILE)
+task = ImportTask(self, ks, table, columns, fname, opts, 
self.conn.protocol_version, CONFIG_FILE)
 elif direction == 'TO':
-task = ExportTask(self, ks, table, columns, fname, opts, 
DEFAULT_PROTOCOL_VERSION, CONFIG_FILE)
+task = ExportTask(self, ks, 

[jira] [Commented] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.

2017-04-19 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974141#comment-15974141
 ] 

mck commented on CASSANDRA-13307:
-


|| Branch || Unit Tests || DTests ||
| 
[3.11.x|https://github.com/michaelsembwever/cassandra/commit/32835b0919c5d89b565f0adff15a845fe392c270]
 | [circleci|https://circleci.com/gh/michaelsembwever/cassandra/14] | 
[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/24]
 |
| 
[trunk|https://github.com/apache/cassandra/pull/96/commits/c36a4e5547af3967976144f7b553d70873503f77]
 | [asf 
jenkins|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/3]
 \\ [circleci|https://circleci.com/gh/michaelsembwever/cassandra/3] | 
[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/15/]
 |


> The specification of protocol version in cqlsh means the python driver 
> doesn't automatically downgrade protocol version.
> 
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Matt Byrd
>Assignee: Matt Byrd
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 3.11.x
>
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> In that we're no longer able to connect from newer cqlsh versions
> (e.g. trunk) to older versions of Cassandra that speak a lower version of the 
> protocol (e.g. 2.1 with protocol version 3).
> The problem seems to be that we're relying on the client's ability to 
> automatically downgrade the protocol version, implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, came when we implemented:
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> "Don't downgrade protocol version if explicitly set" 
> (included when we bumped the python driver from 3.5.0 to 3.7.0 as part of 
> fixing: https://issues.apache.org/jira/browse/CASSANDRA-11534),
> since we do explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which adds an option to explicitly specify the protocol 
> version (for those who want to do that) and otherwise defaults to not setting 
> the protocol version, i.e. using the protocol version of the client we ship, 
> which should by default match the server's. Then it should downgrade 
> gracefully as was intended. 
> Let me know if that seems reasonable.
> Thanks,
> Matt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)