[jira] [Commented] (CASSANDRA-11744) Trying to restart a 2.2.5 node, nodetool disablethrift fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342637#comment-15342637 ]

Peter Norton commented on CASSANDRA-11744:
------------------------------------------

This happened again on 2.2.6 today.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11744) Trying to restart a 2.2.5 node, nodetool disablethrift fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Norton updated CASSANDRA-11744:
-------------------------------------

    Since Version: 2.0.11  (was: 2.2.5)
[jira] [Commented] (CASSANDRA-11744) Trying to restart a 2.2.5 node, nodetool disablethrift fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296418#comment-15296418 ]

Peter Norton commented on CASSANDRA-11744:
------------------------------------------

A similar error just happened on 2.0.11 as well.
[jira] [Created] (CASSANDRA-11744) Trying to restart a 2.2.5 node, nodetool disablethrift fails
Peter Norton created CASSANDRA-11744:
-------------------------------------

             Summary: Trying to restart a 2.2.5 node, nodetool disablethrift fails
                 Key: CASSANDRA-11744
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11744
             Project: Cassandra
          Issue Type: Bug
            Reporter: Peter Norton
         Attachments: failure.jstack.out

We have a 2.2.5 cluster running in AWS VPC with EBS volumes. Earlier today 3 nodes seem to have gone into a bad state: clients were seeing high latencies when writing to these nodes, and the commitlog write rate on each of these nodes seemed high, more than the relatively low number of IOPS that AWS allocated to these volumes. While trying to understand the situation, we attempted to restart the 3 nodes by running `nodetool disablebinary; nodetool disablethrift; nodetool flush` and then stopping the process.

When trying to disablethrift, the following stack trace appeared in the system.log:

```
INFO  [RMI TCP Connection(8)-172.26.32.248] 2016-05-10 15:26:58,599 Server.java:218 - Stop listening for CQL clients
INFO  [RMI TCP Connection(10)-172.26.32.248] 2016-05-10 15:27:01,975 ThriftServer.java:142 - Stop listening to thrift clients
ERROR [RPC-Thread:34] 2016-05-10 15:27:03,794 Message.java:324 - Unexpected throwable while invoking!
java.lang.NullPointerException: null
	at com.thinkaurelius.thrift.util.mem.Buffer.size(Buffer.java:83) ~[thrift-server-0.3.7.jar:na]
	at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.expand(FastMemoryOutputTransport.java:84) ~[thrift-server-0.3.7.jar:na]
	at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.write(FastMemoryOutputTransport.java:167) ~[thrift-server-0.3.7.jar:na]
	at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156) ~[libthrift-0.9.2.jar:0.9.2]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:55) ~[libthrift-0.9.2.jar:0.9.2]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.2.jar:0.9.2]
	at com.thinkaurelius.thrift.Message.invoke(Message.java:314) ~[thrift-server-0.3.7.jar:na]
	at com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) [thrift-server-0.3.7.jar:na]
	at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) [thrift-server-0.3.7.jar:na]
	at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) [thrift-server-0.3.7.jar:na]
	at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) [disruptor-3.0.1.jar:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
```

The attached jstack was taken from a node after the above was noticed.
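For reference, the pre-restart sequence described in the report (disable the native protocol, disable Thrift, flush, then stop the process) can be sketched as a small script. This is a minimal sketch, not taken from the ticket: the DRY_RUN guard, the `run` helper, and the `service cassandra stop` invocation are assumptions (the init command varies by platform), and it assumes `nodetool` is on PATH talking to the local node's JMX port.

```shell
# Sketch of the shutdown sequence from the report. DRY_RUN=1 (the default
# here) only prints each command; set DRY_RUN=0 to execute on a real node.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run nodetool disablebinary        # stop accepting CQL (native protocol) clients
run nodetool disablethrift        # stop accepting Thrift RPC clients
run nodetool flush                # flush memtables to SSTables before stopping
run sudo service cassandra stop   # assumed init command; varies by platform
```

On 2.2, `nodetool drain` is a related single command that stops accepting client connections and flushes memtables, and is sometimes used in place of the separate disable/flush steps.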