[ https://issues.apache.org/jira/browse/CASSANDRA-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354774#comment-15354774 ]
peng xiao commented on CASSANDRA-12103:
---------------------------------------

Sam, one more thing: we are not able to start the ops agent; it fails with the error below, even though we still have enough memory. Could you please take a look? Thanks

# free -m
             total       used       free     shared    buffers     cached
Mem:        128950     127116       1834          2        349     111513
-/+ buffers/cache:      15253     113697
Swap:         4095          0       4095

top - 16:14:36 up 236 days, 20:27,  2 users,  load average: 0.06, 0.06, 0.06
Tasks: 612 total,   2 running, 610 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132045660k total, 130216856k used,  1828804k free,   357384k buffers
Swap:  4194300k total,        0k used,  4194300k free, 114239300k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
44474 root      20   0 28.0g  11g 723m S  7.6  9.2 669:12.36 java
38489 root      20   0  593m  41m 2900 S  0.0  0.0 370:51.19 python
32612 root      20   0  159m  27m 2920 S  0.0  0.0   0:00.16 vim

Caused by: java.lang.OutOfMemoryError: Java heap space
ERROR [os-metrics-7] 2016-06-29 12:06:57,082 DataStax agent ran out of memory! Shutting down!
ERROR [os-metrics-8] 2016-06-29 12:06:57,082 DataStax agent ran out of memory! Shutting down!
ERROR [node-details-3] 2016-06-29 12:06:57,082 DataStax agent ran out of memory! Shutting down!
ERROR [jmx-metrics-1] 2016-06-29 12:06:57,083 Error updating cf list: #<OutOfMemoryError java.lang.OutOfMemoryError: Java heap space>
ERROR [install-location-finder] 2016-06-29 12:06:57,083 Uncaught exception on install-location-finder
java.lang.OutOfMemoryError: Java heap space
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2688)
	at java.lang.Class.getDeclaredMethod(Class.java:2115)
	at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1431)
	at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:494)
	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
	at sun.rmi.server.UnicastRef.unmarshalValue(UnicastRef.java:326)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:175)
	at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttributes(Unknown Source)
	at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttributes(RMIConnector.java:931)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
	at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
	at clojure.java.jmx$raw_read.invoke(jmx.clj:216)
	at clojure.core$comp$fn__4154.invoke(core.clj:2332)
	at opsagent.jmx$get_attributes_with_default.invoke(jmx.clj:44)
ERROR [node-details-2] 2016-06-29 12:06:57,083 Error: #<OutOfMemoryError java.lang.OutOfMemoryError: Java heap space>

> Cassandra is hang and cqlsh was not able to login with OperationTimeout error
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, Local Write-Read Paths
>         Environment: centos 6.5 cassandra 2.1.9
>            Reporter: peng xiao
>            Priority: Critical
>         Attachments: system.log.2016-06-28_1257.gz
>
>
> Hi,
> We have two DCs (DC1 and DC2), with 3 nodes in DC1 and 9 nodes in DC2.
> We experienced a timeout error today: all applications connected to DC1
> hung with no response, and even cqlsh was not able to log into any node in DC1.
> I restarted the 3 nodes in DC1, but the problem was not resolved.
> We then switched to DC2, and the applications returned to normal.
> Could you please take a look?
> Thanks
> There were many errors like the one below:
> ERROR [SharedPool-Worker-43] 2016-06-28 11:58:49,705 Message.java:538 - Unexpected exception during request; channel = [id: 0x87e315d6, /172.16.10.198:13604 => /172.16.11.13:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
> 	at org.apache.cassandra.auth.Auth.selectUser(Auth.java:276) ~[apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:86) ~[apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.service.ClientState.login(ClientState.java:206) ~[apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:82) ~[apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.9.jar:2.1.9]
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0]
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.9.jar:2.1.9]
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.9.jar:2.1.9]
> 	at java.lang.Thread.run(Thread.java:744) [na:1.8.0]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)