Hi,

We have an Ignite cluster with two server nodes. At certain times during the week we get the error messages below in sequence, and we believe they are causing the JVM memory usage to grow. We run JDK 11 with both -Xms and -Xmx set to 2 GB, on Ignite 2.8.0. We know 2 GB is very small, but we do not believe that increasing the heap size will solve the issue. The exact stack trace is:

Mar 02, 2021 1:45:20 AM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192], super=AbstractNioClientWorker [idx=3, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-client-listener-3, igniteInstanceName=null, finished=false, heartbeatTs=1614667518323, hashCode=92764489, interrupted=false, runner=grid-nio-worker-client-listener-3-#133]]], writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null, closeSocket=true, outboundMessagesQueueSizeMetric=null, super=GridNioSessionImpl [locAddr=/x.x.x.x:x, rmtAddr=/x.x.x.x:x, createTime=1614667512243, closeTime=0, bytesSent=0, bytesRcvd=517, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1614667512243, lastSndTime=1614667512243, lastRcvTime=1614667512273, readsPaused=false, filterChain=FilterChain[filters=[GridNioAsyncNotifyFilter, GridNioCodecFilter [parser=ClientListenerBufferedParser, directMode=false]], accepted=true, markedForClose=false]]]
java.io.IOException: Connection reset by peer
    at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:245)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
    at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:358)
    at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1162)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2449)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2216)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1857)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.base/java.lang.Thread.run(Thread.java:834)

The server then crashes with a Java OOM. Looking at the .hprof file and analyzing the biggest objects at the time of the OOM, we saw this:

<http://apache-ignite-users.70518.x6.nabble.com/file/t3087/highheapmem.png>

It looks like ClientListenerNioServerBuffer alone is consuming 1 GB of memory at the time of the crash. Shouldn't this buffer be cleared when there is any issue with NC's? Other threads suggest increasing the socket timeout or reducing the failure detection timeout. I will try those, but I am skeptical that they will fix this.

Any help is appreciated!

Thanks!
