Hi,

I'm re-posting this message because the logs I pasted did not come through the first time.

I have been getting repeated "Too many open files" exceptions for some time.
================================
[11:26:24,493][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Failed to process selector key [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker
[readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
finished=false, hashCode=1611196193, interrupted=false,
runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
inRecovery=null, outRecovery=null, super=GridNioSessionImpl
[locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
createTime=1529666783471, closeTime=0, bytesSent=5, bytesRcvd=1074,
bytesSent0=0, bytesRcvd0=0, sndSchedTime=1529666783481,
lastSndTime=1529666783481, lastRcvTime=1529666783481,
readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=GridTcpRestParser [marsh=JdkMarshaller
[clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
directMode=false]], accepted=true]]]
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1085)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2339)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2110)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
[11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Closing NIO session because of unhandled exception [cls=class
o.a.i.i.util.nio.GridNioException, msg=Connection reset by peer]
[11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Closed client session due to exception [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker
[readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
finished=false, hashCode=1611196193, interrupted=false,
runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
inRecovery=null, outRecovery=null, super=GridNioSessionImpl
[locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
createTime=1529666783471, closeTime=1529666784488, bytesSent=5,
bytesRcvd=1074, bytesSent0=0, bytesRcvd0=0,
sndSchedTime=1529666783481, lastSndTime=1529666783481,
lastRcvTime=1529666783481, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=GridTcpRestParser [marsh=JdkMarshaller
[clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
directMode=false]], accepted=true]], msg=Connection reset by peer]
[11:26:24,513][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Caught unhandled exception in NIO worker thread (restart the node).
java.lang.NullPointerException
        at sun.nio.ch.EPollArrayWrapper.isEventsHighKilled(EPollArrayWrapper.java:174)
        at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:190)
        at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:239)
        at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:178)
        at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:132)
        at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:212)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.register(GridNioServer.java:2545)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1934)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
[11:26:30,277][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
to accept remote connection (will wait for 2000ms).
class org.apache.ignite.IgniteCheckedException: Failed to accept
connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
finished=false, hashCode=1020662787, interrupted=false,
runner=nio-acceptor-#55]
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
        ... 3 more
[11:26:32,284][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
to accept remote connection (will wait for 2000ms).
class org.apache.ignite.IgniteCheckedException: Failed to accept
connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
finished=false, hashCode=1020662787, interrupted=false,
runner=nio-acceptor-#55]
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
        ... 3 more
================================

My max open files limit is 32768, and the Ignite process does have 32768 open files.
================================
$ sudo ls -hl /proc/4055/fd/ | wc -l
32768
================================

Most of them look like this:
================================
...
lrwx------ 1 root root 64 Jun 23 12:22 9990 -> socket:[1167798]
lrwx------ 1 root root 64 Jun 23 12:22 9991 -> socket:[1167799]
lrwx------ 1 root root 64 Jun 23 12:22 9992 -> socket:[1166839]
lrwx------ 1 root root 64 Jun 23 12:22 9993 -> socket:[1167800]
lrwx------ 1 root root 64 Jun 23 12:22 9994 -> socket:[1168762]
lrwx------ 1 root root 64 Jun 23 12:22 9995 -> socket:[1168763]
lrwx------ 1 root root 64 Jun 23 12:22 9996 -> socket:[1164109]
lrwx------ 1 root root 64 Jun 23 12:22 9997 -> socket:[1166840]
lrwx------ 1 root root 64 Jun 23 12:22 9998 -> socket:[1164110]
lrwx------ 1 root root 64 Jun 23 12:22 9999 -> socket:[1169810]
================================
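
To put a number on "most of them", the descriptors can be tallied by what their symlinks point at. The following is only a rough sketch, assuming the Linux /proc layout shown above; the function name classify_fds is made up for illustration and is not part of Ignite. It also counts directory entries directly, which avoids the extra "total" line that ls -l adds to a wc -l count.

```python
import os
import re
from collections import Counter


def classify_fds(fd_dir):
    """Tally the symlink targets in a /proc/<pid>/fd directory by kind.

    Hypothetical helper for illustration; assumes Linux /proc semantics.
    """
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        # "socket:[1167798]" -> "socket", "pipe:[42]" -> "pipe",
        # "anon_inode:[eventpoll]" -> "anon_inode".
        # Regular files show up as absolute paths and are counted as "file".
        match = re.match(r"([a-z_]+):", target)
        counts[match.group(1) if match else "file"] += 1
    return counts
```

Run as root against the process above, this would be called as classify_fds("/proc/4055/fd"), and the resulting counter printed with most_common().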

I haven't found any documentation on how Ignite uses Unix sockets.
It seems Ignite doesn't close them properly. Any help?
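
One way to check what kind of sockets these are: a socket:[N] entry in /proc only gives the socket's inode, and the same inode appears in /proc/net/tcp (or /proc/net/tcp6) if it is a TCP connection. The sketch below joins the two. It is a minimal illustration, not an Ignite tool; all function names are hypothetical, and it assumes a little-endian Linux host and the usual column order of /proc/net/tcp. Sockets that are not found in either table (for example Unix domain or UDP sockets) are counted separately.

```python
import ipaddress
import os
import re
from collections import Counter

# TCP state codes as printed in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def decode_addr(field):
    """Decode an 'ADDRHEX:PORTHEX' field into (ip_string, port)."""
    addr_hex, port_hex = field.split(":")
    raw = bytes.fromhex(addr_hex)
    # Each 32-bit word is printed in host byte order; reverse every word
    # (assumes a little-endian host such as x86).
    packed = b"".join(raw[i:i + 4][::-1] for i in range(0, len(raw), 4))
    ip = ipaddress.ip_address(packed)
    # Show IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) as plain IPv4.
    mapped = getattr(ip, "ipv4_mapped", None)
    return str(mapped or ip), int(port_hex, 16)


def parse_tcp_table(text):
    """Map socket inode -> (local, remote, state) from /proc/net/tcp text."""
    table = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue
        state = TCP_STATES.get(fields[3], fields[3])
        table[int(fields[9])] = (decode_addr(fields[1]),
                                 decode_addr(fields[2]), state)
    return table


def load_tcp_tables(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Read and merge the IPv4 and IPv6 TCP tables, skipping missing files."""
    table = {}
    for path in paths:
        try:
            with open(path) as f:
                table.update(parse_tcp_table(f.read()))
        except OSError:
            pass
    return table


def summarize_socket_states(fd_dir, table):
    """Count the socket descriptors in fd_dir by TCP state."""
    states = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor closed while we were scanning
        match = re.fullmatch(r"socket:\[(\d+)\]", target)
        if not match:
            continue
        entry = table.get(int(match.group(1)))
        states[entry[2] if entry else "not in TCP table"] += 1
    return states
```

For the process above this would be summarize_socket_states("/proc/4055/fd", load_tcp_tables()), run as root. A CLOSE_WAIT entry means the remote side has closed the connection but the local process has not yet closed its end, so the split between states may help narrow down where the descriptors are being held.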

Thanks.
