[ 
https://issues.apache.org/jira/browse/IGNITE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102136#comment-17102136
 ] 

Andrey N. Gura commented on IGNITE-12982:
-----------------------------------------

[~mmuzaf] We never assign {{null}} to the {{metricsLsnr}} field, so it is just 
impossible that node worked correctly and problem arrives on shutdown only. 

I see a race between {{onMessageSent}} method invocation from NIO thread (NIO 
server is already started at this moment in {{spiStart}} method) and 
{{metricsLsnr}} field initialization (see 
{{TcpCommunicationSpi.onContextInitialization0}} method). But it should be 
checked.

Please, share your findings if you'll investigate this. It could be tricky to 
solve all possible problems here.

> NullPointerException on TcpCommunicationMetricsListener for some of the cases
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12982
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12982
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Priority: Major
>             Fix For: 2.9
>
>
> The code block below throws an {{NullPointerException}} for some of the 
> cases. Investigation required.
> {code}
>             @Override public void onMessageSent(GridNioSession ses, Message 
> msg) {
>                 Object consistentId = ses.meta(CONSISTENT_ID_META);
>                 if (consistentId != null)
>                     metricsLsnr.onMessageSent(msg, consistentId);
>             }
> {code}
> {code}
> [2020-05-04 
> 18:12:12,991][ERROR][grid-nio-worker-tcp-comm-0-#543%snapshot.IgniteClusterSnapshotSelfTest2%][TestRecordingCommunicationSpi]
>  Failed to process selector key [ses=GridSelectorNioSessionImpl 
> [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, 
> bytesRcvd=42, bytesSent=18, bytesRcvd0=42, bytesSent0=18, select=true, 
> super=GridWorker [name=grid-nio-worker-tcp-comm-0, 
> igniteInstanceName=snapshot.IgniteClusterSnapshotSelfTest2, finished=false, 
> heartbeatTs=1588605131981, hashCode=1038334332, interrupted=false, 
> runner=grid-nio-worker-tcp-comm-0-#543%snapshot.IgniteClusterSnapshotSelfTest2%]]],
>  writeBuf=java.nio.DirectByteBuffer[pos=10 lim=32768 cap=32768], 
> readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
> inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, 
> sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=7f78d082-6ce9-42b1-ab08-da1fde400000, 
> consistentId=snapshot.IgniteClusterSnapshotSelfTest0, addrs=ArrayList 
> [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1588605131971, loc=false, 
> ver=2.9.0#20200428-sha1:e551fa71, isClient=false], connected=true, 
> connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], 
> outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, 
> sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=7f78d082-6ce9-42b1-ab08-da1fde400000, 
> consistentId=snapshot.IgniteClusterSnapshotSelfTest0, addrs=ArrayList 
> [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1588605131971, loc=false, 
> ver=2.9.0#20200428-sha1:e551fa71, isClient=false], connected=true, 
> connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], 
> closeSocket=true, 
> outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1,
>  super=GridNioSessionImpl [locAddr=/127.0.0.1:47102, 
> rmtAddr=/127.0.0.1:50655, createTime=1588605131981, closeTime=0, 
> bytesSent=18, bytesRcvd=42, bytesSent0=18, bytesRcvd0=42, 
> sndSchedTime=1588605131981, lastSndTime=1588605131981, 
> lastRcvTime=1588605131981, readsPaused=false, 
> filterChain=FilterChain[filters=[GridNioCodecFilter 
> [parser=o.a.i.i.util.nio.GridDirectParser@fc19b0b, directMode=true], 
> GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]]
> java.lang.NullPointerException
>       at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:803)
>       at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:472)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer.onMessageWritten(GridNioServer.java:1764)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer.access$1800(GridNioServer.java:99)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1665)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1365)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2437)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2201)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1842)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at java.lang.Thread.run(Thread.java:748)
> [2020-05-04 
> 18:12:12,993][ERROR][grid-nio-worker-tcp-comm-0-#543%snapshot.IgniteClusterSnapshotSelfTest2%][TestRecordingCommunicationSpi]
>  Closing NIO session because of unhandled exception.
> class org.apache.ignite.internal.util.nio.GridNioException: null
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2460)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2201)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1842)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>       at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:803)
>       at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:472)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer.onMessageWritten(GridNioServer.java:1764)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer.access$1800(GridNioServer.java:99)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1665)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1365)
>       at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2437)
>       ... 4 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to