Semyon Danilov created IGNITE-14448:
---------------------------------------
             Summary: Failure to connect to node leads to hanging connection future if paired connections are used
                 Key: IGNITE-14448
                 URL: https://issues.apache.org/jira/browse/IGNITE-14448
             Project: Ignite
          Issue Type: Bug
          Components: networking
    Affects Versions: 2.10
            Reporter: Semyon Danilov
            Assignee: Semyon Danilov

{{if ((CommunicationSpi<?>)spi instanceof TcpCommunicationSpi)
    getTcpCommunicationSpi().setConnectionRequestor(invConnHandler);

if (connRequestor != null) {
    ...
    if (isPairedConnection(node, tcpCommSpi))
        throw new IgniteSpiException("Inverse connection protocol doesn't support paired connections");}}

It turns out this exception is not handled properly, so the connection future is never completed. Striped pool threads then wait forever in reserveClient() and the cluster grinds to a halt.

This happens in versions that have communication-via-discovery and when usePairedConnections=true.

{{[12:06:18,110][SEVERE][sys-stripe-0-#1][TcpCommunicationSpi] Failed to send message to remote node [node=TcpDiscoveryNode [id=54ddcf8b-3e41-4efe-bb9d-8a0369e7b893, consistentId=54ddcf8b-3e41-4efe-bb9d-8a0369e7b893, addrs=ArrayList [127.0.0.1, 172.22.229.21], sockAddrs=HashSet [/127.0.0.1:0, ip-172-22-229-21.ec2.internal/172.22.229.21:0], discPort=0, order=47, intOrder=47, lastExchangeTime=1603983940522, loc=false, ver=8.7.25#20200910-sha1:b580d9fd, isClient=true], msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicSingleUpdateRequest [key=KeyCacheObjectImpl [part=24, val=23576, hasValBytes=true], val=com.dream11.ignite.model.GetRoundSummaryRes [idHash=69226443, hash=580815760, roundId=23576, dataSource=MYSQL, sparkJobStatus=COMPLETED], prevVal=null, super=GridDhtAtomicAbstractUpdateRequest [onRes=false, nearNodeId=null, nearFutId=0, flags=near]], connIdx=-1]]
class org.apache.ignite.spi.IgniteSpiException: Inverse connection protocol doesn't support paired connections
    at org.apache.ignite.internal.managers.communication.GridIoManager$TcpCommunicationInverseConnectionHandler.request(GridIoManager.java:3564)
    at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.handleUnreachableNodeException(ConnectionClientPool.java:365)
    at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:256)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1132)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1083)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1814)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1930)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture}}
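
To illustrate the failure pattern described in this issue, here is a minimal sketch that uses only the JDK, not Ignite code. The class HangingFutureSketch, the pending map, and the methods reserveBuggy, reserveFixed and requestInverseConnection are hypothetical stand-ins, and IllegalStateException stands in for IgniteSpiException. The sketch assumes the connection future is registered before the inverse connection request is made, so an exception thrown by the requestor leaves an incomplete future behind for other threads to wait on. The "fixed" variant shows one possible handling (failing the future and unregistering it), which is not necessarily the change made for this ticket.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HangingFutureSketch {
    /** Hypothetical registry of in-flight connection futures, keyed by node id. */
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Hypothetical requestor: rejects paired connections, mirroring the message in the report. */
    static void requestInverseConnection(boolean pairedConnections) {
        if (pairedConnections)
            throw new IllegalStateException("Inverse connection protocol doesn't support paired connections");
    }

    /** Buggy shape: if the requestor throws, the registered future is never completed. */
    CompletableFuture<String> reserveBuggy(String nodeId, boolean pairedConnections) {
        CompletableFuture<String> fut = new CompletableFuture<>();
        pending.put(nodeId, fut);
        requestInverseConnection(pairedConnections); // may throw; fut stays incomplete in the map
        fut.complete("client-for-" + nodeId);
        return fut;
    }

    /** One possible handling: fail the future and unregister it so that waiters are released. */
    CompletableFuture<String> reserveFixed(String nodeId, boolean pairedConnections) {
        CompletableFuture<String> fut = new CompletableFuture<>();
        pending.put(nodeId, fut);
        try {
            requestInverseConnection(pairedConnections);
            fut.complete("client-for-" + nodeId);
        }
        catch (RuntimeException e) {
            fut.completeExceptionally(e);
            pending.remove(nodeId, fut);
        }
        return fut;
    }

    /** Returns the future other threads would wait on for this node, or null if none is registered. */
    CompletableFuture<String> pendingFor(String nodeId) {
        return pending.get(nodeId);
    }

    public static void main(String[] args) throws Exception {
        HangingFutureSketch pool = new HangingFutureSketch();

        try {
            pool.reserveBuggy("node-1", true);
        }
        catch (IllegalStateException e) {
            System.out.println("first caller failed: " + e.getMessage());
        }

        // A later caller finds the stale future; without a timeout it would block forever.
        try {
            pool.pendingFor("node-1").get(200, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e) {
            System.out.println("buggy: waiter timed out, future was never completed");
        }

        CompletableFuture<String> fixed = pool.reserveFixed("node-2", true);
        System.out.println("fixed: done=" + fixed.isDone() + ", failed=" + fixed.isCompletedExceptionally());
    }
}
```

In the buggy variant only the first caller sees the exception, which matches the single SEVERE log entry above. Every later caller waits on the future that was left behind.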