Ignite visor timeout when calling node command on thick client in Kubernetes cluster.

John Smith Thu, 20 Jul 2023 18:17:27 -0700

The client is exposed through node ports, and I have been able to provide the
proper ports back to the client and the cluster.


When I look at the node details I see...

| Address (0)                 | 10.xxx.xxx.xxx                        |  <---- Kubernetes internal IP
| Address (1)                 | 127.0.0.1                             |

So the node details only list two addresses, but somehow the timeout is aware
of a third address (see below):

addrs=[/10.xxx.xxx.xxx:47100, /127.0.0.1:47100, /172.xxx.xxx.xxx:30524]]
<----- The 172 address is where the thick client is exposed as a node port.
So it somehow knows it?
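
To make the mismatch explicit, here is a minimal sketch (not part of Ignite or
Visor) that pulls the addrs list out of the error text and reports any host
that is missing from the node details. The helper names parse_addrs and
extra_addrs are hypothetical, and the input is the masked line from this
message, not a real address.

```python
import re

def parse_addrs(log_text):
    """Return (host, port) pairs from the first 'addrs=[...]' list in log_text."""
    match = re.search(r"addrs=\[([^\]]*)\]", log_text)
    if match is None:
        return []
    pairs = []
    for item in match.group(1).split(","):
        item = item.strip().lstrip("/")
        if not item:
            continue
        # Split on the last colon so the host part is kept whole.
        host, _, port = item.rpartition(":")
        pairs.append((host, int(port)))
    return pairs

def extra_addrs(log_text, known_hosts):
    """Return the (host, port) pairs whose host is not in known_hosts."""
    return [(h, p) for h, p in parse_addrs(log_text) if h not in known_hosts]

# Masked values copied from this message.
error_line = "addrs=[/10.xxx.xxx.xxx:47100, /127.0.0.1:47100, /172.xxx.xxx.xxx:30524]]"
known = {"10.xxx.xxx.xxx", "127.0.0.1"}  # Address (0) and Address (1) from the node details
extras = extra_addrs(error_line, known)
print(extras)
```

With the values above, the only entry reported is the 172 address on port
30524, which is the node port.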

Even on the client, I can see that Ignite Visor has connected:

Completed partition exchange
[localNode=c9b86d24-0f0d-4198-98c5-59ce677669f8,
exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=434, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=c16d4ff0-e37e-4a18-a2ae-ec770ad25c39,
consistentId=0:0:0:0:0:0:0:1%lo,127.0.0.1,172.xxx.xxx.xxx:47500,
addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.xxx.xxx.xxx],
sockAddrs=HashSet [0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500,
xxxxxx-visor-0001/172.xxx.xxx.xxx:47500], discPort=47500, order=434,
intOrder=227, lastExchangeTime=1689901276893, loc=false,
ver=2.12.0#20220108-sha1:b1289f75, isClient=false], rebalanced=true,
done=true, newCrdFut=null], topVer=AffinityTopologyVersion [topVer=434,
minorTopVer=0]]
AffinityTopologyVersion [topVer=434, minorTopVer=0], evt=NODE_JOINED,
evtNode=c16d4ff0-e37e-4a18-a2ae-ec770ad25c39, client=true]

On Ignite Visor, we see the error below:

[00:29:07,053][SEVERE][main][TcpCommunicationSpi] Failed to send message to
remote node [node=TcpDiscoveryNode
[id=c9b86d24-0f0d-4198-98c5-59ce677669f8,
consistentId=c9b86d24-0f0d-4198-98c5-59ce677669f8, addrs=ArrayList
[10.xxx.xxx.xxx, 127.0.0.1], sockAddrs=HashSet [/10.xxx.xxx.xxx:0, /
127.0.0.1:0], discPort=0, order=429, intOrder=224,
lastExchangeTime=1689899260716, loc=false,
ver=2.12.0#20220108-sha1:b1289f75, isClient=true], msg=GridIoMessage
[plc=3, topic=TOPIC_JOB, topicOrd=0, ordered=false, timeout=0,
skipOnTimeout=false, msg=GridJobExecuteRequest
[sesId=fd758d57981-a43b0db8-3b02-4506-ac69-412e46736682,
jobId=0e758d57981-a43b0db8-3b02-4506-ac69-412e46736682,
startTaskTime=1689899286905, timeout=9223372036854775807,
taskName=org.apache.ignite.internal.visor.node.VisorNodeDataCollectorTask,
userVer=0,
taskClsName=org.apache.ignite.internal.visor.node.VisorNodeDataCollectorTask,
ldrParticipants=null, cpSpi=null, createTime=1689899286989,
clsLdrId=90558d57981-a43b0db8-3b02-4506-ac69-412e46736682,
depMode=ISOLATED, dynamicSiblings=false, forceLocDep=true,
sesFullSup=false, internal=true, topPred=null, part=-1, topVer=null,
execName=null]]]
class org.apache.ignite.IgniteCheckedException: Failed to connect to node
(is node still alive?). Make sure that each ComputeTask and cache
Transaction has a timeout set in order to prevent parties from waiting
forever in case of network issues
[nodeId=c9b86d24-0f0d-4198-98c5-59ce677669f8, addrs=[/10.xxx.xxx.xxx:47100,
/127.0.0.1:47100, /172.xxx.xxx.xxx:30524]]
