Hi list!
I started to use Ignite (1.1.0-incubating) for a network message bus where I
have a server node and several client nodes using the TcpClientDiscoverySpi.
On first startup, it does not matter in which order I start my Ignite server or
client: each waits for the other, as expected, until the grid is running. But if
I kill the server, the client fails to reconnect to it.
I have a simple test for this.
Just start the class TestIgniteClient with the program parameter "s" and again
with "c", so that you have two instances running. You should then see the
message "I am here" flowing from the server to the client.
Once you kill the server process ("s") and restart it, you will get a lot of
exceptions on the client, and it will not reconnect.
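To make clear what behaviour I was hoping for, here is a minimal sketch of a retry loop with exponential backoff, written against plain Java only. It is not Ignite API: the class name ReconnectSketch, the method retry, and the lambda standing in for "restart the client node" are all hypothetical, and I have not verified that restarting the client this way is the recommended approach.

```java
import java.util.concurrent.Callable;

public class ReconnectSketch {

    /**
     * Calls the given connect action until it succeeds or maxAttempts is
     * reached. The delay between attempts doubles each time, capped at 30 s.
     * The connect action is a hypothetical stand-in for whatever restarts
     * the client node; nothing here is Ignite-specific.
     */
    static <T> T retry(Callable<T> connect, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay = Math.min(delay * 2, 30_000L);
                }
            }
        }
        // All attempts failed: rethrow the most recent failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated connect action: fails twice, then succeeds.
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("server not reachable yet");
            }
            return "connected";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "connected after 3 attempts"
    }
}
```

In my test, the lambda would be replaced by code that stops the stale client instance and starts a new one, assuming that is even a valid way to recover.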
Am I doing something wrong, or should I file a JIRA issue about this?
Thanks for your help.
===exception==
SCHWERWIEGEND: Failed to refresh partition map
[oldest=--0001--0001, rmts=[],
loc=445a949f-dd26-4998-8c1c-4faa05ceed81]
class org.apache.ignite.IgniteCheckedException: Failed to send message (node
may have left the grid or TCP connection cannot be established due to firewall
issues) [node=TcpDiscoveryNode [id=--0001--0001,
addrs=[10.0.0.102, 0:0:0:0:0:0:0:1, 127.0.0.1], sockAddrs=[/10.0.0.102:8025,
/0:0:0:0:0:0:0:1:8025, /127.0.0.1:8025], discPort=8025, order=1, intOrder=1,
loc=false, ver=1.1.0#20150520-sha1:6da491f4, isClient=false],
topic=TOPIC_CACHE, msg=GridDhtPartitionsSingleMessage
[parts={-2100569601=GridDhtPartitionMap
[nodeId=445a949f-dd26-4998-8c1c-4faa05ceed81, updateSeq=4, size=0],
689859866=GridDhtPartitionMap [nodeId=445a949f-dd26-4998-8c1c-4faa05ceed81,
updateSeq=4, size=0], 1325947219=GridDhtPartitionMap
[nodeId=445a949f-dd26-4998-8c1c-4faa05ceed81, updateSeq=4, size=0]},
super=GridDhtPartitionsAbstractMessage [exchId=null, lastVer=GridCacheVersion
[topVer=0, nodeOrderDrId=0, globalTime=0, order=1434179630276],
super=GridCacheMessage [msgId=4, depInfo=null, err=null, skipPrepare=false]]],
policy=SYSTEM_POOL]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:952)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1016)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:389)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.sendLocalPartitions(GridCachePartitionExchangeManager.java:664)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.refreshPartitions(GridCachePartitionExchangeManager.java:579)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.refreshPartitions(GridCachePartitionExchangeManager.java:603)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1700(GridCachePartitionExchangeManager.java:57)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:967)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:108)
at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to send
message to remote node: TcpDiscoveryNode
[id=--0001--0001, addrs=[10.0.0.102, 0:0:0:0:0:0:0:1,
127.0.0.1], sockAddrs=[/10.0.0.102:8025, /0:0:0:0:0:0:0:1:8025,
/127.0.0.1:8025], discPort=8025, order=1, intOrder=1, loc=false,
ver=1.1.0#20150520-sha1:6da491f4, isClient=false]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1574)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:138)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:949)
... 9 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to connect to
node (is node still alive?). Make sure that each GridComputeTask and
GridCacheTransaction has a timeout set in order to prevent parties from waiting
forever in case of network issues [nodeId=--0001--0001,
addrs=[/0:0:0:0:0:0:0:1:47100, /127.0.0.1:47100, /10.0.0.102:47100]]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1842)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:1671)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:1612)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$4000(TcpCommunicationSpi.java:140)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$RecoveryWorker.body(TcpCommunicationSpi.java:2452)
at