Hello,

I'm running an Apache Ignite 2.9.0 cluster with 2 nodes. I tried to use
GridGain's ControlCenterAgent to investigate a slowdown.

After I removed the agent files from the server nodes (I don't want to
have to put them on all clients), the second node can no longer join the
cluster when I start it.

If I start node A, then node B, node B fails, but if I start node B,
then node A, node A fails.

If I put the agent files back, then all nodes can start, but clients
fail because they don't have the agent classes themselves.

When a node fails to start, it prints this log:


[17:52:45,265][INFO][tcp-disco-sock-reader-[2f3f6f3a 192.168.43.29:39675]-#6%ClusterWA%-#50%ClusterWA%][TcpDiscoverySpi] Initialized connection with remote server node [nodeId=2f3f6f3a-accb-4708-a5cc-26d324a07816, rmtAddr=/192.168.43.29:39675]
[17:52:45,268][SEVERE][main][IgniteKernal%ClusterWA] Failed to start manager: GridManagerAdapter [enabled=true, name=o.a.i.i.managers.discovery.GridDiscoveryManager]
class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, marsh=JdkMarshaller [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@39a8e2fa], reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false]
        at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:967)
        at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1935)
        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1298)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2046)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1698)
        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1114)
        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1032)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:918)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:817)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:687)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
        at org.apache.ignite.Ignition.start(Ignition.java:353)
        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:300)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal key=metastorage.cluster.id.tag
        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2018)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1189)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:462)
        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2120)
        at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
        ... 13 more
[17:52:45,271][SEVERE][main][IgniteKernal%ClusterWA] Got exception while starting (will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Failed to start manager: GridManagerAdapter [enabled=true, name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
        at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1940)
        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1298)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2046)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1698)
        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1114)
        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1032)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:918)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:817)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:687)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
        at org.apache.ignite.Ignition.start(Ignition.java:353)
        at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:300)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, marsh=JdkMarshaller [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@39a8e2fa], reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false]
        at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:967)
        at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1935)
        ... 11 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal key=metastorage.cluster.id.tag
        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2018)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1189)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:462)
        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2120)
        at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
        ... 13 more
[17:52:45,271][INFO][tcp-disco-sock-reader-[2f3f6f3a 192.168.43.29:39675]-#6%ClusterWA%-#50%ClusterWA%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.43.29:39675, rmtPort=39675

And the running node logs this:

[17:52:45,223][INFO][tcp-disco-sock-reader-[9a3233c6 192.168.43.30:54951]-#4%ClusterWA%-#55%ClusterWA%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.43.30:54951, rmtPort=54951
[17:52:45,246][INFO][tcp-disco-msg-worker-[crd]-#2%ClusterWA%-#46%ClusterWA%][GridEncryptionManager] Joining node doesn't have stored group keys [node=9a3233c6-3a6c-4be0-b5e7-19cdff30f69e]
[17:52:45,266][WARNING][disco-pool-#56%ClusterWA%][TcpDiscoverySpi] Unable to unmarshal key=metastorage.cluster.id.tag

If I start the nodes in the reverse order, the running node logs this:

[17:56:52,426][INFO][tcp-disco-sock-reader-[4b8b92f5 192.168.43.29:42557]-#4%ClusterWA%-#53%ClusterWA%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.43.29:42557, rmtPort=42557
[17:56:52,446][INFO][tcp-disco-msg-worker-[crd]-#2%ClusterWA%-#46%ClusterWA%][GridEncryptionManager] Joining node doesn't have stored group keys [node=4b8b92f5-1753-4b1b-9902-476c925fa49d]
[17:56:52,466][WARNING][disco-pool-#54%ClusterWA%][TcpDiscoverySpi] Unable to unmarshal key=metastorage.cluster.id.tag

Is there a way to recover?

Thanks,

-- 
Bastien Durel
DATA
Enterprise data integration,
Business intelligence systems.

bastien.du...@data.fr
tel : +33 (0) 1 57 19 59 28
fax : +33 (0) 1 57 19 59 73
45 avenue Carnot, 94230 CACHAN France
www.data.fr

