[ https://issues.apache.org/jira/browse/IGNITE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Plekhanov updated IGNITE-8785:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> Node may hang indefinitely in CONNECTING state during cluster segmentation
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-8785
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8785
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.5
>            Reporter: Pavel Kovalenko
>            Priority: Major
>             Fix For: 2.10
>
>
> Affected test:
> org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest#testTopologyValidatorWithCacheGroup
> The node hangs with the following stack trace:
> {noformat}
> "grid-starter-testTopologyValidatorWithCacheGroup-22" #117619 prio=5 os_prio=0 tid=0x00007f17dd19b800 nid=0x304a in Object.wait() [0x00007f16b19df000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:931)
>         - locked <0x0000000705ee4a60> (a java.lang.Object)
>         at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373)
>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948)
>         at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915)
>         at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1739)
>         at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1046)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
>         - locked <0x0000000705995ec0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>         at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
>         at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:649)
>         at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:882)
>         at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:845)
>         at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:833)
>         at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:799)
>         at org.apache.ignite.testframework.junits.GridAbstractTest$3.call(GridAbstractTest.java:742)
>         at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
> {noformat}
> It seems that the node never receives an acknowledgment from the coordinator.
> A failure was logged before the hang:
> {noformat}
> [org.apache.ignite:ignite-core] [2018-06-10 04:59:18,876][WARN ][grid-starter-testTopologyValidatorWithCacheGroup-22][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)