[ https://issues.apache.org/jira/browse/IGNITE-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ilya Kasnacheev updated IGNITE-8633:
------------------------------------
    Attachment: 8633.zip

> Node fails to bail out of wrong BLT, instead hanging around indefinitely
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-8633
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8633
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Ilya Kasnacheev
>            Assignee: Stanislav Lukyanov
>            Priority: Major
>         Attachments: 8633.zip
>
> Follow-up on
> https://stackoverflow.com/questions/50234056/how-to-give-multiple-static-ip-in-apache-ignite-cache-configuration-xml-file/50270676?noredirect=1#comment88095814_50270676
> but not quite the same.
> I have three nodes: A, B and C.
> I've started A and C and performed activation.
> Then I stopped them both, started B and performed activation on it.
> Now I have two BLT clusters: (A, C) and (B).
> However, when I start B and then try to launch node A or C, I get inconsistent behavior:
> When I launch C, I get the error:
> {code}
> org.apache.ignite.spi.IgniteSpiException: BaselineTopology of joining node (8c1e210f-52bb-424f-9c7c-a2e7b1bab546 ) is not compatible with BaselineTopology in the cluster. Branching history of cluster BlT ([-1349069127]) doesn't contain branching point hash of joining node BlT (631694798). Consider cleaning persistent storage of the node and adding it to the cluster again.
> {code}
> But when I launch A, it never enters the topology, yet it also never fails.
> Moreover, A and B will ping-pong each other for eternity:
> {code}
> [20:16:38,596][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
> [20:17:29,514][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.36, rmtPort=49030]
> [20:17:29,522][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=49030]
> [20:17:29,523][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030]
> [20:17:29,524][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Received ping request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:49030, rmtPort=49030]
> [20:17:29,525][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:49030, rmtPort=49030]
> [20:17:29,526][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030]
> [20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.36, rmtPort=50857]
> [20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=50857]
> [20:18:30,733][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857]
> [20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Received ping request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:50857, rmtPort=50857]
> [20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:50857, rmtPort=50857]
> [20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857]
> {code}
> {code}
> [20:16:28,793][INFO][tcp-disco-msg-worker-#3][GridSnapshotAwareClusterStateProcessorImpl] Received state change finish message: true
> [20:16:28,803][INFO][exchange-worker-#62][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], crd=true]
> [20:16:28,812][INFO][exchange-worker-#62][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=37104137-a21e-4b6f-a70b-09164300bbfc]
> [20:16:28,818][INFO][sys-#68][GridSnapshotAwareClusterStateProcessorImpl] Successfully performed final activation steps [nodeId=37104137-a21e-4b6f-a70b-09164300bbfc, client=false, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1]]
> [20:16:33,571][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.35, rmtPort=42500]
> [20:16:33,579][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42500]
> [20:16:33,580][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500]
> [20:16:33,592][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500]
> [20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.35, rmtPort=42714]
> [20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42714]
> [20:16:39,802][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714]
> [20:16:39,806][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714]
> {code}
> I don't think this is expected behaviour. I will attach config and work directories.
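For reference, the "performed activation" step above is what records a BaselineTopology branch on persistence-enabled nodes: activating (A, C) once and later activating (B) alone yields the two branching histories that the IgniteSpiException complains about. The following is a minimal sketch of that step in Java; the instance name, consistent ID and discovery addresses/ports are illustrative assumptions, not values taken from the attached configs.
{code}
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class ActivateNode {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration()
            // Illustrative values; the real cluster uses the attached XML configs.
            .setIgniteInstanceName("nodeA")
            .setConsistentId("nodeA");

        // Native persistence must be enabled, otherwise no BaselineTopology is recorded.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        cfg.setDataStorageConfiguration(storageCfg);

        // Static IP finder with the two hosts seen in the logs (discovery ports are assumptions).
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder()
            .setAddresses(Arrays.asList("172.25.1.35:47500..47509", "172.25.1.36:47500..47509"));
        cfg.setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(ipFinder));

        Ignite ignite = Ignition.start(cfg);

        // Manual activation: performed first on (A, C) and later on (B) alone,
        // which produces two BaselineTopology branching histories that cannot be merged.
        ignite.cluster().active(true);
    }
}
{code}
Running the equivalent of this first on A and C, then (with both stopped) on B alone, should recreate the divergent-BLT state described in this report.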
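The repeated-join warning in the first log block suggests increasing 'TcpDiscoverySpi.networkTimeout' (the log shows the default, networkTimeout=5000). For completeness, a hedged sketch of setting that property is below; the 15-second value is an arbitrary example, and raising the timeout only changes how long discovery operations wait, so it would not by itself reconcile the incompatible BaselineTopology histories this ticket is about.
{code}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

public class LongerDiscoveryTimeout {
    public static void main(String[] args) {
        // Raise the discovery network timeout from the default 5000 ms to an example 15000 ms.
        TcpDiscoverySpi spi = new TcpDiscoverySpi().setNetworkTimeout(15_000);

        IgniteConfiguration cfg = new IgniteConfiguration().setDiscoverySpi(spi);

        // Start the node with the adjusted discovery SPI.
        Ignite ignite = Ignition.start(cfg);
    }
}
{code}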